Open OFR-IIASA opened 2 years ago
pandas has problems with:
df.dropna(how="all")
columns that have “left-over” empty columns, naming them Unnamed: {v}
where v is a counter…
This is how this is caught and treated in pyam: https://github.com/IAMconsortium/pyam/blob/c62ee8c368a9ee48447fce8a838410112298a3b2/pyam/utils.py#L178
The error occurs when executing read_excel().
When reading an excel file, some rows of data can be added which actually do not contain any data. These extra rows will contain "NaN"s, therefore resulting in two errors.
The first error which occurs, if the argument
add_unit=True
is passed:TypeError: No matching overloads found for at.ac.iiasa.ixmp.Platform.addUnitToDB(float,str), options are: public int at.ac.iiasa.ixmp.Platform.addUnitToDB(java.lang.String,java.lang.String) throws at.ac.iiasa.ixmp.exceptions.IxException
The error message results from the fact that a unit "nan" is trying to be added.
The second error occurs when reading data and determining the column type. Should "phantom" data be included in the dataframe, then columns which should be integers are read as floats. The following error is given:
RuntimeError: unhandled Java exception: The key '2001.0' must not contain a '.'!
The current workaround requires cleaning of the xlsx which is being read-in. In the various sheets, columns can be filtered for "Blanks" in excel. The resulting selection will show the respective rows, indicated by having blue row numbers. Deleting these and saving the xlsx file should resolve the issue. In order to see which parameter sheet contains the phantom data, add
print(name)
in line 274 of the fileixmp/backend/io.py
. The parameter sheet currently being processed will be printed on screen. The last parameter printed before the error message will guide the user to the sheet containing the phantom-data.