KIT-HYD / bridget

Evapotranspiration toolbox
https://KIT-HYD.github.io/bridget

Update read.py #6

Closed AlexDo1 closed 4 years ago

AlexDo1 commented 4 years ago

Added na_values (probably not the best way, but the only way I could get it running). Reading a csv file with the read_TK2_file function now fails, as Mirko expected, with: ValueError: Length mismatch: Expected axis has 54 elements, new values have 61 elements

mmaelicke commented 4 years ago

If you can't make my changes work, just let me know and I will work on it.

AlexDo1 commented 4 years ago

The changes for the usage of kwargs already work.

We are losing columns with this line of code: data.dropna(axis=1, how='all', inplace=True). Some columns consist only of the number -9999.9003906 (replaced with NaN before), so they are considered empty and are dropped, which leads to a mismatch with the given standard header values. Commenting out the dropna line leads to the same error, because the raw data has an empty last column that has to be dropped; otherwise we have one column too many.
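The mismatch described above can be reproduced on a small scale. This is only a sketch: the three-name header, the two-row input, and passing the fill value as a string to na_values are assumptions for illustration, not the real TK2 layout or the actual read.py code.

```python
import io
import pandas as pd

# Hypothetical miniature of the raw file: every row ends with a trailing
# comma (an empty last column), and the third column holds only the fill value.
raw = "1.0,2.0,-9999.9003906,\n3.0,4.0,-9999.9003906,\n"
header = ["a", "b", "c"]  # assumed default header with three names

data = pd.read_csv(io.StringIO(raw), header=None, na_values=["-9999.9003906"])
print(data.shape)  # three data columns plus the empty trailing one

# how='all' removes the empty trailing column AND the fill-value-only column.
dropped = data.dropna(axis=1, how="all")
print(dropped.shape)

try:
    dropped.columns = header  # three names for the two remaining columns
except ValueError as err:
    print(err)  # a "Length mismatch" error like the one reported
```

Skipping the dropna call instead leaves four columns for three header names, so the assignment fails in the other direction.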

An easy workaround would be to delete the empty last column in the raw data, but if this is a problem with all the expected input data files, I should code something that does that automatically.

mmaelicke commented 4 years ago

Yes, I would prefer the "code something that does that automatically" solution. Have you checked which columns are affected? Maybe we actually have to split them into mandatory and optional default columns.

mmaelicke commented 4 years ago

Could work like:

  1. Expand the default columns file with that 'NAN column', to make import and column header merging work.

  2. Remove the dropna line

  3. Explicitly drop the NAN column and import the others with possible NaN values

  4. If strict=True, drop all columns that have NA and check if the columns array is still correct.
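The four steps above could be sketched roughly as follows. Everything here is an assumption for illustration: the function name read_sketch, the column name na_column, the required argument standing in for "mandatory" columns, and the sample data are hypothetical and not taken from read.py.

```python
import io
import pandas as pd

NA_COLUMN = "na_column"  # hypothetical name for the empty trailing column


def read_sketch(buffer, header, strict=False, required=None,
                na_value="-9999.9003906"):
    """Illustrative reader only; not the actual bridget code."""
    # Steps 1 and 2: header already contains NA_COLUMN, so names and fields
    # line up and no blanket dropna(how='all') is needed.
    data = pd.read_csv(buffer, header=None, names=header,
                       na_values=[na_value])
    # Step 3: drop only the known-empty column; others may keep NaN values.
    data = data.drop(columns=[NA_COLUMN])
    if strict:
        # Step 4: drop every column containing NaN, then check the result.
        data = data.dropna(axis=1, how="any")
        missing = [c for c in (required or []) if c not in data.columns]
        if missing:
            raise ValueError(f"columns lost in strict mode: {missing}")
    return data


raw = "1.0,2.0,-9999.9003906,\n3.0,4.0,-9999.9003906,\n"
header = ["a", "b", "c", NA_COLUMN]
lenient = read_sketch(io.StringIO(raw), header)
strict = read_sketch(io.StringIO(raw), header, strict=True)
print(list(lenient.columns), list(strict.columns))
```

Dropping one named column instead of every all-NaN column keeps the header length predictable, while strict mode remains the place where incomplete columns are removed.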

AlexDo1 commented 4 years ago

Updated read.py and the default column file. Deleting the na_column at the end of _reader should make strict=True work like before.