jvansijl closed 3 years ago
Thanks for pointing this out. I think your proposed change will work.
Could you send me the csv file of tube B22D0155 filter 1? Then I can check and run some tests. You can upload the file here on GitHub.
thanks Onno.
Back in the days of the DINO API, files could contain any number of duplicate measurements, because the data was originally measured at a higher frequency while the datetime index only had a daily resolution.
That is unfortunately no longer an issue, but I think we should still support returning the data exactly as is (including duplicates) if the user wants. The default should be to drop them, and perhaps a warning that measurements are being dropped is a good idea too.
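A minimal sketch of that behaviour, assuming the measurements are held in a pandas DataFrame with a datetime index. The helper name `drop_duplicate_index`, its `remove_duplicates` and `keep` parameters, and the `head` column are hypothetical and only serve as illustration; they are not the actual signature used in io_dino.py.

```python
import warnings

import pandas as pd


def drop_duplicate_index(measurements, remove_duplicates=True, keep="first"):
    """Hypothetical helper: drop rows with a duplicate index, with a warning.

    With remove_duplicates=False the data is returned exactly as is.
    """
    if not remove_duplicates:
        return measurements
    # Boolean mask marking every repeated index value except the one kept.
    dup = measurements.index.duplicated(keep=keep)
    if dup.any():
        warnings.warn(
            f"dropping {dup.sum()} measurement(s) with a duplicate index"
        )
        measurements = measurements.loc[~dup]
    return measurements


# Made-up example data: 2000-01-02 occurs twice in the index.
index = pd.to_datetime(["2000-01-01", "2000-01-02", "2000-01-02", "2000-01-03"])
raw = pd.DataFrame({"head": [1.0, 1.1, 1.2, 1.3]}, index=index)

cleaned = drop_duplicate_index(raw)                          # default: drop and warn
as_is = drop_duplicate_index(raw, remove_duplicates=False)   # keep duplicates
```

Keeping the first occurrence is an arbitrary choice in this sketch; `keep="last"` or `keep=False` (drop every duplicated row) would be equally defensible.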
EDIT: typos
I agree with Davíd that it is nice to have the option to return the data exactly as is (including duplicates). I fixed the error with duplicate indices. With the latest commit in dev you should be able to read the csv file. For now it will return the measurements with duplicate indices.
I will leave this issue open because I still want to add an option to read_dino_groundwater_csv to drop the duplicates, as suggested.
I added an optional argument to read_dino_groundwater_csv to remove duplicate indices. I used the code suggestion from @jvansijl for this. It should be available in the dev branch now.
Issue: reading in a Dino zipfile raises a ValueError:
ValueError: Shape of passed values is (3329, 9), indices imply (3327, 9)
For example in tube B22D0155 filter 1. This occurs while reshaping in io_dino.py line 297:
measurements = pd.concat([measurements, s], axis=1)
This filter has a duplicate time-index that is the probable culprit.
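One way to confirm such a suspicion is to inspect the index before concatenating. The snippet below is only a sketch on made-up data (the series name `head` and the dates are assumptions, not values from the B22D0155 file):

```python
import pandas as pd

# Made-up series in which 2000-01-02 appears twice in the datetime index.
index = pd.to_datetime(["2000-01-01", "2000-01-02", "2000-01-02"])
s = pd.Series([0.5, 0.6, 0.7], index=index, name="head")

# True when at least one index value occurs more than once.
has_duplicates = not s.index.is_unique

# keep=False marks every occurrence of a repeated index value,
# so all conflicting rows can be inspected side by side.
duplicated_rows = s[s.index.duplicated(keep=False)]
print(duplicated_rows)
```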
Since Dinoloket is in a frozen state (no more data will be added by TNO), perhaps we can change _read_dino_groundwater_measurements to accommodate?
proposed change from line 156 of io_dino.py: