ArtesiaWater / hydropandas

Module for loading observation data into custom DataFrames
https://hydropandas.readthedocs.io
MIT License
52 stars 10 forks source link

Update dino.py to changes in dinolokets .csv files #197

Closed Florisklooster closed 3 months ago

Florisklooster commented 3 months ago

The contents of the csv files from Dinoloket changed. There are no _0.csv and _1.csv versions anymore. The .csv files now contain kommas where there used to be empty rules (in the _1.csv files) and quatation marks where added to the files. Hereby my proposed changes

OnnoEbbens commented 3 months ago

Hi Floris, thanks for this very helpful PR!

Before we merge this I would like to add a few things:

  1. Add a test with a DINO csv file with the new format style, include a csv file with the new format in the test files.
  2. Keep the option to read DINO files with the old style, preferably with a keyword argument in the read_dino function (while the default should be the new lay-out).
  3. Use black and isort to format your code

If you want to you can add this to the PR, I can also add them if you prefer that.

Florisklooster commented 3 months ago

Unfortunately I am not that familiar with github yet, and I don't have to much time to figure it out right now, so if you can add it that would be wonderfull. I did reformat the code wiht black and isort.

So I checked the proposed changes on an old dinoloket zip file and the code seems to work fine as well. And I agree with point 2, so for that indeed the 'read_dino' function in the obs_collection.py should be changed, so that the default subdir and suffix are not subdir="Grondwaterstanden_Put" and suffix="1.csv", but subdir="DINO_Grondwaterstanden" and suffix=None

I think using None or somethinig similar (and under the hood look for .csv) is prefered because using '.csv' as substring is bound to make mistakes because '.csv' is ofcourse a substring of '_0.csv' and '_1.csv')

OnnoEbbens commented 3 months ago

Hi Floris, I was looking at your code and the new Dino style csv files. It would be nice to have the file you tested this on, especially a file with: 'vandezeputzijngeenstandenopgenomen'. Can you send this file? I think you can just attach it to a comment on this PR

Florisklooster commented 3 months ago

Sure, hereby a zip with one 'normal' file with data and one with: "Van deze put zijn geen standen opgenomen in de DINO-database",,,,,,,,,,,,

zip_pull_request.zip

OnnoEbbens commented 3 months ago

There is just one test failing which is not related to this PR