Closed Mares2022 closed 3 weeks ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 87.96%. Comparing base (
65adefe
) to head (44eb596
). Report is 3 commits behind head on main.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thanks for this PR, I do have some additional requests:
Adressing @veenstrajelmer requests.
[x] no data in the repository, in this case all csv/xls/json files. I have removed now the files and saved them in the Pdrive: P:\11210325-005-kenmerkende-waarden\work\data_hydraNL
[x] it is good to work with a single source of truth, this is complex when putting code in py as well as ipynb, I would suggest to remove the ipynb files. I have removed the KWK_process.ipynb. The KWK_geometries.ipynb has an interactive map that requires a notebook to be displayed. In case this code is not needed anymore I will erase it, but converting to .py cannot be done. Let me know if I should erase it.
[x] it is not clear to me where the csv files come from. I expect you create them from the hydra-xls files with one of the ipynb files (I have not looked at them). If this is indeed the case, it would be cleaner to just read the hydra-xls file directly. Files in .xls have symbols in a code that did not allow me to open them with pandas. The created excel files are also in an old format. The best approach based on my experience will be to create .csv files from the python Hydra-NL API, or request the developers to provide the data in these formats with standard code symbols. The csv were created manually in excel.
[x] minimize the diff (e.g. revert the change in KWK_getcheckdata.py since it does not contribute to this PR). I will take it back to the main branch version.
Files in .xls have symbols in a code that did not allow me to open them with pandas.
I have looked into reading this old xls format and found this solution that works:
import pandas as pd
file_xls = r"p:\11210325-005-kenmerkende-waarden\work\data_hydraNL\DENHDR.xls"
df = pd.read_table(file_xls, encoding='latin-1')
A new commit including df = pd.read_table(file_xls, encoding='latin-1') was created. Now we are able to read the original Hydra-NL files in .xls format.
Thanks for the updates! I have now looked into KWK_geometries.ipynb
and it can indeed be useful for this station selection (until we move to Hydra-RING API). However, could you trim it down so it only contains the code necessary for the interactive map? Also, you could avoid saving the geojson since the gdf variable is already being generated from the DDL catalog and you do not seem to use it elsewhere.
Issues
4 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
I have mainly modified KWK_process.py. I have added a function called set_hydra_nl_table() to set the Hydra-NL original tables into the dist_vali_exc dict dictionary. The set_table() function was also added to set validation data and the old exceedance frequency Hydra-NL function.
The pull request also includes the tables generated in Hydra-NL in .xls and .csv format for the stations available + Hydra-NL documentation.
The pull request has two jupyter notebooks, the first one called geometries was used KWK_geometries.ipynb was used to calculate the stations locations. However, this code is not working anymore after the changes made in the main source code for me. The same error is shown when running KWK_getcheckdata.py from this branch and from the main branch. Something to check later on to be sure there is no problem with the environments.
The second jupyter notebook KWK_process.ipynb can be ignored and not included in the main branch, as I only used it for testing purposes. I would like to keep the code anyway if possible in the main branch for future development.