Soilgrids c - Githubissues

hboisgon commented 9 months ago

Issue addressed

Fixes #22

Explanation

Compute Brooks Corey for sbm soil layer with variable thicknesses. The weighted average is used for this.

Checklist

[x] Updated tests or added new tests
[x] Branch is up to date with main
[x] Tests & pre-commit hooks pass
[x] Updated documentation if needed

Additional Notes (optional)

I realized our tests were not were not coherent. In test_model_class, we did not check if the dims were still the same (name and values), and values in grid are test with np.isclose. In test_model_methods, we test on xarray.equal.

Because I had to update the c map, I realized the staticmaps of the example were really old, lat became latitude, lon -> longitude, exact values of lat, lon changed and wflow_dem values changed for 2 cells (not enough to be caught by isclose). I now fully replaced the staticmaps to be equal again to what is being produced. This meant that I also had to update floodplain_layers as the grid coordinates/names changed + values for two cells in wflow_dem.

I also updated to test on equal in test_model_class so hopefully we catch these differences sooner and a required update in the future of staticmaps should be less painful.

hboisgon commented 9 months ago

@JoostBuitink could you have a first look if this is what you need?

One question is that now the brooks_corey_layers function in soilgrids takes as input brooks corey computed at each soilgrids layer (computed from thetaS). So if you would like to update within a setup_irrigation function, you can update using the old c values from wflow (at depths 0,100,400,1200,2000) but this could be less precise than re-using soilgrids (soilgrids had more layers than wflow). Example:

da_c_updated = brooks_corey_layers(
     model.grid["c"], 
     soil_fn="soilgrids2020", 
     wflow_layers=[0,10,50,100,400,1200,2000], 
     soildepth_cm=[0,10,40,120,200],
)
# note: soilgrids2020 actaully mean that in model.grid["c"], c values are averaged over the soil layers rather than at depth (soilgrids)

I can still simplify a bit naming and units of the arguments for easy re-use outside of this function but let me know first if I'm going in the right direction.

hboisgon commented 9 months ago

Note: fixing the tests proved more complex than what I thought.... Seems that the staticmaps were not updated for a very long time mainly:

dims names for staticmaps are now latitude and longitude and were still lat lon in the examples --> this caused issues with the rivers/floodplains tests so I updated the file
when building / clipping a model in test_model_class, we check on isclose. While when testing in setup rivers/floodplains we test on equals --> apparently wflow_dem changed since not sure how long ago, so when building the new model, only the expected diff for c popped up. When I then replaced the whole staticmaps with the updated one, the the setup rivers/floodplain complained because hydrodem was not equal anymore due to a tiny change in 2 wflow_dem cells.

If we need to update staticmaps, the current process is very painful (also that we have to re-generate the linux files together with the windows ones). We also have to choose on a common strategy for isclose or equals for all our tests to avoid making it even more complex in the future to update the tests bench.

So far, I updated the tests by only replacing the c map in all staticmaps rather than changing the whole file. My preference would be to change the whole files (just copy pasting instead of extra actually not so straightfoward python scripting) but then we have to agree on a isclose or equals strategy.

@JoostBuitink and @dalmijn any thoughts?

JoostBuitink commented 9 months ago

One question is that now the brooks_corey_layers function in soilgrids takes as input brooks corey computed at each soilgrids layer (computed from thetaS). So if you would like to update within a setup_irrigation function, you can update using the old c values from wflow (at depths 0,100,400,1200,2000) but this could be less precise than re-using soilgrids (soilgrids had more layers than wflow).

@hboisgon, thanks for putting this together so quickly! I was thinking about this comment (as I did not directly realize the c values were directly linked to thetaS, sorry for overlooking that). I feel like it would be nicer that when a user wants to update the soil layers, the best possible estimate is being made. I think we have two options:

We force the user to rerun the full setup_soilmaps function
Or we ensure that thetaS is computed inside the brooks_corey_layers based on the soilgrids data. Perhaps allow the user to overrule this by using only the layers in the already existing wflow model. But I think this should be done with caution as it can indeed result in unrealistic results.

What do you think? Option 1 would be the least amount of work ;) But I think computing thetaS based on soilgrids data close/in the brooks_corey_layers function would be a bit nicer for the user (and quicker, as all the other soillayers are not affected).

hboisgon commented 9 months ago

One question is that now the brooks_corey_layers function in soilgrids takes as input brooks corey computed at each soilgrids layer (computed from thetaS). So if you would like to update within a setup_irrigation function, you can update using the old c values from wflow (at depths 0,100,400,1200,2000) but this could be less precise than re-using soilgrids (soilgrids had more layers than wflow).

@hboisgon, thanks for putting this together so quickly! I was thinking about this comment (as I did not directly realize the c values were directly linked to thetaS, sorry for overlooking that). I feel like it would be nicer that when a user wants to update the soil layers, the best possible estimate is being made. I think we have two options:

We force the user to rerun the full setup_soilmaps function

Or we ensure that thetaS is computed inside the brooks_corey_layers based on the soilgrids data. Perhaps allow the user to overrule this by using only the layers in the already existing wflow model. But I think this should be done with caution as it can indeed result in unrealistic results.

What do you think? Option 1 would be the least amount of work ;) But I think computing thetaS based on soilgrids data close/in the brooks_corey_layers function would be a bit nicer for the user (and quicker, as all the other soillayers are not affected).

Yes it's not very handy indeed... That's why one of the suggestion is to separate the computation of c in a new setup_brooks_corey function rather than setup_soilmaps so that then option 1 becomes less heavy. But anyhow for both option 1 and option 2 it means that in your setup_irrigation function you also need to add soil_fn to it. Would you be okay with this? Another option 3 is to also add thetas per layer in the staticmaps at the end of setup_soilmaps. I think for example we still save in staticmaps quite a lot of unused info and even ksatver for the different soil layers. Then you can still re-use this without having to reload soilgrids. So if this thetas is there you can use it and else you do need to re-read soilgrids and recompute thetas

hboisgon commented 9 months ago

After discussing with @JoostBuitink we go for option 2

Deltares / hydromt_wflow

Soilgrids c #242

Issue addressed

Explanation

Checklist

Additional Notes (optional)