Open · BSchilperoort opened 7 months ago
> to not share the same resources on disk
For now I've left this choice up to the user, but it's definitely something to document and make clear. I'd like to keep providing a single forcing object as an option for initial users who, say, want to use it for just a quick sensitivity analysis.
> at least for parallel runs
Currently I've implemented parallel runs only when running models in Docker; for local models it seems to introduce too much overhead, but I still encountered the locking issue.
> For now I've left this choice up to the user, but it's definitely something to document and make clear.
I think having it as an option is OK, while using separate forcing files by default.
A workaround could be to duplicate the forcing object in case just one is provided?
I would not call this a workaround, but rather the desired functionality.
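For illustration, a minimal sketch of that behaviour, assuming a hypothetical helper (the function name, the `member_*` directory layout and `ensemble_forcings_dir` are illustrative, not existing ewatercycle-da code; the forcing's `directory` attribute is the one used in the `DefaultForcing.copy` snippet further down this thread):

```python
import shutil
from pathlib import Path


def duplicate_forcing(forcing, n_members: int, ensemble_forcings_dir: Path) -> list[Path]:
    """Hypothetical helper: give every ensemble member its own forcing directory.

    If the user supplies a single forcing, its directory is copied once per member,
    so that parallel runs no longer share the same resources on disk.
    """
    member_dirs = []
    for i in range(n_members):
        target = ensemble_forcings_dir / f"member_{i}"
        # copytree raises if the target already exists, which guards against
        # two members accidentally ending up with the same directory
        member_dirs.append(Path(shutil.copytree(forcing.directory, target)))
    return member_dirs
```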
> Currently I've implemented parallel runs only when running models in Docker
Yes, I would keep it that way. Local models should only be for developing/testing whether the model works as an ewatercycle plugin.
Should be an easy enough fix; it will also ensure this works nicely with `DA.generate_forcing`, which allows for generating forcing in parallel.
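Sketching that idea (this is not the actual `DA.generate_forcing` implementation; `generate_one_forcing` and its arguments are placeholders), parallel generation could be one process per member, each writing into its own directory:

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def generate_one_forcing(member_dir: Path) -> Path:
    """Placeholder for the model-specific forcing generation for a single member."""
    member_dir.mkdir(parents=True, exist_ok=True)
    # ... call the forcing generator here, writing only into member_dir ...
    return member_dir


def generate_forcings_parallel(ensemble_dir: Path, n_members: int) -> list[Path]:
    member_dirs = [ensemble_dir / f"member_{i}" for i in range(n_members)]
    # separate directories per member avoid the file locking seen with shared resources
    with ProcessPoolExecutor() as pool:
        return list(pool.map(generate_one_forcing, member_dirs))
```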
I'll discuss today how much more development I'll be doing towards the end of my thesis.
Edit: not much
> but for a general solution it would be best that models do not share the same resources on disk.
So the issue was with using the same forcing object. But in generating the forcing I also had issues: if I used the same source .txt file containing the time series, it would also fail:
```
...
  File ".../ewatercycle_HBV/model.py", line 77, in _make_cfg_file
    self.forcing.from_camels_txt()
  File "..../ewatercycle_HBV/forcing.py", line 204, in from_camels_txt
    ds = xr.Dataset(data_vars=df,
  File ".../xarray/core/dataset.py", line 694, in __init__
    variables, coord_names, dims, indexes, _ = merge_data_and_coords(
  File ".../xarray/core/dataset.py", line 423, in merge_data_and_coords
    return merge_core(
...
```
So even shared source resources can cause issues, it seems...
Edit: I have now tried adjusting this so there is one .txt file per forcing object (see the sketch below). This works fine. From testing on Windows Subsystem for Linux (not ideal), I still have a memory issue causing instability.
Edit edit: It seems to be a memory issue on WSL; when running (slower) on SURF, this doesn't seem to be an issue. I/O is just hit first as it loads the data into memory.
After more testing, it is a memory issue with WSL. Running bare Ubuntu on my machine used only a fraction of the memory WSL consumed (which makes sense).
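The "one .txt file per forcing object" fix from the first edit could look roughly like this (the function name and directory layout are assumptions, not the actual HBVForcing code); each forcing object is then pointed at its own copy of the CAMELS time series:

```python
import shutil
from pathlib import Path


def private_camels_copy(source_txt: Path, member_dir: Path) -> Path:
    """Copy the shared CAMELS .txt time series into a member-specific directory.

    Each forcing object then reads from its own file, so concurrent runs no
    longer touch the same resource on disk.
    """
    member_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(source_txt, member_dir / source_txt.name))
```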
> it is a memory issue with WSL
For this reason I'll leave it as is for the moment, but it is still something to look at. I have thus moved the todo points up to your initial comment so they show nicely on GitHub (sorry for editing your comment, Bart).
> it is a memory issue with WSL
Glad that you figured it out!
> For this reason I'll leave it as is for the moment, but it is still something to look at. I have thus moved the todo points up to your initial comment so they show nicely on GitHub (sorry for editing your comment, Bart).
No worries haha.
> HBVForcing will have to change (long-term todo) to accommodate this
We can also add a `copy` method to the base class (ewatercycle's `DefaultForcing`). Forcings should be self-contained in a directory and should be movable (only relative paths), so you would just have to copy the directory to a new location:
```python
for i in range(n_ensembles):
    target_dir = forcing.copy(ensemble_forcings_dir / f"ensemble_{i}")
```
which internally would call:
```python
import shutil
from pathlib import Path

class DefaultForcing:
    ...
    def copy(self, target: Path | str) -> Path:
        # copy the self-contained forcing directory to a new location
        return Path(shutil.copytree(src=self.directory, dst=target))
```
In https://github.com/eWaterCycle/leakybucket-bmi/issues/3 we debugged an issue related to resource locking when models are trying to access a shared file. In this specific case it is a model where it's possible to easily change the model code (as you're the creator of the model, @Daafip), but for a general solution it would be best that models do not share the same resources on disk.
Therefore I propose that the default behavior of ewatercycle-da is to have separate forcing objects for separate models (at least for parallel runs).
Still todo: