Open vhoogelander opened 1 year ago
Hi @vhoogelander thanks for opening this issue. Can you include the full error messages and provide details about the machine on which you're running these notebooks?
Hi Peter, I am running the notebooks on vhoogeland2@host-192-168-0-55 (this is the machine name right?). For the first problem, I don't get any error in the NB itself, it just keeps running. These are the full error messages of the 2nd problem: NB2:
Error Traceback (most recent call last)
Cell In[8], line 1
----> 1 cfg_file, cfg_dir = model.setup(end_time=experiment_end_date)
2 print(cfg_file)
3 print(cfg_dir)
File /opt/conda/envs/ewatercycle/lib/python3.10/site-packages/ewatercycle/models/wflow.py:113, in Wflow.setup(self, cfg_dir, **kwargs)
102 def setup(self, cfg_dir: Optional[str] = None, **kwargs) -> Tuple[str, str]: # type: ignore
103 """Start the model inside a container and return a valid config file.
104
105 Args:
(...)
111 Path to config file and working directory
112 """
--> 113 self._setup_working_directory(cfg_dir)
114 cfg = self.config
116 if "start_time" in kwargs:
File /opt/conda/envs/ewatercycle/lib/python3.10/site-packages/ewatercycle/models/wflow.py:160, in Wflow._setup_working_directory(self, cfg_dir)
157 self.work_dir.parent.mkdir(parents=True, exist_ok=True)
159 assert self.parameter_set
--> 160 shutil.copytree(src=self.parameter_set.directory, dst=self.work_dir)
161 if self.forcing:
162 forcing_path = to_absolute_path(
163 self.forcing.netcdfinput, parent=self.forcing.directory
164 )
File /opt/conda/envs/ewatercycle/lib/python3.10/shutil.py:556, in copytree(src, dst, symlinks, ignore, copy_function, ignore_dangling_symlinks, dirs_exist_ok)
554 with os.scandir(src) as itr:
555 entries = list(itr)
--> 556 return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
557 ignore=ignore, copy_function=copy_function,
558 ignore_dangling_symlinks=ignore_dangling_symlinks,
559 dirs_exist_ok=dirs_exist_ok)
File /opt/conda/envs/ewatercycle/lib/python3.10/shutil.py:512, in _copytree(entries, src, dst, symlinks, ignore, copy_function, ignore_dangling_symlinks, dirs_exist_ok)
510 errors.append((src, dst, str(why)))
511 if errors:
--> 512 raise Error(errors)
513 return dst
Error: [('/mnt/data/parameter-sets/wflow_merrimack_techpaper/inmaps/wflow_ERA5_Merrimack_2001_2016.nc', '/home/vhoogeland2/technicalPaperExampleNotebooks/ewatercycle_output/wflow_20230809_130831/inmaps/wflow_ERA5_Merrimack_2001_2016.nc', "[Errno 5] Input/output error: '/mnt/data/parameter-sets/wflow_merrimack_techpaper/inmaps/wflow_ERA5_Merrimack_2001_2016.nc' -> '/home/vhoogeland2/technicalPaperExampleNotebooks/ewatercycle_output/wflow_20230809_130831/inmaps/wflow_ERA5_Merrimack_2001_2016.nc'"),
........ VERY LONG MESSAGE ........,
'/home/vhoogeland2/technicalPaperExampleNotebooks/ewatercycle_output/wflow_20230809_130831/staticmaps/wflow_uparea.map', '[Errno 5] Input/output error')]
NB3:
OSError Traceback (most recent call last)
Cell In[6], line 1
----> 1 observations_df, metadata = ewatercycle.observation.grdc.get_grdc_data(
2 station_id,
3 start_time=experiment_start_date,
4 end_time=experiment_end_date,
5 )
6 grdc_obs = observations_df.rename(columns={"streamflow": "Observations from GRDC"})
7 grdc_lon = metadata["grdc_longitude_in_arc_degree"]
File /opt/conda/envs/ewatercycle/lib/python3.10/site-packages/ewatercycle/observation/grdc.py:107, in get_grdc_data(station_id, start_time, end_time, parameter, data_home, column)
104 raise ValueError(f"The grdc file {raw_file} does not exist!")
106 # Convert the raw data to an xarray
--> 107 metadata, df = _grdc_read(
108 raw_file,
109 start=get_time(start_time).date(),
110 end=get_time(end_time).date(),
111 column=column,
112 )
114 # Add start/end_time to metadata
115 metadata["UserStartTime"] = start_time
File /opt/conda/envs/ewatercycle/lib/python3.10/site-packages/ewatercycle/observation/grdc.py:129, in _grdc_read(grdc_station_path, start, end, column)
127 def _grdc_read(grdc_station_path, start, end, column):
128 with grdc_station_path.open("r", encoding="cp1252", errors="ignore") as file:
--> 129 data = file.read()
131 metadata = _grdc_metadata_reader(grdc_station_path, data)
133 all_lines = data.split("\n")
OSError: [Errno 5] Input/output error
And the 3rd problem:
NoSectionError Traceback (most recent call last)
Cell In[8], line 1
----> 1 reference = ewatercycle.models.PCRGlobWB(version="setters", parameter_set=experiment_parameterset)
3 reference_config, reference_dir = reference.setup(
4 start_date = experiment_start_date,
5 end_date = experiment_end_date)
7 print(reference_config, reference_dir)
File /opt/conda/envs/ewatercycle/lib/python3.10/site-packages/ewatercycle/models/pcrglobwb.py:47, in PCRGlobWB.__init__(self, version, parameter_set, forcing)
45 super().__init__(version, parameter_set, forcing)
46 self._set_docker_image()
---> 47 self._setup_default_config()
File /opt/conda/envs/ewatercycle/lib/python3.10/site-packages/ewatercycle/models/pcrglobwb.py:81, in PCRGlobWB._setup_default_config(self)
79 cfg = CaseConfigParser()
80 cfg.read(config_file)
---> 81 cfg.set("globalOptions", "inputDir", str(input_dir))
82 if self.forcing:
83 cfg.set(
84 "globalOptions",
85 "startTime",
86 get_time(self.forcing.start_time).strftime("%Y-%m-%d"),
87 )
File /opt/conda/envs/ewatercycle/lib/python3.10/configparser.py:1205, in ConfigParser.set(self, section, option, value)
1202 """Set an option. Extends RawConfigParser.set by validating type and
1203 interpolation syntax on the value."""
1204 self._validate_value_types(option=option, value=value)
-> 1205 super().set(section, option, value)
File /opt/conda/envs/ewatercycle/lib/python3.10/configparser.py:903, in RawConfigParser.set(self, section, option, value)
901 sectdict = self._sections[section]
902 except KeyError:
--> 903 raise NoSectionError(section) from None
904 sectdict[self.optionxform(option)] = value
NoSectionError: No section: 'globalOptions'
Hi @vhoogelander, actually I meant whether it's a research cloud machine. I'm guessing it's this one, right? https://ewatercyclestud.ewatercycle-tud.src.surf-hosted.nl
Yes, on that machine it looks like the /home volume is full. That might explain the problem with NB2. NB3 looks different though, it's just reading, not copying.
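For anyone who wants to check this themselves before calling model.setup(), here is a minimal sketch using only the Python standard library. The helper names (dir_size_bytes, free_space_gib, has_room_for) and the 1.1 safety margin are made up for illustration and are not part of ewatercycle. Note that a full local disk usually surfaces as errno 28 (No space left on device) rather than errno 5, so this check can only help rule that cause in or out.

```python
import shutil
from pathlib import Path


def dir_size_bytes(directory):
    """Total size in bytes of all regular files below `directory`."""
    return sum(p.stat().st_size for p in Path(directory).rglob("*") if p.is_file())


def free_space_gib(path):
    """Free space in GiB on the volume that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_room_for(src_dir, dst_dir, margin=1.1):
    """True if the volume holding `dst_dir` can take a copy of `src_dir`.

    `margin` is an arbitrary safety factor (hypothetical), not an ewatercycle setting.
    """
    return shutil.disk_usage(dst_dir).free >= dir_size_bytes(src_dir) * margin
```

On the machine discussed here, one could call has_room_for("/mnt/data/parameter-sets/wflow_merrimack_techpaper", Path.home()) and print free_space_gib(Path.home()) to see how much is left on the shared volume.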
Also it would be helpful if you could refer to the names of each of the notebooks (and where you got them from). Now I cannot really figure out which ones you have been running. It would be even better if you could reduce the problem to a minimal example and copy/paste the code here so we can reproduce it easily.
For now, I'll come back with a quick response.
import ewatercycle.observation.grdc
grdc_station_id = "6335020"
observations, metadata = ewatercycle.observation.grdc.get_grdc_data(
station_id=grdc_station_id,
start_time="1990-01-01T00:00:00Z", # or: model_instance.start_time_as_isostr
end_time="1990-12-15T00:00:00Z",
column="GRDC",
)
observations.head()
that worked without problems. Can you be more specific about what notebook/station ID etc you were using?
cat /mnt/data/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
it does seem to contain that section. As you see, it would be helpful if you could be more specific about the issues you encountered.
On a side note: I did notice that the link to the example notebooks on the terria landing page is outdated. It currently points to link, but that no longer exists. We might need to pin it to a release or bring back the example notebooks in some other way. I'll open a new issue about that.
Hi @Peter9192, Thank you for your comment. I am referring to the technical paper notebooks (Case1_Marrmot_Merrimack..., Case2_wflow_LISFlood..., Case3_CoupleMarrmotAndPCRGlobWB and Case4_ForcePCRGlob) which I got via the terria landing page more than a year ago.
1) If I restart my server, the problem remains. Or is this not what you mean by starting a new machine? If not, how can I do this? (maybe a stupid question)
2) I tried to clean up my own folder a bit, but the error of problem 2 remains. What is the maximum disk space of my home directory?
3) I re-ran the cell of NB3, but apparently I'm not getting an error anymore here for some reason. I was using the same station ID (6335020), so I am not really sure what was the problem here, but it seems to be fixed now ;).
4) I am loading this parameter set: name=pcrglobwb_merrimack_05min directory=/mnt/data/parameter-sets/pcrglobwb_global config=/mnt/data/parameter-sets/pcrglobwb_global/merrimack_05min_era5.ini I think this was the original dataset used in the Example Notebook, but I am not 100% sure.
It's not the Jupyter server, it's the SURF Research Cloud machine (https://portal.live.surfresearchcloud.nl/) that should be updated (or replaced by a new one). @RolfHut knows how to do this.
The parameter set does have a globalOptions section, but I got a similar input/output error when I first tried to open it. It seems the disks were even fuller today than yesterday. The /home disk is 250GB in total, shared by all users on that machine. I won't go into details here, but it looks like a few heavy users are taking up most of the available disk space.
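One possible way a readable-looking config file can still end in NoSectionError: the standard library's ConfigParser.read() ignores files it cannot open and just returns the list of files it did parse, so a later set() call fails on the first section it touches. Whether that is what happened here is an assumption. A minimal sketch that makes the failure explicit, using plain ConfigParser rather than ewatercycle's CaseConfigParser, and with a hypothetical helper name load_config_checked:

```python
from configparser import ConfigParser


def load_config_checked(config_file, required_section="globalOptions"):
    """Parse `config_file` and fail loudly if it was skipped or incomplete."""
    cfg = ConfigParser()
    # read() returns the list of files that were successfully parsed;
    # files that cannot be opened are ignored without raising.
    files_read = cfg.read(config_file)
    if not files_read:
        raise OSError(f"Config file could not be opened: {config_file}")
    if not cfg.has_section(required_section):
        raise ValueError(
            f"Section '{required_section}' not found in {config_file}"
        )
    return cfg
```

Calling load_config_checked("/mnt/data/parameter-sets/pcrglobwb_global/merrimack_05min_era5.ini") would then distinguish "file was never read" from "section really is missing".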
I noticed that the dCache server that gives us files in /mnt/data was having hiccups and timeouts. This could cause weird file reading behavior.
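If the I/O errors are indeed transient, as the NB3 case suggests (it worked on a later re-run), a simple retry around the read may be enough as a workaround. The sketch below is only an illustration: read_with_retry is a hypothetical helper, not an ewatercycle function, and the number of attempts and the delay are arbitrary. It retries only on errno 5 (EIO) and re-raises everything else immediately.

```python
import errno
import time


def read_with_retry(path, attempts=3, delay=2.0, encoding="cp1252"):
    """Read a text file, retrying a few times on [Errno 5] Input/output error."""
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "r", encoding=encoding, errors="ignore") as file:
                return file.read()
        except OSError as exc:
            # Only retry on EIO; give up on the last attempt or on any other error.
            if exc.errno != errno.EIO or attempt == attempts:
                raise
            time.sleep(delay * attempt)  # wait a bit longer after each failure
```

This does not fix the underlying storage problem; it only avoids failing a notebook cell on a short hiccup.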
I checked the Example Notebooks, and found some bugs:
[ ] 1. The main problem is still the issue with generating forcing using ESMValTool (see this issue). There seems to be something wrong with the temperature data. This is the log that I get from example NB1:
[ ] 2. When I run this in NB2:
And this in NB3:
I get an [Errno 5] Input/output error. (Is this related to the disk space?)
[ ] 3. In NB4, when I run this:
I get the following error: NoSectionError: No section: 'globalOptions'.