OGGM / oggm

Open Global Glacier Model
http://oggm.org
BSD 3-Clause "New" or "Revised" License

Hydrological Mass Balance For More Than One Glacier #1320

Open MBeedle opened 2 years ago

MBeedle commented 2 years ago

Hi, OGGM Team:

I want to run the hydrological mass-balance output for all glaciers in a particular watershed. Could you advise on how to apply the hydrological mass balance to more than one glacier? The tutorial seems to cover only a single glacier. When I input multiple RGI IDs I get the following:

All Surprise Creek glaciers in RGI:

rgi_id = ['RGI60-01.04463', 'RGI60-01.04458', 'RGI60-01.04449', 'RGI60-01.04486', 'RGI60-01.04451', 'RGI60-01.04437', 'RGI60-01.04440', 'RGI60-01.04471', 'RGI60-01.04448', 'RGI60-01.04428', 'RGI60-01.04453', 'RGI60-01.04802', 'RGI60-01.04491', 'RGI60-01.04474', 'RGI60-01.04494', 'RGI60-01.04782', 'RGI60-01.04795', 'RGI60-01.04814', 'RGI60-01.04417', 'RGI60-01.04412', 'RGI60-01.04409']

# We pick the elevation-bands glaciers because they run a bit faster

base_url = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.4/L3-L5_files/CRU/elev_bands/qc3/pcp2.5/no_match'

gdir = workflow.init_glacier_directories([rgi_id], from_prepro_level=5, prepro_border=80, prepro_base_url=base_url)

2021-10-25 14:18:08: oggm.workflow: init_glacier_directories from prepro level 5 on 1 glaciers.
2021-10-25 14:18:08: oggm.workflow: Execute entity task gdir_from_prepro on 1 glaciers


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/tt/2r0qw6453s19k3s3kgt5ffmw0000gn/T/ipykernel_1436/3594501057.py in <module>
      1 # We pick the elevation-bands glaciers because they run a bit faster
      2 base_url = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.4/L3-L5_files/CRU/elev_bands/qc3/pcp2.5/no_match'
----> 3 gdir = workflow.init_glacier_directories([rgi_id], from_prepro_level=5, prepro_border=80, prepro_base_url=base_url)

~/opt/miniconda3/envs/oggm_env/lib/python3.8/site-packages/oggm/workflow.py in init_glacier_directories(rgidf, reset, force, from_prepro_level, prepro_border, prepro_rgi_version, prepro_base_url, from_tar, delete_tar)
    497     if cfg.PARAMS['dl_verify']:
    498         utils.get_dl_verify_data('cluster.klima.uni-bremen.de')
--> 499     gdirs = execute_entity_task(gdir_from_prepro, entities,
    500                                 from_prepro_level=from_prepro_level,
    501                                 prepro_border=prepro_border,

~/opt/miniconda3/envs/oggm_env/lib/python3.8/site-packages/oggm/workflow.py in execute_entity_task(task, gdirs, **kwargs)
    159                  '%d glaciers with multiprocessing turned off. OGGM '
    160                  'will run faster with multiprocessing turned on.', ng)
--> 161         out = [pc(gdir) for gdir in gdirs]
    162 
    163     return out

~/opt/miniconda3/envs/oggm_env/lib/python3.8/site-packages/oggm/workflow.py in <listcomp>(.0)
    159                  '%d glaciers with multiprocessing turned off. OGGM '
    160                  'will run faster with multiprocessing turned on.', ng)
--> 161         out = [pc(gdir) for gdir in gdirs]
    162 
    163     return out

~/opt/miniconda3/envs/oggm_env/lib/python3.8/site-packages/oggm/workflow.py in __call__(self, arg)
     94         call_func, gdir = arg
     95         if isinstance(gdir, Sequence) and not isinstance(gdir, str):
---> 96             gdir, gdir_kwargs = gdir
     97             gdir_kwargs = _merge_dicts(self.out_kwargs, gdir_kwargs)
     98             return call_func(gdir, **gdir_kwargs)

ValueError: too many values to unpack (expected 2)

Holmgren825 commented 2 years ago

Hi MBeedle,
There seems to be an extra set of brackets on this line:

gdir = workflow.init_glacier_directories([rgi_id], from_prepro_level=5, prepro_border=80, prepro_base_url=base_url)

If your rgi_id is already a list of rgi_ids, you don't need to put it into another list. In the tutorial rgi_id is a single string, which is why it has to be put into a list for the gdir initialisation.
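To see why the extra brackets trigger exactly that ValueError, here is a minimal, OGGM-free sketch of what happens; the unpacking line mirrors the gdir, gdir_kwargs = gdir seen in the workflow.py traceback (the IDs are just a shortened version of your list):

```python
# The tutorial's rgi_id is a single string, so it must be wrapped in a
# list; here it is already a list, and the extra brackets nest it.
rgi_id = ['RGI60-01.04463', 'RGI60-01.04458', 'RGI60-01.04449']

entities = [rgi_id]   # a list of lists, not a list of IDs
entity = entities[0]  # the whole ID list, not one glacier

# OGGM's task runner tries to unpack each non-string sequence it
# receives as a (gdir, kwargs) pair, which reproduces the error:
try:
    gdir, gdir_kwargs = entity
except ValueError as err:
    print(err)  # too many values to unpack (expected 2)
```

With the brackets removed, each entity is a single ID string and the unpacking branch is never a problem.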

Regarding extending the tutorial to multiple glaciers: you simply pass the list of gdirs to the task run_with_hydro, as in the tutorial, and it will be performed on all the glaciers.

MBeedle commented 2 years ago

Thanks much for your help. I greatly appreciate the quick, helpful responses from the OGGM community!

MBeedle commented 2 years ago

A follow-up question:

The following does initialize my list of glaciers:

gdir = workflow.init_glacier_directories(rgi_id, from_prepro_level=5, prepro_border=80, prepro_base_url=base_url)

Passing this list of gdirs to run_with_hydro returns the following Attribute Error:

# file identifier where the model output is saved
file_id = '_ct'

tasks.run_with_hydro(gdir, run_task=tasks.run_constant_climate, nyears=100, y0=2014, halfsize=5, store_monthly_hydro=True,
                     output_filesuffix=file_id);

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/var/folders/tt/2r0qw6453s19k3s3kgt5ffmw0000gn/T/ipykernel_995/562611711.py in <module>
      1 # file identifier where the model output is saved
      2 file_id = '_ct'
----> 3 tasks.run_with_hydro(gdir, run_task=tasks.run_constant_climate, nyears=100, y0=2014, halfsize=5, store_monthly_hydro=True, 
      4                      output_filesuffix=file_id);

~/opt/miniconda3/envs/oggm_env/lib/python3.8/site-packages/oggm/utils/_workflow.py in _entity_task(gdir, reset, print_log, return_value, continue_on_error, add_to_log_file, **kwargs)
    474 
    475             # Do we need to run this task?
--> 476             s = gdir.get_task_status(task_name)
    477             if not reset and s and ('SUCCESS' in s):
    478                 return

AttributeError: 'list' object has no attribute 'get_task_status'

In the tutorial using Hintereisferner, the initialization of gdir is:

gdir = workflow.init_glacier_directories(rgi_id, from_prepro_level=5, prepro_border=80, prepro_base_url=base_url)[0]

I can run this with my list of glaciers, but the [0] selects a single glacier; changing it to [1], [2], etc. selects other individual glaciers in my list. The tutorial works fine with any single glacier selected from my list, but I'm still struggling with errors when running run_with_hydro on the full list.

Your help is greatly appreciated. Sorry for my ineptitude.

Holmgren825 commented 2 years ago

To adapt the tutorial to multiple glaciers, you first have to remove the index (i.e. the [0]) when you initialise the gdir. At this point it could be a good idea to call it gdirs, to reflect that it is a list of glacier directories. Then, to execute an entity task on multiple gdirs, you make use of the execute_entity_task function:

workflow.execute_entity_task(tasks.run_with_hydro, gdirs, run_task=tasks.run_constant_climate, nyears=100, y0=2014, halfsize=5, store_monthly_hydro=True, output_filesuffix=file_id);

Then, to generate a dataset with the outputs of all the glaciers, you use the compile_run_output function from the utilities:

ds = utils.compile_run_output(gdirs, input_filesuffix=file_id)

ds is an xarray Dataset; have a look at the xarray documentation for its functionality if needed!
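As a toy illustration of what working with such a dataset looks like (the variable names below are made up for the example, not OGGM's real output; only the coordinate names time, rgi_id and month_2d mimic the compiled output):

```python
import numpy as np
import xarray as xr

# Made-up stand-in for a compiled run output: one annual variable
# (dims time x rgi_id) and one monthly variable carrying the extra
# month_2d dimension.
ds = xr.Dataset(
    {
        'volume': (('time', 'rgi_id'), np.ones((3, 2))),
        'melt_monthly': (('time', 'rgi_id', 'month_2d'),
                         np.ones((3, 2, 12))),
    },
    coords={'time': [2000, 2001, 2002],
            'rgi_id': ['A', 'B'],
            'month_2d': np.arange(1, 13)},
)

# Keep only the variables without the monthly dimension, then flatten
# them into a pandas DataFrame indexed by (time, rgi_id):
sel_vars = [v for v in ds.variables if 'month_2d' not in ds[v].dims]
df_annual = ds[sel_vars].to_dataframe()
print(df_annual.shape)  # (6, 1): 3 years x 2 glaciers, 1 annual variable
```

The same selection pattern is what the hydro tutorial uses to split annual from monthly output.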

MBeedle commented 2 years ago

Thanks for your quick response, Erik. This does help.

However, I'm still struggling to modify the syntax to follow the tutorial for multiple glaciers. I've gotten a few steps further but then run into further errors. My end goal is to produce output as in the hydro tutorial but for an entire watershed. Are there links in the OGGM documentation for applications with multiple glaciers? This would be helpful. Thanks for the xarray documentation - I will review this.

fmaussion commented 2 years ago

@MBeedle applying a task to multiple glaciers works differently than applying it to one glacier:

tasks.run_with_hydro(gdir, run_task=tasks.run_constant_climate, nyears=100, y0=2014, halfsize=5, store_monthly_hydro=True, output_filesuffix=file_id)

has to be adapted to:

workflow.execute_entity_task(tasks.run_with_hydro, gdirs, run_task=tasks.run_constant_climate, nyears=100, y0=2014, halfsize=5, store_monthly_hydro=True, output_filesuffix=file_id)

I recommend running a few of the other OGGM tutorials online to get a grasp of how this (rather complex, I admit) workflow works. We work very hard on the documentation, but I can imagine that all this is confusing, and we are learning in the process.

MBeedle commented 2 years ago

I've made some progress but am still struggling to apply run_with_hydro to multiple glaciers. In particular, I want annual runoff for multiple glaciers (the 'Peak Water' analysis run for Hintereisferner in the tutorial).

I'm working with 21 glaciers:

All Surprise Creek glaciers in RGI:

rgi_ids = ['RGI60-01.04463', 'RGI60-01.04458', 'RGI60-01.04449', 'RGI60-01.04486', 'RGI60-01.04451', 'RGI60-01.04437', 'RGI60-01.04440', 'RGI60-01.04471', 'RGI60-01.04448', 'RGI60-01.04428', 'RGI60-01.04453', 'RGI60-01.04802', 'RGI60-01.04491', 'RGI60-01.04474', 'RGI60-01.04494', 'RGI60-01.04782', 'RGI60-01.04795', 'RGI60-01.04814', 'RGI60-01.04417', 'RGI60-01.04412', 'RGI60-01.04409']

Here is what I'm trying, summing the runoff_vars for each of the 21 individual glaciers:

# Create the figure
f, ax = plt.subplots(figsize=(18, 7), sharex=True)

# Loop all scenarios
for i, rcp in enumerate(['rcp26', 'rcp45', 'rcp60', 'rcp85']):
    file_id = f'_CCSM4_{rcp}'

    # Open the corresponding data.
    ds = utils.compile_run_output(gdir, input_filesuffix=file_id)

    # Select annual variables
    sel_vars = [v for v in ds.variables if 'month_2d' not in ds[v].dims]
    # And create a dataframe
    df_annual = ds[sel_vars].to_dataframe()

    # Select the variables relevant for runoff.
    runoff_vars = ds.sum('rgi_id')['melt_off_glacier'] + ds.sum('rgi_id')['melt_on_glacier'] + ds.sum('rgi_id')['liq_prcp_off_glacier'] + ds.sum('rgi_id')['liq_prcp_on_glacier']
    # Convert to mega tonnes instead of kg.
    df_runoff = df_annual[runoff_vars].clip(0) * 1e-9
    # Sum the variables each year "axis=1", take the 11 year rolling mean
    # and plot it.
    df_runoff.sum(axis=1).rolling(window=11).mean().plot(ax=ax, label=rcp,
                                                         color=sns.color_palette("rocket")[i])

ax.set_ylabel('Annual runoff (Mt)')
ax.set_xlabel('Year')
plt.title(rgi_id)
plt.legend();

pat-schmitt commented 2 years ago

Hi @MBeedle, thank you for your question. If I understand it right, you want to sum up the annual runoff per glacier, using the xarray.Dataset you get from ds = utils.compile_run_output(gdir, input_filesuffix=file_id). This is possible with the xarray.Dataset.sum method (https://xarray.pydata.org/en/stable/generated/xarray.Dataset.sum.html), as you have already tried, but you must sum over another coordinate/dimension (the dimension given is the one that is summed over and hence disappears in the result).

When you look at ds (just type ds), you see there are three main coordinates (in bold): time, rgi_id and month_2d. What you want as a result is a new dataset with the coordinates time and rgi_id, therefore you must sum over month_2d. So maybe you should replace the line

runoff_vars = ds.sum('rgi_id')['melt_off_glacier'] + ds.sum('rgi_id')['melt_on_glacier'] + ds.sum('rgi_id')['liq_prcp_off_glacier'] + ds.sum('rgi_id')['liq_prcp_on_glacier']

with

annual_runoff = ds.sum('month_2d')['melt_off_glacier'] + ds.sum('month_2d')['melt_on_glacier'] + ds.sum('month_2d')['liq_prcp_off_glacier'] + ds.sum('month_2d')['liq_prcp_on_glacier']

and use this new variable annual_runoff for plotting (for a plotting overview you can have a look at https://xarray.pydata.org/en/stable/user-guide/plotting.html). For example, to plot the annual runoff of all individual glaciers in one plot, you can use annual_runoff.plot.line(ax=ax, x="time");.

I hope this helps you; if not, maybe you can provide the entire code of the experiment.
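The distinction between the two sums can be sketched with a made-up dataset whose coordinate names mimic the compiled OGGM output (the values and the variable shape are purely illustrative):

```python
import numpy as np
import xarray as xr

# Made-up monthly melt (kg): 3 years x 2 glaciers x 12 months,
# 2 kg per glacier per month.
ds = xr.Dataset(
    {'melt_on_glacier': (('time', 'rgi_id', 'month_2d'),
                         np.full((3, 2, 12), 2.0))},
    coords={'time': [2000, 2001, 2002],
            'rgi_id': ['A', 'B'],
            'month_2d': np.arange(1, 13)},
)

# Summing over 'month_2d' removes that dimension, leaving one annual
# value per glacier and year -- the per-glacier view you want:
annual = ds.sum('month_2d')['melt_on_glacier']
print(annual.dims)  # ('time', 'rgi_id')

# Summing over 'rgi_id' instead aggregates the glaciers and keeps the
# monthly dimension, losing the per-glacier information:
basin = ds.sum('rgi_id')['melt_on_glacier']
print(basin.dims)   # ('time', 'month_2d')
```

With 2 kg per month, each glacier's annual value is 24 kg, so a quick print of annual.values is an easy sanity check before plotting with annual.plot.line(x="time").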