ArcticSnow / TopoPyScale

TopoPyScale: a Python library to perform simplistic climate downscaling at the hillslope scale
https://topopyscale.readthedocs.io
MIT License

Performing the downscaling step returns a ValueError. #62

Closed NILICK closed 1 year ago

NILICK commented 1 year ago

I tried to run example 2 to apply TopoPyScale. It downloaded the climate data successfully, but when I performed the downscaling step, it returned the error below. How can I solve it?

---> Downscaling climate to list of points using TopoScale
/.../Anaconda_Projects/Climate_DownScale/TopoPyScale/Examples/TopoPyScale_examples-main/ex2_romania_retezat/outputs/tmp cleaned
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/concat.py:227, in concat(objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs)
    226 try:
--> 227     first_obj, objs = utils.peek_at(objs)
    228 except StopIteration:

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/utils.py:188, in peek_at(iterable)
    187 gen = iter(iterable)
--> 188 peek = next(gen)
    189 return peek, itertools.chain([peek], gen)

StopIteration: 

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[7], line 3
      1 # ========= STEP 4 ==========
      2 # Perform the downscaling
----> 3 mp.downscale_climate()

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/TopoPyScale/topoclass.py:484, in Topoclass.downscale_climate(self)
    481             os.remove(file)
    483 else:
--> 484     ta.downscale_climate(self.config.project.directory,
    485                          self.toposub.df_centroids,
    486                          self.da_horizon,
    487                          self.ds_solar,
    488                          self.config.dem.epsg,
    489                          self.config.project.start,
    490                          self.config.project.end,
    491                          self.config.toposcale.interpolation_method,
    492                          self.config.toposcale.LW_terrain_contribution,
    493                          self.config.climate[self.config.project.climate].timestep,
    494                          self.config.climate.precip_lapse_rate,
    495                          self.config.outputs.file.downscaled_pt,
    496                          self.config.project.CPU_cores)
    498 self.downscaled_pts = ta.read_downscaled(
    499     self.config.project.directory + 'outputs/downscaled/' + self.config.outputs.file.downscaled_pt)
    500 # update plotting class variables

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/TopoPyScale/topo_scale.py:199, in downscale_climate(project_directory, df_centroids, horizon_da, ds_solar, target_EPSG, start_date, end_date, interp_method, lw_terrain_flag, tstep, precip_lapse_rate_flag, file_pattern, n_core)
    196     ds_ = None
    197     ds_tmp = None
--> 199 ds_plev = _open_dataset_climate(flist_PLEV).sel(time=tvec.values)
    201 row_list = []
    202 ds_list = []

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/TopoPyScale/topo_scale.py:161, in downscale_climate.<locals>._open_dataset_climate(flist)
    158 for file in flist:
    159     ds__list.append(xr.open_dataset(file))
--> 161 ds_ = xr.concat(ds__list, dim='time')
    162 # this block handles the expver dimension that is in downloaded ERA5 data if data is ERA5/ERA5T mix. If only ERA5 or
    163 # only ERA5T it is not present. ERA5T data can be present in the timewindow T-5days to T -3months, where T is today.
    164 # https://code.mpimet.mpg.de/boards/1/topics/8961
    165 # https://confluence.ecmwf.int/display/CUSF/ERA5+CDS+requests+which+return+a+mixture+of+ERA5+and+ERA5T+data
    166 try:
    167     # in case of there being an expver dimension and it has two or more values, select first value

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/concat.py:229, in concat(objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs)
    227     first_obj, objs = utils.peek_at(objs)
    228 except StopIteration:
--> 229     raise ValueError("must supply at least one object to concatenate")
    231 if compat not in _VALID_COMPAT:
    232     raise ValueError(
    233         f"compat={compat!r} invalid: must be 'broadcast_equals', 'equals', 'identical', 'no_conflicts' or 'override'"
    234     )

ValueError: must supply at least one object to concatenate
ArcticSnow commented 1 year ago

Hey @NILICK ! Has the path in the config file been updated to your setup? It seems the problem is that it does not find the files. Running the example as in the repo works for me.

cd to/your/project/path
# verify all the config settings as explained in the online documentation

# then run
python pipeline.py
NILICK commented 1 year ago

Hey @NILICK ! Has the path in the config file been updated to your setup? It seems the problem is that it does not find the files. Running the example as in the repo works for me.

Hi @ArcticSnow, thanks for your attention. After multiple attempts to solve the problem, I also suspected the file paths, so I checked them, but I think they are correct. In my local config.yml file (below), I changed the directory and also the start and end dates. Is it necessary to change the climate path in the config file? Is this path, path: inputs/climate/, correct?

project:
    name: Retezat
    description: Downscaling for the Retezat mountains
    authors:
        - Filhol S.
    date: Nov 2021
    directory: /mnt/4098F/Anaconda_Projects/Climate_DownScale/TopoPyScale/Examples/TopoPyScale_examples-main/ex2_romania_retezat/
    start: 2020-03-01
    end: 2020-06-01
    split:
        IO: False
        time: 1  # run individual batch in time
        space: None  # to be implemented
    extent: 
    CPU_cores: 4
    climate: era5

climate:
    era5:
        path: inputs/climate/
        product: reanalysis
        timestep: 1H
        plevels: [700,750,800,850,900,950,1000]
        download_threads: 12

dem:
    file: SRTM_90m_retezat.tif
    epsg: 3844
    horizon_increments: 10

sampling:
    method: toposub
    points:
        csv_file: pt_list_retezat.csv

    toposub:
        clustering_method: minibatchkmean
        n_clusters: 10
        random_seed: 2
        clustering_features: {'x':1, 'y':1, 'elevation':4, 'slope':1, 'aspect_cos':1, 'aspect_sin':1, 'svf':1}

toposcale:
    interpolation_method: idw
    pt_sampling_method: nearest
    LW_terrain_contribution: True

outputs:
    variables: all  # list or combination name
    file:
        clean_outputs: True
        clean_FSM: True
        df_centroids: df_centroids.pck
        ds_param: ds_param.nc
        ds_solar: ds_solar.nc
        da_horizon: da_horizon.nc
        landform: landform.tif
        downscaled_pt: down_pt_*.nc
ArcticSnow commented 1 year ago

climate path is for the case where your climate forcing is stored somewhere other than myproject/inputs/climate. So if you follow the project file structure, it should find the climate files there. Are your climate files in that folder? Just to make sure.
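A quick sanity check (a minimal sketch; the project path is a placeholder for your own setup) is to run the same glob patterns yourself, anchored at the project directory, and count the hits:

```python
import glob
import os

# hypothetical path: replace with the 'directory' entry of your config.yml
project_directory = '/path/to/ex2_romania_retezat/'

# same file patterns TopoPyScale looks for, anchored at the project directory
flist_PLEV = sorted(glob.glob(os.path.join(project_directory, 'inputs/climate/PLEV*.nc')))
flist_SURF = sorted(glob.glob(os.path.join(project_directory, 'inputs/climate/SURF*.nc')))

print(len(flist_PLEV), 'PLEV files,', len(flist_SURF), 'SURF files')
# an empty list here is exactly what leads to
# "must supply at least one object to concatenate"
```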

Then I do not quite know what it may be. You are running python 3.9, and installed xarray with conda, right? I am a bit puzzled.

NILICK commented 1 year ago

climate path is for the case where your climate forcing is stored somewhere other than myproject/inputs/climate. So if you follow the project file structure, it should find the climate files there. Are your climate files in that folder? Just to make sure.

Then I do not quite know what it may be. You are running python 3.9, and installed xarray with conda, right? I am a bit puzzled.

Yes, my downloaded climate files are stored in the inputs/climate folder. I created a new environment in JupyterLab with Python 3.9, but I installed all packages with pip, not conda.

NILICK commented 1 year ago

I checked my xarray version installed:

(downscaling) :~$ pip show xarray
Name: xarray
Version: 2023.2.0
Summary: N-D labeled arrays and datasets in Python
Home-page: https://github.com/pydata/xarray
Author: xarray Developers
Author-email: xarray@googlegroups.com
License: Apache-2.0
NILICK commented 1 year ago

@ArcticSnow, I set up a new environment with conda, tried multiple times, and now I get a new error, shown below:

No ERA5T  PRESSURE data present with additional dimension <expver>
Traceback (most recent call last):
  File "/mnt/864424144424098F/Anaconda_Projects/Climate_DownScale/TopoPyScale/Examples/TopoPyScale_examples-main/ex2_romania_retezat/pipeline.py", line 15, in <module>
    mp.downscale_climate()
  File "/home/nilik/miniconda3/envs/downscaling/lib/python3.9/site-packages/TopoPyScale/topoclass.py", line 484, in downscale_climate
    ta.downscale_climate(self.config.project.directory,
  File "/home/nilik/miniconda3/envs/downscaling/lib/python3.9/site-packages/TopoPyScale/topo_scale.py", line 199, in downscale_climate
    ds_plev = _open_dataset_climate(flist_PLEV).sel(time=tvec.values)
  File "/home/nilik/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/dataset.py", line 2572, in sel
    query_results = map_index_queries(
  File "/home/nilik/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/indexing.py", line 183, in map_index_queries
    results.append(index.sel(labels, **options))
  File "/home/nilik/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/indexes.py", line 478, in sel
    raise KeyError(f"not all values found in index {coord_name!r}")
KeyError: "not all values found in index 'time'"
ArcticSnow commented 1 year ago

Can you open directly the files with xarray? It seems that the coordinate time is having a problem.

NILICK commented 1 year ago

Can you open directly the files with xarray? It seems that the coordinate time is having a problem.

Yes, I can open the SURF and PLEV files directly with xarray, and both file types have time as a coordinate. Based on lines 149-161 of topo_scale.py, I tested my SURF and PLEV files as below, and everything seems to be ok:

import xarray as xr
import glob

flist_PLEV = glob.glob('PLEV*.nc')
flist_PLEV.sort()
flist_PLEV

['PLEV_202001.nc', 'PLEV_202002.nc', 'PLEV_202003.nc', 'PLEV_202004.nc', 'PLEV_202005.nc', 'PLEV_202006.nc', 'PLEV_202007.nc', 'PLEV_202008.nc', 'PLEV_202009.nc', 'PLEV_202010.nc', 'PLEV_202011.nc', 'PLEV_202012.nc']

ds__list = []
for file in flist_PLEV:
    ds__list.append(xr.open_dataset(file))

ds_ = xr.concat(ds__list, dim='time')
ds_


NILICK commented 1 year ago

New try: in topo_scale.py, lines 149-150, I changed from:

flist_PLEV = glob.glob('inputs/climate/PLEV*.nc')
flist_SURF = glob.glob('inputs/climate/SURF*.nc')

to:

flist_PLEV = glob.glob(f'{project_directory}inputs/climate/PLEV*.nc')
flist_SURF = glob.glob(f'{project_directory}inputs/climate/SURF*.nc')

After that I got the error below; my files do not have an expver dimension.

---> Downscaling climate to list of points using TopoScale
/mnt/864424144424098F/Anaconda_Projects/Climate_DownScale/Caracal/Topopyscale_prj_khaeiz/outputs/tmp cleaned
No ERA5T PRESSURE data present with additional dimension <expver>
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[5], line 3
      1 # ========= STEP 4 ==========
      2 # Perform the downscaling
----> 3 mp.downscale_climate()

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/TopoPyScale/topoclass.py:484, in Topoclass.downscale_climate(self)
    481             os.remove(file)
    483 else:
--> 484     ta.downscale_climate(self.config.project.directory,
    485                          self.toposub.df_centroids,
    486                          self.da_horizon,
    487                          self.ds_solar,
    488                          self.config.dem.epsg,
    489                          self.config.project.start,
    490                          self.config.project.end,
    491                          self.config.toposcale.interpolation_method,
    492                          self.config.toposcale.LW_terrain_contribution,
    493                          self.config.climate[self.config.project.climate].timestep,
    494                          self.config.climate.precip_lapse_rate,
    495                          self.config.outputs.file.downscaled_pt,
    496                          self.config.project.CPU_cores)
    498 self.downscaled_pts = ta.read_downscaled(
    499     self.config.project.directory + 'outputs/downscaled/' + self.config.outputs.file.downscaled_pt)
    500 # update plotting class variables

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/TopoPyScale/topo_scale.py:199, in downscale_climate(project_directory, df_centroids, horizon_da, ds_solar, target_EPSG, start_date, end_date, interp_method, lw_terrain_flag, tstep, precip_lapse_rate_flag, file_pattern, n_core)
    196     ds_ = None
    197     ds_tmp = None
--> 199 ds_plev = _open_dataset_climate(flist_PLEV).sel(time=tvec.values)
    201 row_list = []
    202 ds_list = []

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/dataset.py:2572, in Dataset.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   2511 """Returns a new dataset with each array indexed by tick labels
   2512 along the specified dimension(s).
   2513 
   (...)
   2569 DataArray.sel
   2570 """
   2571 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
-> 2572 query_results = map_index_queries(
   2573     self, indexers=indexers, method=method, tolerance=tolerance
   2574 )
   2576 if drop:
   2577     no_scalar_variables = {}

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/indexing.py:183, in map_index_queries(obj, indexers, method, tolerance, **indexers_kwargs)
    181         results.append(IndexSelResult(labels))
    182     else:
--> 183         results.append(index.sel(labels, **options))
    185 merged = merge_sel_results(results)
    187 # drop dimension coordinates found in dimension indexers
    188 # (also drop multi-index if any)
    189 # (.sel() already ensures alignment)

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/indexes.py:478, in PandasIndex.sel(self, labels, method, tolerance)
    476     indexer = get_indexer_nd(self.index, label_array, method, tolerance)
    477     if np.any(indexer < 0):
--> 478         raise KeyError(f"not all values found in index {coord_name!r}")
    480 # attach dimension names and/or coordinates to positional indexer
    481 if isinstance(label, Variable):

KeyError: "not all values found in index 'time'"
NILICK commented 1 year ago

My last error shows that I have a problem with time=tvec.values in lines 199 and 147 of topo_scale.py. I think I found the source of the error. I ran the code below and it returns the previous error, so if I am on the right track it seems my problem is with tvec.values.

import xarray as xr
import glob
import pandas as pd

tvec = pd.date_range("2020-01-01", pd.to_datetime("2021-01-01") + pd.to_timedelta('1D'), freq="1H", closed='left')
flist_PLEV = glob.glob('PLEV*.nc')
flist_PLEV.sort()

ds__list = []
for file in flist_PLEV:
    ds__list.append(xr.open_dataset(file))

ds_ = xr.concat(ds__list, dim='time')

ds_plev = ds_.sel(time=tvec.values)

ERROR:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[9], line 1
----> 1 ds_plev = ds_.sel(time=tvec.values)

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/dataset.py:2572, in Dataset.sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   2511 """Returns a new dataset with each array indexed by tick labels
   2512 along the specified dimension(s).
   2513 
   (...)
   2569 DataArray.sel
   2570 """
   2571 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
-> 2572 query_results = map_index_queries(
   2573     self, indexers=indexers, method=method, tolerance=tolerance
   2574 )
   2576 if drop:
   2577     no_scalar_variables = {}

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/indexing.py:183, in map_index_queries(obj, indexers, method, tolerance, **indexers_kwargs)
    181         results.append(IndexSelResult(labels))
    182     else:
--> 183         results.append(index.sel(labels, **options))
    185 merged = merge_sel_results(results)
    187 # drop dimension coordinates found in dimension indexers
    188 # (also drop multi-index if any)
    189 # (.sel() already ensures alignment)

File ~/miniconda3/envs/downscaling/lib/python3.9/site-packages/xarray/core/indexes.py:478, in PandasIndex.sel(self, labels, method, tolerance)
    476     indexer = get_indexer_nd(self.index, label_array, method, tolerance)
    477     if np.any(indexer < 0):
--> 478         raise KeyError(f"not all values found in index {coord_name!r}")
    480 # attach dimension names and/or coordinates to positional indexer
    481 if isinstance(label, Variable):

KeyError: "not all values found in index 'time'"
NILICK commented 1 year ago

I solved my problem: the source of the error was the end date in the config file. Unfortunately, I had entered a wrong date for the date range.

ArcticSnow commented 1 year ago

@NILICK Glad you figured it out. We could implement a check that the downscaling time range lies within the period available in the forcing.
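Such a check could look something like this (a minimal sketch under assumed names; check_time_range is hypothetical and not part of the TopoPyScale API):

```python
import pandas as pd
import xarray as xr

def check_time_range(ds: xr.Dataset, start: str, end: str) -> None:
    """Raise a readable error if [start, end] is not covered by ds.time."""
    t_min = pd.to_datetime(ds.time.values.min())
    t_max = pd.to_datetime(ds.time.values.max())
    if pd.to_datetime(start) < t_min or pd.to_datetime(end) > t_max:
        raise ValueError(
            f'Requested period {start} to {end} is outside the forcing '
            f'coverage {t_min} to {t_max}. Check project.start/end in config.yml.'
        )
```

Calling this right after opening the concatenated PLEV/SURF datasets would turn the cryptic KeyError into a message pointing straight at the config dates.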

NILICK commented 1 year ago

@NILICK Glad you figured it out. We could implement a check that the downscaling time range lies within the period available in the forcing.

@ArcticSnow, thank you for your attention. That is a good suggestion for improving the package. I think there is another necessary improvement, based on the comments above. I tested the following improvement suggestions in topo_scale.py:

lines 153-154

from:

flist_PLEV = glob.glob('inputs/climate/PLEV*.nc')
flist_SURF = glob.glob('inputs/climate/SURF*.nc')

to:

flist_PLEV = glob.glob(f'{project_directory}inputs/climate/PLEV*.nc')
flist_SURF = glob.glob(f'{project_directory}inputs/climate/SURF*.nc')

line 199

from:

ds_tmp.to_netcdf('outputs/tmp/ds_{type}_pt_{pt_id}.nc', engine='h5netcdf', encoding=encoding)

to:

ds_tmp.to_netcdf(f'{project_directory}outputs/tmp/ds_{type}_pt_{pt_id}.nc', engine='h5netcdf', encoding=encoding)

lines 233-234

from:

surf_pt_list.append(xr.open_dataset('outputs/tmp/ds_surf_pt_{i}.nc', engine='h5netcdf')) 
plev_pt_list.append(xr.open_dataset('outputs/tmp/ds_plev_pt_{i}.nc', engine='h5netcdf'))

to:

surf_pt_list.append(xr.open_dataset(f'{project_directory}outputs/tmp/ds_surf_pt_{i}.nc', engine='h5netcdf')) 
plev_pt_list.append(xr.open_dataset(f'{project_directory}outputs/tmp/ds_plev_pt_{i}.nc', engine='h5netcdf'))

lines 394-395

from:

down_pt = xr.open_dataset('outputs/tmp/down_pt_{}.nc'.format(str(pt_id).zfill(n_digits)), engine='h5netcdf')
surf_interp = xr.open_dataset('outputs/tmp/surf_interp_{}.nc'.format(str(pt_id).zfill(n_digits)), engine='h5netcdf') 

to:

down_pt = xr.open_dataset(project_directory + 'outputs/tmp/down_pt_{}.nc'.format(str(pt_id).zfill(n_digits)), engine='h5netcdf')
surf_interp = xr.open_dataset(project_directory + 'outputs/tmp/surf_interp_{}.nc'.format(str(pt_id).zfill(n_digits)), engine='h5netcdf') 
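As a side note, these string concatenations depend on project_directory ending with a slash. Building the paths with pathlib makes them slash-safe; a hypothetical sketch (climate_files is not an actual TopoPyScale function):

```python
from pathlib import Path

def climate_files(project_directory, pattern='PLEV*.nc'):
    """Return sorted climate files under <project>/inputs/climate.

    Path joining tolerates a missing or extra trailing slash in
    project_directory, unlike plain string concatenation.
    """
    return sorted((Path(project_directory) / 'inputs' / 'climate').glob(pattern))
```

The same pattern would apply to the outputs/tmp paths above.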
ArcticSnow commented 1 year ago

Thank you for the pointers @NILICK ! I will review those lines