OpenDrift / opendrift

Open source framework for ocean trajectory modelling
https://opendrift.github.io
GNU General Public License v2.0

reader for NorFjords160 #1191

AndersOpdal opened 7 months ago

AndersOpdal commented 7 months ago

I have been using opendrift reading both the Nordic4km and NordKyst800 archive files from thredds.met.no. This has been working well. Now I am trying to read a test-archive for area A5 from NorFjords160, located at https://thredds.met.no/thredds/dodsC/romshindcast/NorFjords160/A5. Here, I encounter two problems.

  1. The archive consists of multiple nc-files, each covering a small time window. The reader cannot handle this.
  2. When trying to read just a single file using the following command:

reader_norfjords = reader_netCDF_CF_generic.Reader('https://thredds.met.no/thredds/dodsC/romshindcast/NorFjords160/A05/norfjords_160m_his.nc4_2020040101-2020040200')

I get the following message:


14:45:45 INFO opendrift.readers.reader_netCDF_CF_generic: Opening dataset: https://thredds.met.no/thredds/dodsC/romshindcast/NorFjords160/A05/norfjords_160m_his.nc4_2020040101-2020040200
14:45:45 DEBUG opendrift.readers.reader_netCDF_CF_generic: Finding coordinate variables.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\st10873\.conda\envs\opendrift\lib\site-packages\opendrift\readers\reader_netCDF_CF_generic.py", line 254, in __init__
    raise ValueError('No geospatial coordinates were detected, cannot geolocate dataset')
ValueError: No geospatial coordinates were detected, cannot geolocate dataset


Is there any convenient way to get around these two problems?

AndresSepulveda commented 7 months ago

My advice would be to use NCO to create a single file.

Check this website for the commands

http://research.jisao.washington.edu/data/nco/

knutfrode commented 7 months ago

Hi. The Nordic and NorKyst datasets are postprocessed to be CF-compliant, but this NorFjords dataset is closer to native output from the ROMS ocean model.

You can use the dedicated ROMS reader for this:

from opendrift.readers.reader_ROMS_native import Reader
r = Reader('https://thredds.met.no/thredds/dodsC/romshindcast/NorFjords160/A05/norfjords_160m_his.nc4_2020040101-2020040200')

Unfortunately there is no aggregate for this time series, only individual, daily files with hourly data from 01H until 00H. You can add one reader for each day, but there will then be a gap in the data between 00 and 01 each day. If you use an hourly time step for the calculations, this will not be a problem.

You could download and merge files using NCO or CDO, which would "close the gaps", as OpenDrift will interpolate linearly in time within a given dataset - but not between independent datasets.
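The "interpolate linearly in time within a given dataset" behaviour can be sketched with plain Python (illustration only, not OpenDrift internals): a field value at an intermediate time is a linear blend of the two bracketing hourly fields, so within the 00-01 gap between two independent datasets no such blend is possible and values stay constant.

```python
def lerp_in_time(t, t0, t1, v0, v1):
    """Linear time interpolation, as done between two hourly fields
    within a single dataset (t in hours, v any field value)."""
    w = (t - t0) / (t1 - t0)
    return (1 - w) * v0 + w * v1

# Hourly eastward velocity at 00:00 (0.10 m/s) and 01:00 (0.30 m/s);
# a 15-minute step at 00:15 lands a quarter of the way between them.
u = lerp_in_time(0.25, 0.0, 1.0, 0.10, 0.30)
print(u)  # ~0.15 m/s
```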

knutfrode commented 7 months ago

A note about the gap problem: a pragmatic workaround could be to add NorKyst (800 m resolution) as a backup reader AFTER adding the NorFjords readers:

https://thredds.met.no/thredds/dodsC/sea/norkyst800m/1h/aggregate_be

This will then be used in the gaps between 00 and 01, in case you are using a finer time step than 1 hour. And for simulations at very fine scale you would probably want something like a 15 minute time step.

Also note that you may add many readers in one go with the method add_readers_from_list, and these can be either CF-generic or ROMS-native urls/files: o.add_readers_from_list([<filename/url1>, <filename/url2>, ...., <NorKyst aggregate url>])

AndersOpdal commented 7 months ago

Thank you for the good advice. I will use NCO or CDO to merge the files locally. The functionality of several readers is clever; I have used that previously with NorKyst800 and Nordic4km. Thanks also for the suggestion of using 15 minute time steps at 160 m resolution.

gauteh commented 7 months ago

You could also maybe do something like this: https://github.com/knutfrode/concepts/blob/main/Open_MFDataset_overlap.ipynb and send in the ready dataset - then it should appear as one big dataset. That works on the generic reader, but maybe we have not fixed that for the ROMS reader yet.
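A self-contained sketch of the idea in that notebook, using synthetic xarray datasets in place of the real THREDDS files (the one-hour overlap and the variable name u are assumptions for illustration): concatenate the daily pieces along time, then drop duplicated timestamps so the result looks like one continuous dataset.

```python
import numpy as np
import xarray as xr

# Two synthetic "daily" datasets whose time axes share one timestamp,
# standing in for consecutive model-output files.
t1 = np.arange("2019-12-24T01", "2019-12-25T01", dtype="datetime64[h]")  # 24 h
t2 = np.arange("2019-12-25T00", "2019-12-26T01", dtype="datetime64[h]")  # 25 h
ds1 = xr.Dataset({"u": ("time", np.arange(t1.size, dtype=float))},
                 coords={"time": t1})
ds2 = xr.Dataset({"u": ("time", np.arange(t2.size, dtype=float))},
                 coords={"time": t2})

# Concatenate along time, then keep only the first value at duplicated
# timestamps, yielding a single monotonic time axis.
merged = xr.concat([ds1, ds2], dim="time")
merged = merged.sel(time=~merged.get_index("time").duplicated())
```

With real data the pieces would come from xr.open_mfdataset as in the linked notebook, and the merged dataset would then be handed to the generic reader; as noted above, it is uncertain whether the ROMS-native reader accepts a ready dataset yet.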

AndersOpdal commented 7 months ago

OK. I've tried a few things with varying success. My first plan was to install the NCO package and concatenate the NorFjords160 nc-files, but I could not make this work. I successfully installed the nco package through the Miniconda terminal using conda install -c conda-forge pynco. This works, and I can find the package together with the other conda packages, but calling it from Python using from nco import Nco does not work. I use a Windows machine, so there might be some issues there.

In the meantime, something weird has happened, and now I can't even run simple simulations that used to work fine:

from opendrift.readers import reader_netCDF_CF_generic
from datetime import timedelta, datetime
from opendrift.models.oceandrift import OceanDrift
RNK = reader_netCDF_CF_generic.Reader('https://thredds.met.no/thredds/dodsC/sea/norkyst800m/1h/aggregate_be')
o = OceanDrift(loglevel=20)
o.add_reader(RNK)
o.seed_elements(lon=7.47, lat=61.43, z=100, number=1, time=datetime(2019,12,25))
o.run(duration=timedelta(hours=3), time_step=3600, time_step_output=3600)

which yields the following log:

14:55:37 INFO opendrift.models.basemodel: Fallback values will be used for the following variables which have no readers:
14:55:37 INFO opendrift.models.basemodel: ocean_vertical_diffusivity: 0.000000
14:55:37 INFO opendrift.models.basemodel: sea_surface_wave_significant_height: 0.000000
14:55:37 INFO opendrift.models.basemodel: sea_surface_wave_stokes_drift_x_velocity: 0.000000
14:55:37 INFO opendrift.models.basemodel: sea_surface_wave_stokes_drift_y_velocity: 0.000000
14:55:37 INFO opendrift.models.basemodel: sea_surface_wave_period_at_variance_spectral_density_maximum: 0.000000
14:55:37 INFO opendrift.models.basemodel: sea_surface_wave_mean_period_from_variance_spectral_density_second_frequency_moment: 0.000000
14:55:37 INFO opendrift.models.basemodel: surface_downward_x_stress: 0.000000
14:55:37 INFO opendrift.models.basemodel: surface_downward_y_stress: 0.000000
14:55:37 INFO opendrift.models.basemodel: turbulent_kinetic_energy: 0.000000
14:55:37 INFO opendrift.models.basemodel: turbulent_generic_length_scale: 0.000000
14:55:37 INFO opendrift.models.basemodel: Adding a dynamical landmask with max. priority based on assumed maximum speed of 1 m/s. Adding a customised landmask may be faster...
14:55:37 INFO opendrift.models.basemodel: Using existing reader for land_binary_mask
14:55:37 INFO opendrift.models.basemodel: All points are in ocean
14:55:37 INFO opendrift.models.basemodel: 2019-12-25 00:00:00 - step 1 of 3 - 1 active elements (0 deactivated)
14:55:37 INFO opendrift.models.basemodel: 2019-12-25 01:00:00 - step 2 of 3 - 1 active elements (0 deactivated)
14:55:37 WARNING opendrift.readers.basereader.structured: Data block from https://thredds.met.no/thredds/dodsC/sea/norkyst800m/1h/aggregate_be not large enough to cover element positions within timestep. Buffer size (7) must be increased. See Variables.set_buffer_size.

It doesn't seem to read anything from the reader. Am I missing something here? This simple code used to work fine.

knutfrode commented 7 months ago

Yes, Windows and NCO is a bit tricky. But I believe you don't need the Python bindings, as you typically run NCO from the command line. Here is some discussion regarding installation on Windows: https://sourceforge.net/p/nco/discussion/9830/thread/59fe446d/ ncks is typically the tool to use to download, subset and concatenate files: https://linux.die.net/man/1/ncks

Your sample lines work well for me. By the way, I recommend always using loglevel=0; then you will see more specifically what is going on (and going wrong).

AndersOpdal commented 7 months ago

OK. Great. I'm glad the code works; then there is something else on my machine causing problems. Yes, I normally also use loglevel=0. Not sure why it was set to 20 here.

Thanks also for the tips on NCO.

knutfrode commented 7 months ago

Downloading and merging with NCO/CDO is a bit cumbersome, and it might become quite a few GB. Thus you could consider one of the two suggestions above: 1) adding individual readers with NorKyst as a backup, to be used between 00 and 01 each day, or 2) the handmade aggregate demonstrated in the link provided by Gaute.

AndersOpdal commented 7 months ago

Indeed. I managed to add all NorFjords160 readers using add_readers_from_list, with all NorFjords160 urls in a list and the NorKyst800 url at the bottom of the list. Or should I rather make a separate statement for adding the NorKyst800 reader? E.g.

o.add_readers_from_list(list_name)
o.add_reader(NordKyst800)

However, this new (machine?) problem needs to be solved first.

knutfrode commented 7 months ago

What you did should be equivalent to adding NorKyst as the last element of the list.

You may run pytest in the main opendrift folder to check whether there is something wrong with your installation. On Linux you simply type pytest on the command line; I guess it is the same on Windows.

AndersOpdal commented 7 months ago

Okay. Sorry for spamming the forum here with ignorant questions. I managed to re-install opendrift, and basic code is now working again. The built-in function for adding readers directly from a list of e.g. thredds urls, o.add_readers_from_list(), appears to work well. However, the default setting lazy=True in my case, for some reason, causes the o.run() function to read only the last entry in the list. Setting lazy=False fixes this issue.

As such, I can now easily add multiple NorFjords160 readers from a list:

NF_list=['https://thredds.met.no/thredds/dodsC/romshindcast/NorFjords160/A05/norfjords_160m_his.nc4_2019122401-2019122500',
         'https://thredds.met.no/thredds/dodsC/romshindcast/NorFjords160/A05/norfjords_160m_his.nc4_2019122501-2019122600']
o.add_readers_from_list(NF_list, lazy=False)
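
For longer simulations, the daily URL list can be generated programmatically instead of written by hand, assuming the file-naming pattern above (YYYYMMDD01-YYYYMMDD00) holds for every day in the period:

```python
from datetime import date, timedelta

# URL template inferred from the two hand-written entries above.
BASE = ('https://thredds.met.no/thredds/dodsC/romshindcast/NorFjords160/A05/'
        'norfjords_160m_his.nc4_{:%Y%m%d}01-{:%Y%m%d}00')

def norfjords_urls(start, ndays):
    """One daily-file URL per day, starting from date `start`."""
    return [BASE.format(start + timedelta(days=i), start + timedelta(days=i + 1))
            for i in range(ndays)]

NF_list = norfjords_urls(date(2019, 12, 24), 2)
# NF_list[0] ends with ..._2019122401-2019122500, matching the list above.
```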

Running the model works fine, but it does not interpolate between 00:00 and 01:00 across files, i.e. with 15 minute time steps it gives 4 identical outputs for this hour.

o.seed_elements(lon=7.47, lat=61.43, z=-200, number=1, time=datetime(2019,12,24,21))
o.run(duration=timedelta(hours=6), time_step=15*60, time_step_output=15*60)

Ideally, adding the NorKyst800 reader at the end of the list, or in the next line of code, should mitigate this problem.

NK = reader_netCDF_CF_generic.Reader('https://thredds.met.no/thredds/dodsC/sea/norkyst800m/1h/aggregate_be')
o.add_readers_from_list(NF_list, lazy=False)
o.add_reader(NK)
o.seed_elements(lon=7.47, lat=61.43, z=-200, number=1, time=datetime(2019,12,24,21))
o.run(duration=timedelta(hours=6), time_step=15*60, time_step_output=15*60)

Indeed, the o.run() simulation uses the first reader in NF_list between 21:00 and 00:00, after which it switches to the NK reader. Whether it keeps reading from the NK reader after 01:00, or continues with the next reader in NF_list, is not clear to me.

However, the "new" challenge is that the sea floor depth for NorFjords160 readers is ca 230 m at this position, while it is ca 140 m in the NK-reader . This causes the particle to be "lifted" from ca 200 m (the initial z) to ca 140 m at 00:00, and consequently remains there for the rest of the simulation. This clearly makes sense, considering there are no velocity-fields available deeper than 140 m in the NordKyst800 reader, but is somewhat inconvenient when tracking particles close to the bottom.