OpenDrift / opendrift

Open source framework for ocean trajectory modelling
https://opendrift.github.io
GNU General Public License v2.0

Unable to add reader from a local file and import landmask #337

Closed mevo-creator closed 3 years ago

mevo-creator commented 4 years ago

Hello Team, First of all, congratulations, and thanks so much for developing easy-to-use open source trajectory modelling software. I'm new to Python and hope I can use this software for my PhD research here at the University of Stavanger in Norway. I've installed OpenDrift and am trying to run the tutorials and examples. However, I get an error when trying to add a reader from a local file, and also when trying to import the landmask. Can someone please guide me on this issue? I'm pasting the errors below. Thanks in advance!

-------------Error while trying to add reader from a local file--------------

reader_norkyst = reader_netCDF_CF_generic.Reader('norkyst800_16Nov2015.nc')
13:40:31 INFO: Opening dataset: norkyst800_16Nov2015.nc
13:40:31 INFO: Opening file with Dataset
Traceback (most recent call last):
  File "C:\Users\mevo\.conda\envs\opendrift\lib\site-packages\opendrift\readers\reader_netCDF_CF_generic.py", line 142, in __init__
    self.Dataset = Dataset(filename, 'r')
  File "netCDF4\_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4\_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'norkyst800_16Nov2015.nc'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mevo\.conda\envs\opendrift\lib\site-packages\opendrift\readers\reader_netCDF_CF_generic.py", line 144, in __init__
    raise ValueError(e)
ValueError: [Errno 2] No such file or directory: b'norkyst800_16Nov2015.nc'

-------------------- Error when importing landmask--------------------

reader_landmask = reader_global_landmask.Reader(extent=[2, 8, 59, 63])
13:43:39 DEBUG: Adding new variable mappings
ERROR:root:could not verify read permissions for group and others on landmask.
Traceback (most recent call last):
  File "C:\Users\mevo\.conda\envs\opendrift\lib\site-packages\opendrift_landmask_data-0.6-py3.8.egg\opendrift_landmask_data\mask.py", line 77, in __check_permissions__
    if not os.stat(self.lockf).st_mode & 0o777 == 0o777:
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:\Users\mevo\AppData\Local\Temp\landmask\.mask.dat.lock'

knutfrode commented 4 years ago

Hi,

This seems to be a recent bug for Windows, related to file permissions for the landmask. Can you try to delete any files in the folder C:\Users\mevo\AppData\Local\Temp\landmask\ (or simply the whole folder) and try again?
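If it helps, the folder can also be removed from Python. This is a sketch, not part of OpenDrift; the path below is assembled from the system temp directory plus the landmask subfolder seen in the traceback above:

```python
import os
import shutil
import tempfile

# The cached landmask lives under the system temp dir (per the traceback above).
landmask_dir = os.path.join(tempfile.gettempdir(), 'landmask')

# Remove the whole folder; it is regenerated on the next Reader() call.
if os.path.isdir(landmask_dir):
    shutil.rmtree(landmask_dir)
```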

If it does not work, I will have a closer look at the code in a few days.

mevo-creator commented 4 years ago

Hi, thank you for the prompt reply. I am getting the error below after removing the folder.

reader_landmask = reader_global_landmask.Reader(extent=[2, 8, 59, 63])
ERROR:root:fcntl not available on this platform, concurrent generations of landmask (different threads or processes) might cause failing landmask generation. Make sure only one instance of landmask is running on the system the first time.

mevo-creator commented 4 years ago

I also need to mention that after deleting the landmask folder, when I run the command "reader_landmask = reader_global_landmask.Reader(extent=[2, 8, 59, 63])", the landmask folder is created again and I can also see the mask.dat file in the folder.

gauteh commented 4 years ago

Hi, the landmask is probably generated as it should be. The message is notifying you that on Windows it is not possible to lock the landmask during generation. Normally this is only shown once. I think the permissions message is shown every time (on Windows), but it does not stop the reader.

gauteh commented 4 years ago

Fixed the landmask error message in https://github.com/OpenDrift/opendrift-landmask-data/commit/a7479b0ff159791f5232387f530b4f1d15ad3acd, but have not made a new release yet.

mevo-creator commented 4 years ago

Hi, thanks for the reply. Can I somehow use the OpenDrift version where this error is fixed but not yet released?

mevo-creator commented 4 years ago

When trying to run some examples, I'm getting the error below: "FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:\Users\2915803\AppData\Local\Temp\landmask\.mask.dat.lock'". Do you still think that the landmask is generated as usual? Please let me know, so that I can ignore the error for future simulations. Thanks!

17:01:31 INFO: Adding a dynamical landmask with max. priority based on assumed maximum speed of 1.3 m/s. Adding a customised landmask may be faster...
ERROR:root:could not verify read permissions for group and others on landmask.
Traceback (most recent call last):
  File "C:\Miniconda\envs\opendrift\lib\site-packages\opendrift_landmask_data-0.6-py3.8.egg\opendrift_landmask_data\mask.py", line 77, in __check_permissions__
    if not os.stat(self.lockf).st_mode & 0o777 == 0o777:
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:\Users\2915803\AppData\Local\Temp\landmask\.mask.dat.lock'
17:01:36 INFO: Using existing reader for land_binary_mask
17:01:37 INFO: Moving 1 out of 2000 points from land to water
17:01:37 INFO: Oil-water surface tension is 0.022288 Nm

gauteh commented 4 years ago

Yes, you can ignore it. The error will disappear in the next version. The other error, about the missing nc file, you have to fix somehow. You can also use the unreleased version, but it is not necessary. See the docs for an example. You would have to do the same for the opendrift-landmask-data package.


mevo-creator commented 4 years ago

Thanks so much, I appreciate your help! I'm planning to use OpenDrift for simulating polymers (so basically plastics) in the North Sea, so I plan to use PlastDrift. My plan is to first go through the tutorials and examples in the gallery to understand how the model works, and then use it for my application. I would like to run simulations for 3-5 years after the discharge of polymers; I hope this is possible in OpenDrift (PlastDrift). It would be helpful if you could guide me a bit along the way!

knutfrode commented 4 years ago

Yes, this is all possible with the PlastDrift module. But you will need a lot of input/forcing data for such a long-term simulation.

I believe the other problem, with the missing netCDF file, is that the file is not in your working folder. This specific file may be accessed with an absolute path, as in several of the example scripts, e.g. https://opendrift.github.io/gallery/example_generic.html#sphx-glr-gallery-example-generic-py
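As a sketch of that suggestion (the download location below is a hypothetical example, not from the thread), you can first check that the file exists and then hand the reader an absolute path:

```python
import os

# Hypothetical path; adjust to wherever the file was actually downloaded.
ncfile = os.path.join(os.path.expanduser('~'), 'Downloads', 'norkyst800_16Nov2015.nc')

print('File exists:', os.path.exists(ncfile))
print('Absolute path passed to the reader:', ncfile)

# With the file in place, the reader takes the full path, avoiding
# dependence on the current working directory:
# from opendrift.readers import reader_netCDF_CF_generic
# reader_norkyst = reader_netCDF_CF_generic.Reader(ncfile)
```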

mevo-creator commented 4 years ago

Hi, Thank you for your reply. I've some queries, it would be extremely helpful if you can guide me through them as per your convenience.

  1. I understand the need for a large amount of forcing data. Our plan is to simulate the discharge of polymers from near Stavanger/Bergen and track them for 1 to 5 years, or possibly more, in the North Sea and the Norwegian Sea. It seems like they will end up somewhere in the Barents Sea, but I'm not exactly sure. I found some input/forcing data on met.no (https://thredds.met.no/thredds/catalog/fou-hi/norkyst800m-1h/catalog.html); however, I think I need to combine these files for a long time scale. Is this the best approach? The files are reasonably large, and the combined dataset might become too big for a 3-year simulation. Please suggest the best approach.

  2. The polymers are discharged at a certain depth and will therefore get distributed in the water column at very low concentrations (perhaps in the range of nanograms/litre). In the configuration of the PlastDrift module there are some options for vertical mixing. So, is it possible to track polymers in three dimensions, where the seeded particles get randomly distributed in the water column? Also, I wonder what the maximum number of particles is that we can seed. Please let me know about this.

Finally, I must say that I was inspired by your presentation with some of your colleagues on "Simulated pathways of microplastics into the Arctic Ocean" that I found online. I saw some results for the distribution of plastics after 1-3 years. So I believe it could be possible to do this :)

Just to add: in case it is not realistic to run the simulation for such a long time, it would also be helpful to run it for a shorter time scale and in two dimensions.

Thanks again! P.S. I just deleted some of the text, as I felt it became too long.

mevo-creator commented 4 years ago

Hi, can someone from the team please share their thoughts regarding the possibility of using Opendrift for long time scale simulation in line with my above queries...thanks so much!

knutfrode commented 4 years ago

Hi,

Here is a template script in the direction of what you need:

from datetime import datetime, timedelta
from opendrift.models.plastdrift import PlastDrift

o = PlastDrift(loglevel=0)
o.list_configspec()  # to see available configuration options

# Analytical mixing is faster than 'randomwalk'
o.set_config('vertical_mixing:mixingmodel', 'analytical')

# If most particles are close to surface, we may save time by only reading
# upper layers (e.g. 3m) from ocean model, and extrapolating below
o.set_config('drift:truncate_ocean_model_below_m', 3)

# SVIM current 1960- 30 Sept 2019
# GFS winds 6 May 2011 - present
o.add_readers_from_list([
    #'https://thredds.met.no/thredds/dodsC/sea/norkyst800m/1h/aggregate_be'
    'https://pae-paha.pacioos.hawaii.edu/thredds/dodsC/ncep_global/NCEP_Global_Atmospheric_Model_best.ncd',
    'https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg'
    ])

# Seeding some particles
seed_time = [datetime(2012, 1, 1, 0), datetime(2012, 1, 2, 0)]  # Seeding over two days
end_time = datetime(2012, 1, 3, 0)  # Simulating for 3 days

o.seed_elements(lon=2, lat=60, number=3000, z=-20, time=seed_time,
                terminal_velocity=.01  # 1 cm/s as default
                )

o.run(end_time=end_time, outfile='plast.nc',
      time_step=timedelta(hours=.5), time_step_output=timedelta(hours=1))

o.animation_profile()
o.animation()

NorKyst covers the time from Sep 2016 until now, but not the Barents Sea. I believe the best sources for your purpose (as in the example above) are the SVIM ocean model (1960-2019) for currents, and the NCEP GFS atmospheric model (6 May 2011 until present) for winds. Thus there are some years of overlap, and you can e.g. start your simulations on 1 Jan 2012. This is the same forcing data as was used for the Arctic simulation you mention.

The number of particles and the duration are only limited by the memory available on your computer. OpenDrift flushes to disk every 100 time steps by default, and thus it is feasible to use on the order of a million particles over many years. The only challenge is that the output files get very large, and it may not be possible to re-import the output file into memory in order to use the built-in plotting/analysis methods. Unless you want to make your own analysis methods to handle huge datasets, a solution is to store the output at larger intervals. E.g. for your purpose it should be sufficient to store output every 10 days, or even more seldom. Note that the calculation time step should not be much larger than 1 hour, but this is independent of the output interval.
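As a rough illustration of why the output interval matters, here is a back-of-envelope size estimate (the variable count and float32 storage are assumptions for illustration, not OpenDrift's exact output layout):

```python
# Rough output-size estimate: particles x output steps x variables x bytes.
n_particles = 1_000_000
n_days = 3 * 365
output_interval_days = 10
n_variables = 5          # e.g. lon, lat, z, status, terminal_velocity (assumed)
bytes_per_value = 4      # float32 (assumed)

n_outputs = n_days // output_interval_days
size_gb = n_particles * n_outputs * n_variables * bytes_per_value / 1e9
print(f'{n_outputs} output steps, ~{size_gb:.1f} GB')
```

The same run with hourly output would be over 200 times larger, which is why infrequent output makes long simulations with many particles practical.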

You could play with the example script, and then increase the length and number of particles after confirming that output is as you like.

The terminal_velocity property of the particles would affect the drift, where higher velocities (i.e. larger particles) will drift closer to the surface (with Stokes and windage). The depth of seeding is probably not very important for such long term simulations, as particles will be mixed vertically very quickly after release.

mevo-creator commented 4 years ago

Hi @knutfrode , Thank you so much for the detailed explanation of my query and for providing the example script to practise the simulations. I've tried running the script and it works well. I would like to learn more about the built-in functions and configurations of the PlastDrift model. For instance, you mentioned that 'analytical' is faster than random walk; similarly there is another configuration, vertical_mixing:diffusivitymodel [windspeed_Large1994] option('environment', 'stepfunction', 'windspeed_Sundby1983', 'windspeed_Large1994', 'gls_tke', default='windspeed_Large1994'). So, what do the different alternatives of the diffusivity model mean, and how may they affect the simulation output (and likewise for the other configurations)? Is there some documentation for the PlastDrift model where these functionalities are explained? Please let me know about this.

knutfrode commented 4 years ago

The vertical diffusivity determines the degree to which elements are mixed upwards and downwards due to turbulence (random walk scheme as described in Nordam2019), partly counteracting buoyancy which is normally upwards/positive for plastics (but you may also specify negative buoyancy as is normal for e.g. sediment drift).

The value of vertical diffusivity is ideally acquired from an ocean model, in which case you should specify o.set_config('vertical_mixing:diffusivitymodel', 'environment')

However, this parameter is not always available in ocean model output files, and in this case it may be parameterized from wind (higher wind means more waves and thus vertical turbulence/mixing). E.g. it is included in the NorKyst model (with limited spatial and temporal coverage), but not in the SVIM model as in the example above. The parameterisation from wind based on Large1994 is the default for PlastDrift, and should be a good alternative to using data from the ocean model. If using the NorKyst model instead, you may experiment with using environment instead.

sundby1983 is a simpler/older version of Large1994, where there is no depth dependence of the diffusivity. stepfunction is mainly for concept studies (idealised pycnocline), and is not recommended for your case. gls_tke is a special case for output from the ROMS ocean model, where parameters are stored from which diffusivity may be derived.

This example script illustrates the vertical variation and wind speed dependence of diffusivity for the various parameterisations.

The buoyancy is given by the terminal_velocity that you may specify at the time of seeding. This is a very important parameter for your simulations, and you should aim to find the correct values for your polymers. Instead of providing a scalar terminal_velocity for all elements as in the example above, you may also provide a different value for each element.
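A minimal sketch of per-element terminal velocities, with a uniform spread around the 1 cm/s value used in the earlier template (the distribution and its bounds are illustrative assumptions, not recommended values for HPAM):

```python
import numpy as np

np.random.seed(0)  # reproducible illustration
number = 3000

# One rise velocity per element, in m/s; bounds chosen arbitrarily
# around the 1 cm/s scalar used in the earlier example.
terminal_velocity = np.random.uniform(0.005, 0.02, number)

# o.seed_elements(lon=2, lat=60, number=number, z=-20, time=seed_time,
#                 terminal_velocity=terminal_velocity)
print(terminal_velocity.shape)
```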

mevo-creator commented 4 years ago

Thank you for explaining the different functions for vertical diffusivity and for sharing the paper about this concept. I'll try out 'environment' using the NorKyst model. I need to bring a couple of points to your notice about what we are trying to do, and whether it can be achieved with the PlastDrift model.

  1. I understand the importance of 'terminal velocity' as mentioned. However, I need to inform you that the polymers we are dealing with are Hydrolysed Polyacrylamide (HPAM), which is used for enhancing oil recovery. These HPAM-based polymers are completely 'water-soluble', not 'suspended'. They increase the viscosity of water when dissolved in it. When this solution (polymer with water at higher viscosity) is injected into the reservoir, it increases oil recovery by better sweeping the reservoir pores. There is a possibility that these polymers might also get discharged into the marine environment. If so, they will not be 'suspended' but completely 'dissolved' in the water at very low concentrations (perhaps in the range of nanograms/litre, a bit similar to micro/nano plastics). Do you think PlastDrift would be suitable for such applications? Please let me know your opinion. I believe that if PlastDrift can be used to track microplastics and nanoplastics, then it can also be used for these polymers (both are almost similar), but I would like to know your opinion.
  2. If we can use PlastDrift (I really hope so), the plan is to see the extent of exposure (geographical area/volume of water) to these polymers in the North Sea and further north. Also, to see if there is an accumulation of polymers in any particular area of the ocean (something like the "Garbage Patches" found for plastics; there is one in the Arctic). So we would like to see whether polymers discharged from different areas in the North Sea all come together in some particular area and increase the concentration there. I guess this can be achieved by running different simulations over time and seeing if all polymers end up in the same area.
  3. From an environmental perspective, the positive thing about these polymers is that they are not 'persistent' like normal plastics. That means they will degrade, but on a time scale of some years (3-5 years, or even earlier if they are exposed to more sunlight). So it would be interesting to see how and where these polymers end up before they eventually degrade.

Please share your thoughts about this as per your convenience....thanks so much.

mevo-creator commented 4 years ago

Hi @knutfrode Is the above query something you can guide me with? I understand that the results might carry some uncertainty for polymers that get dissolved in the water, but we can account for this uncertainty by stating some assumptions. On a broad level, though, can I use the PlastDrift model to simulate something like I've described above? Thanks for your help.

knutfrode commented 4 years ago

If the polymers are dissolved in water, you should instead use the OceanDrift model, which is the parent class of the PlastDrift model.

You should then perhaps also lift the restriction of only reading the upper few metres of the ocean, as the polymers may be found at any depth. However, this also means that the simulation will take quite a bit longer, as more ocean model data has to be read/ingested. On the other hand, you may in this case skip the windage and Stokes drift, which only apply to the upper metre or so. You may also skip vertical mixing, which is less relevant here. So I might suggest something like this (not tested):

from datetime import datetime, timedelta
from opendrift.models.oceandrift import OceanDrift

o = OceanDrift(loglevel=0)
o.list_configspec()  # to see available configuration options

# Skip vertical turbulent mixing and Stokes drift
o.set_config('drift:vertical_mixing', False)
o.set_config('drift:stokes_drift', False)

# SVIM current 1960- 30 Sept 2019
o.add_readers_from_list([
    #'https://thredds.met.no/thredds/dodsC/sea/norkyst800m/1h/aggregate_be'
    'https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg'
    ])

# Seeding some particles
seed_time = [datetime(2012, 1, 1, 0), datetime(2012, 1, 2, 0)]  # Seeding over two days
end_time = datetime(2012, 1, 3, 0)  # Simulating for 3 days

o.seed_elements(lon=2, lat=60, number=3000, z=-20, time=seed_time)

o.run(end_time=end_time, outfile='polymers.nc',
      time_step=timedelta(hours=.5), time_step_output=timedelta(hours=1))

o.animation(color='z')
o.animation_profile()

With these conditions (i.e. the flow is particle-independent), I would believe the outcome is already given: the water/polymers from anywhere in the Norwegian seas will generally drift northwards, where much of it will enter the Barents/Arctic seas, reside there for some years, and eventually flow southwards again through the Fram Strait. But there will be temporal variations in the drift, so you should probably release particles during your whole simulation period.

mevo-creator commented 4 years ago

Hi @knutfrode , Appreciate your quick and detailed response. I'll run the script and come back to you for further doubts..!!

mevo-creator commented 4 years ago

Hi @knutfrode , Thanks again for your help. I ran the script you provided and it worked well. However, all the particles get seeded at 20 m depth (I think because of z=-20 in o.seed_elements(lon=2, lat=60, number=3000, z=-20, time=seed_time)). The polymers will be randomly distributed over depth, and therefore, based on this example, I changed the code as below, and this also worked well: 1000 particles were seeded and drifted for 30 days.

from datetime import datetime, timedelta
import numpy as np
from opendrift.models.oceandrift import OceanDrift
from opendrift.readers import reader_netCDF_CF_generic

o = OceanDrift(loglevel=0)
o.list_configspec()  # to see available configuration options
o.set_config('drift:vertical_mixing', False)
o.set_config('drift:stokes_drift', False)
o.add_readers_from_list([
    #'https://thredds.met.no/thredds/dodsC/sea/norkyst800m/1h/aggregate_be'
    'https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg'
    ])

z = -np.random.rand(1000)*50
seed_time = [datetime(2012, 1, 1, 0), datetime(2012, 1, 2, 0)]
end_time = datetime(2012, 1, 30, 0)
o.seed_elements(lon=4.8, lat=60.0, z=z, radius=0, number=1000, time=seed_time)
o.run(end_time=end_time, outfile='polymers.nc',
      time_step=timedelta(hours=.5), time_step_output=timedelta(hours=1))

However, when I tried to seed 10,000 particles with the same code, I got the error below. There may be many polymer particles, as the concentration will be in nanograms/litre; therefore I would like to seed more than 1000 particles, randomly distributed throughout the depth. Can you please guide me on whether the above code is the right approach for randomly seeding particles across depth, and for longer time periods as well? Also, if it is the right approach, how can we resolve the error below so as to seed more particles?

02:53:09 DEBUG: Parsing variable: wro
WARNING:root:https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg is not a netCDF file recognised by OpenDrift: 'Reader' object has no attribute 'lon'
02:53:09 INFO: Opening dataset: https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg
02:53:09 INFO: Opening file with Dataset
02:53:09 WARNING: Vtransform not found, using 1
02:53:12 INFO: Read GLS parameters from file.
02:53:12 INFO: Making Splines for lon,lat to x,y conversion...
02:53:30 DEBUG: Setting buffer size 111 for reader roms native, assuming a maximum average speed of 5 m/s.
02:53:30 DEBUG: Adding new variable mappings
02:53:30 DEBUG: Adding method!
02:53:30 DEBUG: Reader initialised: roms native
02:53:30 DEBUG: Setting buffer size 24 for reader roms native, assuming a maximum average speed of 1 m/s.
02:53:30 DEBUG: Running lonlat2xy in parallel, using 2 of 4 CPUs
02:53:30 WARNING: Parallelprocessing failed:
02:53:30 WARNING: Can't pickle local object 'BaseReader.lonlat2xy.<locals>.get_x'
02:53:30 WARNING: The simulation stopped before requested end time was reached.
02:53:30 INFO: ========================
02:53:30 INFO: End of simulation:
02:53:30 INFO: cannot unpack non-iterable NoneType object
02:53:30 INFO: Traceback (most recent call last):
  File "C:\Users\mevo\.conda\envs\opendrift\lib\site-packages\opendrift\models\basemodel.py", line 2250, in run
    self.get_environment(self.required_variables,
  File "C:\Users\mevo\.conda\envs\opendrift\lib\site-packages\opendrift\models\basemodel.py", line 892, in get_environment
    len(reader.covers_positions(
  File "C:\Users\mevo\.conda\envs\opendrift\lib\site-packages\opendrift\readers\basereader.py", line 904, in covers_positions
    x, y = self.lonlat2xy(lon, lat)
TypeError: cannot unpack non-iterable NoneType object

02:53:30 INFO: 'The simulation stopped before requested end time was reached.'

Simulation aborted. 'The simulation stopped before requested end time was reached.'

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\mevo\.conda\envs\opendrift\lib\multiprocessing\spawn.py", line 107, in spawn_main
    new_handle = reduction.duplicate(pipe_handle,
  File "C:\Users\mevo\.conda\envs\opendrift\lib\multiprocessing\reduction.py", line 79, in duplicate
    return _winapi.DuplicateHandle(
OSError: [WinError 6] The handle is invalid

(opendrift) C:\Users\mevo>

knutfrode commented 4 years ago

There seems to be a Windows issue with the parallel processing. I will try to reproduce it.

knutfrode commented 4 years ago

I could not reproduce this problem, so it seems to be a Windows issue. Some parallel processing failed, but in this case OpenDrift should fall back to sequential processing. The fallback did, however, not work, but that is now fixed.

This page might explain why the parallel processing failed on your machine: https://stackoverflow.com/questions/52265120/python-multiprocessing-pool-map-attributeerror-cant-pickle-local-object

I implemented the suggestion there (https://github.com/OpenDrift/opendrift/commit/3be2199604df74ed5047e42dd6c0c10086624b7b), so if you now update your OpenDrift (git pull) it should hopefully work.
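For background, the linked Stack Overflow answer boils down to this: on Windows, multiprocessing spawns fresh interpreters and must pickle the worker function to send it to them, and functions defined inside other functions cannot be pickled, only module-level ones can. A minimal demonstration (independent of OpenDrift):

```python
import pickle

def module_level_fn():       # defined at module top level: picklable
    return 42

def make_local_fn():
    def local_fn():          # defined inside another function: not picklable
        return 42
    return local_fn

ok = pickle.dumps(module_level_fn)  # succeeds

try:
    pickle.dumps(make_local_fn())
    failed = False
except (AttributeError, pickle.PicklingError):
    failed = True            # "Can't pickle local object ..."
print('local function picklable:', not failed)
```

Moving the helper (here `get_x` in BaseReader.lonlat2xy) out to module level, as in the linked commit, makes it picklable and lets the worker processes receive it.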

Since there is a new dependency in the meantime, you also need to update your conda environment. Updating directly with conda often fails, so I recommend deleting and creating anew with:

conda activate
conda env remove -n opendrift
conda env create -f environment.yml 
conda activate opendrift

mevo-creator commented 4 years ago

Hi @knutfrode , Thanks so much for the solution. I did a git pull and updated the environment as you suggested, and I was able to run the script with 10,000 particles for 30 days. After the git pull, the parallel processing still fails, but it falls back to a single processor: Multiprocessing has previously failed, reverting to using single processor for lonlat -> xy.

Now I wonder what the best approach would be to simulate the same set of particles for a year or more. I remember your suggestion of storing output every 10 days (perhaps I can do it every 30 days as well). But would it be possible to simulate the same set of particles? For instance, now that I have run this script for 30 days, when I run the script again for the next 30 days I need to seed the particles again, and therefore the previous particle trajectories will be lost somehow (even if they are stored in the output file). Please guide me on how we can best approach longer time scales without disturbing the particle trajectories from the previous simulation.

I guess the comment you made previously (please see below) will also play some role in reaching a solution for longer time scales.

But there will be temporal variations in the drift, so you should probably release particles during your whole simulation period.

knutfrode commented 4 years ago

You need to do a single, long simulation in order to track the same particles over a longer time.

You could e.g. seed particles over 2 years, and simulate for 3 years (i.e. in the last year no more particles are released). I guess storing output every 10 days should be ok. If you are not using more than 10,000 particles, you should then still be able to use OpenDrift to open, plot and analyse the results. If the output file is too large for memory, you need to make your own analysis/plotting methods.
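A sketch of such a seeding window, reusing dates from the earlier template (the exact dates, particle number and output interval below are placeholders). When seed_time is a [start, end] pair, OpenDrift spreads the elements over that window:

```python
from datetime import datetime, timedelta

# Release continuously over two years, then let the last year be drift only.
seed_start = datetime(2012, 1, 1)
seed_end = datetime(2014, 1, 1)
end_time = datetime(2015, 1, 1)

# A [start, end] pair makes OpenDrift distribute the releases over the window.
seed_time = [seed_start, seed_end]

# o.seed_elements(lon=2, lat=60, number=10000, time=seed_time)
# o.run(end_time=end_time, outfile='polymers.nc',
#       time_step=timedelta(hours=1), time_step_output=timedelta(days=10))
print('seeding window:', (seed_end - seed_start).days, 'days')
```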

It is not a big problem that the parallelisation does not work, as this is just a minor part of the simulation. The most time consuming part is to read the ocean model data, so a 3 year simulation probably takes several days.

I recommend storing the log to a file, with

o = OceanDrift(loglevel=0, logfile='simulation.log')

mevo-creator commented 4 years ago

Hi @knutfrode , Thank you for your answer. I was able to run and plot the results for 1000 particles and 30 days. It takes some time to make the .gif and .mp4 files for the animation (around 20-25 minutes). I think it might take a long time to plot the results for runs of more than 30 days.

  1. Can you please guide me on where exactly I should change the code to store output every 10 days, as you suggested? Perhaps that can reduce the time. (I currently run the model with o.run(end_time=end_time, outfile='polymers.nc', time_step=timedelta(hours=.5), time_step_output=timedelta(hours=1)).)
  2. Also, how is seeding the particles over two years different from seeding them, for instance, within a month? Is it because the model takes more time to compute the trajectories of a higher number of particles over a long time (more data generated), compared to a relatively smaller number of particles over the same period? Please let me know if my understanding is correct, or if there is some other reason for seeding particles over two years.
  3. Also, when I run the script for 30 days I get the warning below, but the script continues to open the dataset, finish the simulation and plot the results. Is this something we need to worry about? WARNING:root:https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg is not a netCDF file recognised by OpenDrift: 'Reader' object has no attribute 'lon' 19:50:16 INFO: Opening dataset: https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg

Please guide me about this as per your convenience. Thanks so much!

knutfrode commented 4 years ago

The time to conduct a simulation is approximately proportional to the total simulation duration. The size of the output file is proportional to the number of particles and the number of output time steps. The time to make an animation is proportional to the number of output time steps, and might thus be shorter for a long simulation with 10-day output: time_step_output=timedelta(days=10)
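To make the frame-count point concrete, pure arithmetic (no OpenDrift needed) for a hypothetical 3-year run:

```python
from datetime import timedelta

# Number of output frames for a 3-year run at two output intervals.
duration = timedelta(days=3 * 365)
hourly = duration / timedelta(hours=1)
every_10_days = duration / timedelta(days=10)
print(int(hourly), 'frames at hourly output')
print(int(every_10_days), 'frames at 10-day output')
```

Animating ~100 frames instead of ~26,000 is the difference between minutes and many hours.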

Seeding duration does not affect performance, only the number of particles.

The mentioned warning can be ignored.

mevo-creator commented 4 years ago

Hi, Appreciate your quick response...thank you.

mevo-creator commented 4 years ago

Hi @knutfrode , I ran simulations for 1 and 2 years with 1000 particles, and they ran ok. My primary goal in running these simulations over a longer time scale is to get some idea of the "accumulation" of the particles in a particular area over a long time.

From one of your previous comments, as quoted below:

With these conditions (ie. flow is particle-independent), I would believe the outcome is already given: the water/polymers from anywhere in the Norwegian seas will generally drift nortwards, where much of it will enter the Barents/Arctic Seas, where it may reside for some years, before eventually flowing southwards again in the Fram Strait. But there will be temporal variations in the drift, so you should probably release particles during your whole simulation period.

I can reconcile the drift of the particles with this. For instance, in the simulation over 2 years, the majority of the particles end up in the Barents Sea and some stay along the coastline. However, the particles are scattered all around, as we can see from the results below, so it is perhaps a bit difficult to say anything concrete about accumulation.

The objective behind looking at the "accumulation" of particles and/or "duration of accumulation" is to get some prediction of the risk related to an increase in the concentration of polymers in a particular area when there is a continuous discharge. From the results below, we can say something about accumulation, but perhaps nothing precise.

So I need to ask two things:

  1. Is there some other configuration/approach that can help us get a more precise idea about accumulation?
  2. Also, what is the default size of the particles, and is it possible to change the particle size somehow? I was thinking that if we make the particles smaller, it may be easier to get an idea about accumulation, but I'm not sure about this.

I understand that the drift of the particles will be quite random, but I'm wondering whether there is some other function/configuration in the model that can bring me closer to what I'm trying to achieve in terms of accumulation.

Can you please comment on this as per your convenience, thank you.

730nostranding2

knutfrode commented 4 years ago

Your output seems realistic. And yes, there is a huge scatter, but this is reality, not only the model. Thus, to give good answers, one ideally needs as many particles and as long a time series as possible. This again leads back to the practical problem that it is more work to analyse a dataset that is too large to be processed in memory, and to plot/animate it.

However, since you are interested in concentrations, and not the trajectories of individual particles, perhaps you can store the output infrequently, e.g. every month or even more seldom? Maybe you could do a 10-year simulation with ~500,000 particles and monthly output. But it would be useful to first make an assessment of how large the output file will be, and whether you can fit it into memory or need to write your own analysis methods.

Btw, to animate particle density and not individual particles, you can do something like:

>>> o.animation(density=True, density_pixelsize_m=50000, show_elements=False, fast=True)

And also btw, I added a feature yesterday to import only a subset of the particles from a stored netCDF file:

>>> o = opendrift.open('huge_file.nc', elements=np.arange(0, 10000))  # to read only the first 10.000 elements

For the OceanDrift model (as for PlastDrift), particle size is not an element property directly; rather, terminal_velocity is, which depends on the particle size and density. The terminal velocity in OceanDrift is 0, so here you are tracking water particles/masses with no buoyancy. The correct terminal_velocity depends on your particles/problem. You may provide terminal_velocity as an input to your seed command, either as a scalar or as an array with the same length as the number of particles, see e.g. this example.
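As a sketch of preparing a per-particle terminal velocity (not from this thread: the Stokes' law formula and all numeric values below are illustrative assumptions, and the commented seed call uses hypothetical coordinates):

```python
import random

# Sketch: per-particle terminal velocity via Stokes' law for small spheres.
# All numeric values are illustrative assumptions, not from this thread.
g = 9.81            # gravity [m/s^2]
mu = 1.4e-3         # dynamic viscosity of seawater [Pa s], assumed
rho_water = 1027.0  # seawater density [kg/m^3], assumed
diameter = 100e-6   # 100 micron particles, assumed

# Per-particle densities drawn around neutral buoyancy
rho_particles = [random.uniform(1010, 1050) for _ in range(1000)]

# Positive value = upward (buoyant), negative = sinking
terminal_velocity = [g * diameter**2 * (rho_water - rho_p) / (18 * mu)
                     for rho_p in rho_particles]

# The array could then be passed to the seed command, e.g. (hypothetical values):
# o.seed_elements(lon=4.5, lat=60.0, number=1000, time=start_time,
#                 terminal_velocity=terminal_velocity)
```

For dissolved polymers, as discussed above, the default terminal_velocity of 0 is already appropriate, so this is only relevant for particles with non-negligible buoyancy.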

mevo-creator commented 4 years ago

Hi @knutfrode , Thank you for the feedback. I'm mainly interested in the concentration. So, if I run a longer simulation with less frequently stored output, does that somehow affect the trajectories of the particles?

How can I assess the size of the output file in advance, to see whether it fits into memory?

Also, for the 2-year simulation, when going through the log file I found that some particles go outside the coverage area (please see the log below), and a fallback value of 0 is used for them. Does this give inaccurate results when elements go out of the coverage area? Do I need to worry about this for a longer simulation with around 500,000 particles?

16:54:13 DEBUG: Interpolating before (2013-12-30 12:00:00) in space (linearNDFast)
16:54:13 DEBUG: Initialising interpolator.
16:54:13 DEBUG: Interpolating after (2013-12-31 12:00:00) in space (linearNDFast)
16:54:13 DEBUG: Initialising interpolator.
16:54:13 DEBUG: Interpolating before (2013-12-30 12:00:00, weight 0.54) and after (2013-12-31 12:00:00, weight 0.46) in time
16:54:13 INFO: Interpolating profiles in time
16:54:13 DEBUG: Rotating vectors between 0.0 and 0.0 degrees.
16:54:13 DEBUG: Masking 25 elements outside coverage
16:54:13 DEBUG: Data missing for 25 elements.
16:54:13 DEBUG: ---------------------------------------
16:54:13 DEBUG: Finished processing all variable groups
16:54:13 DEBUG: Using fallback value 0 for x_sea_water_velocity for 25 elements
16:54:13 DEBUG: Using fallback value 0 for y_sea_water_velocity for 25 elements
16:54:13 DEBUG: Using fallback value 0 for upward_sea_water_velocity for 25 elements
16:54:13 DEBUG: Using fallback value 0.02 for ocean_vertical_diffusivity for 25 elements
16:54:13 DEBUG: Using fallback value 0.02 for ocean_vertical_diffusivity for 0 profiles
16:54:13 DEBUG: ...plus 1 individual points in other profiles
16:54:13 DEBUG: Using fallback value 10000 for sea_floor_depth_below_sea_level for 25 elements

Also, thank you for suggesting the idea of plotting particle density; I think it could be useful. I plotted the particle density for the 2-year simulation and the plot below was generated. I wonder how to read this plot, as I can only see dark brown pixels. Should we not see other colors where there are fewer particles?

density730

I understand the dependence of particle properties on terminal velocity. For my problem, I'm ok with a terminal velocity of zero. Since the polymers are water soluble, I'm happy tracking them as water particles/masses. With the OceanDrift model, if I can demonstrate that these water masses get stuck somewhere (for instance in the Barents/Arctic Sea) for a reasonable amount of time, that serves my purpose. And if the water masses don't get stuck anywhere for a long time, that is also a fruitful conclusion.

knutfrode commented 4 years ago

The output time step does not influence the particle trajectories, as these only depend on the calculation time step (time_step), which should not be too large (1 h should be ok).

Not sure how to accurately assess the size, but as a first estimate you could scale the size of your present file by the number of desired elements and time steps, divided by the present number of elements and time steps. Anyway, I guess the new feature to import only a subset of the particles could help if the file becomes too large for memory.
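That scaling rule can be written out explicitly. A minimal sketch, where all the numbers are placeholder assumptions rather than measurements from this thread:

```python
# Linear scaling estimate of output file size: size scales with
# (number of particles) x (number of stored time steps).
current_size_mb = 7.0        # size of an existing output file (assumed)
current_particles = 1_000
current_steps = 73           # e.g. 2 years stored every 10 days

planned_particles = 500_000
planned_steps = 120          # e.g. 10 years stored monthly

scale = (planned_particles * planned_steps) / (current_particles * current_steps)
estimated_mb = current_size_mb * scale
print(f"Estimated output size: ~{estimated_mb / 1024:.1f} GB")  # → ~5.6 GB
```

With these assumed numbers the planned run would produce a file of several GB, i.e. near the practical limit for in-memory analysis mentioned later in this thread.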

I don't think you need to worry about the particles leaving the SVIM domain (east of Iceland). In principle some of them could return into the domain, but the effect is probably negligible.

Yes, it is a bit suspicious that the color scale of the density has its maximum at 1. This should automatically be the maximum number of elements within any cell. Do you provide vmax=1 to your command? Could you otherwise show your animation command? I made a related update yesterday, so you could also try updating your local OpenDrift and see if it makes a difference.

mevo-creator commented 4 years ago

Thank you for the answers.

I'm using this command to plot the density of particles: >>> o.animation(density=True, density_pixelsize_m=50000, show_elements=False, fast=True, filename='density730.gif'). Is there some issue with the command? I'll also update the model and see.

knutfrode commented 4 years ago

No, this should be ok, although I would recommend 'mp4', which gives smaller files, with the possibility to pause. mp4s can be converted to GIFs if needed for presentations, with the included command-line tool mp4_to_gif.bash

Hopefully updating OpenDrift will give the correct vmax. Otherwise you may also provide it manually, but it is then hard to find an optimal value.

mevo-creator commented 4 years ago

Ok, thank you for recommending mp4 files. I was able to create an mp4 file, however there was an error, shown below. Can we just ignore this error?

>>> o.animation(density=True, density_pixelsize_m=50000, show_elements=False, fast=True, filename='density731.mp4')
WARNING:matplotlib.animation:MovieWriter stderr:
[libx264 @ 000002afe6b6ef80] height not divisible by 2 (1100x683)
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height

knutfrode commented 4 years ago

This is just a warning from matplotlib that it was not able to set some parameters correctly. As long as you get an animation, it can be ignored.

mevo-creator commented 4 years ago

Ok, I do get an animation. I'll update the model with git pull and see how it goes with the density plots, thank you so much!

mevo-creator commented 4 years ago

Hi, I updated the model and can see the confirmation below. After this, I tried the density plot again and it still gives me the same plot as before. I'm not sure whether I'm doing something wrong in updating the model. Is there some way to check whether my model has been updated? Or perhaps we need to change 'vmax' as you mentioned before. Can you please look into this? Thanks!

(opendrift) C:\Users\mevo\opendrift>git pull
Already up to date.

P.S. As suggested, I followed the steps below while updating the model, along with alternative 2 in the installation procedure.

conda activate
conda env remove -n opendrift
conda env create -f environment.yml
conda activate opendrift

knutfrode commented 4 years ago

I believe this commit should fix the problem.

So please update again with git pull. It is not necessary to update the conda environment every time; this is only needed when there are new dependencies (quite rarely).

mevo-creator commented 4 years ago

Hi, Thank you for doing the needful. After updating, I'm getting the plot below. I'm using this command to create it: o.animation(density=True, density_pixelsize_m=50000, show_elements=False, fast=True, filename='density730.gif')

Now it is mostly at the other end of the color scale (blue). What exactly does the color scale tell us? Can you please let me know.

Also, just to inform you: a 2-year simulation of 1000 particles with output stored every 10 days creates an output file (.nc) of 7 MB. Is there some size limit above which I cannot use the built-in plotting tools of the model?

density730

knutfrode commented 4 years ago

The color is simply the number of elements within each box (here 50×50 km). If you want to see more detail, you may provide vmax=30, at the cost of saturating the boxes containing more particles.

Files up to a few GB should be fine to import into memory, but plotting/animation will become gradually slower (in the end unbearably slow). For this scale, I would recommend using at least 100,000 elements.

mevo-creator commented 4 years ago

Ok, I understand. I changed vmax to 30 and was able to see a few more colors, however it is mostly blue. As you suggested, this is because of the overall low number of particles at such a large scale. So I'll try to simulate 100,000 particles and see how it goes. Thanks so much for the guidance.

mevo-creator commented 4 years ago

Hi, I am trying to inspect and plot the coverage of the reader (with >>> reader_norkyst.plot()) that is used for these simulations ('https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg'). I need to see the total coverage area and the variables. However, when I try to open the dataset I get the error below. Is there another way to read the dataset and see the coverage area? Thank you.

>>> reader_nansen = reader_netCDF_CF_generic.Reader('https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg')
13:39:41 INFO: Opening dataset: https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg
13:39:41 INFO: Opening file with Dataset
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Miniconda\envs\opendrift\lib\site-packages\opendrift\readers\reader_netCDF_CF_generic.py", line 290, in __init__
    if self.lon.ndim == 1:
AttributeError: 'Reader' object has no attribute 'lon'

knutfrode commented 4 years ago

The reason is that this dataset is not netCDF-CF-compliant, but is raw output from the ROMS ocean model. Thus another reader has to be used:

>>> from opendrift.readers.reader_ROMS_native import Reader
>>> r = Reader('https://thredds.met.no/thredds/dodsC/nansen-legacy-ocean/svim_daily_agg')
>>> print(r)

Reader: roms native
Projection: 
  None
Coverage: [pixels]
  xmin: 0.000000   xmax: 1201.000000   step: 1   numx: 1202
  ymin: 0.000000   ymax: 579.000000   step: 1   numy: 580
  Corners (lon, lat):
    (-28.67,  52.85)  (132.13,  81.84)
    (  0.58,  46.59)  ( 75.76,  64.63)
Vertical levels [sigma]: 
  [-0.984375 -0.953125 -0.921875 -0.890625 -0.859375 -0.828125 -0.796875
 -0.765625 -0.734375 -0.703125 -0.671875 -0.640625 -0.609375 -0.578125
 -0.546875 -0.515625 -0.484375 -0.453125 -0.421875 -0.390625 -0.359375
 -0.328125 -0.296875 -0.265625 -0.234375 -0.203125 -0.171875 -0.140625
 -0.109375 -0.078125 -0.046875 -0.015625]
Available time range:
  start: 1960-01-01 12:00:00   end: 2018-09-30 12:00:00   step: 1 day, 0:00:00
    21458 times (0 missing)
Variables:
  sea_floor_depth_below_sea_level
  land_binary_mask
  land_binary_mask
  ocean_vertical_diffusivity
  sea_surface_height
  sea_ice_x_velocity
  sea_ice_y_velocity
  sea_ice_area_fraction
  sea_ice_thickness
  sea_water_salinity
  sea_water_temperature
  x_sea_water_velocity
  y_sea_water_velocity
  upward_sea_water_velocity
mevo-creator commented 4 years ago

Thank you, this dataset covers a considerably larger area than the one covered by Norkyst. I've started a simulation of 100,000 particles for 5 years, with output stored every 30 days. It looks like it will take 3-4 days to complete the simulation. I just needed to see how much area is covered by the SVIM dataset; it remains to be seen whether all particles will stay inside this coverage area during/after the 5 years.

Figure_1

Also, I was going through your paper on the OpenDrift model, and I wonder what the difference is between OceanDrift: Passive tracer and OpenDrift3D in Figure 6. Does OpenDrift3D have some extra features compared to OceanDrift: Passive tracer?

knutfrode commented 4 years ago

The class inheritance has been modified and simplified since that paper was written. Now all modules are in principle 3D models, which can be configured to be 2D if desired. OpenDrift(3D) cannot be used directly; it only provides some common functionality to be inherited by sub-modules/classes.

OceanDrift is for passive tracers, though with the possibility to specify terminal_velocity (i.e. making them "active", moving relative to the water). The goal is always to store re-usable functionality "higher up", so that only very specific things are needed in each sub-model/class. Thus e.g. the PlastDrift module adds very little to the OceanDrift model. For your case, where the particles are tiny/dissolved, OceanDrift is the best choice.

Btw: with 5 years and 100,000 particles, I believe it should still be ok to store output e.g. every 7 days, allowing the evolution to be resolved in more detail. This would give ~250 output time steps, which should normally fit into memory. So you might consider re-starting your simulation.
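For reference, the step counts behind these numbers are simple timedelta arithmetic (a sketch independent of OpenDrift; the 5-year duration ignores leap days):

```python
from datetime import timedelta

duration = timedelta(days=5 * 365)     # 5-year simulation (ignoring leap days)
time_step = timedelta(hours=0.5)       # calculation time step
time_step_output = timedelta(days=7)   # output interval

n_calc_steps = duration // time_step          # steps actually computed
n_output_steps = duration // time_step_output # steps stored to file
print(n_calc_steps, n_output_steps)           # → 87600 260
```

The output file size is governed by the ~260 stored steps, while the runtime is governed by the ~87,600 calculation steps.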

mevo-creator commented 4 years ago

Alright got it.

I'll restart the simulation. Three hours after starting the simulation, the number of steps has reached 2134 out of 87648, i.e. an average of around 700 steps per hour, which means it may take about 5 days to complete. The time step is currently 0.5 hours; should I increase it to 1 hour? Could that speed up the simulation without compromising accuracy?

o.run(end_time=end_time, outfile='5years.nc', time_step=timedelta(hours=.5), time_step_output=timedelta(days=30))
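The back-of-the-envelope runtime estimate above can be written out as a linear extrapolation of the observed progress (numbers taken from the paragraph above; this ignores the slowdown expected as more particles are released):

```python
# Wall-clock estimate by linear extrapolation of observed progress
steps_done = 2134
hours_elapsed = 3.0
total_steps = 87648

rate = steps_done / hours_elapsed            # ~711 steps per hour
est_days = total_steps / rate / 24
print(f"~{est_days:.1f} days to complete")   # → ~5.1 days
```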

Thanks so much for your recommendation.

knutfrode commented 4 years ago

For this scale, 1 h should be ok, though it might only improve performance by about 25%, as you still have to read the same amount of input data. So you might consider keeping 0.5 h, for slightly better accuracy. Your simulation may also run a bit slower after some time, when more particles have been released, so your estimate is probably a lower bound.

Unfortunately, if the simulation is aborted for some reason, it is not possible to continue where it stopped. So hopefully it will complete on the first try.

mevo-creator commented 4 years ago

Ok, I'll keep 0.5 hours and 7 days. Actually, I'm seeding all particles within 8 days (this is the usual time frame of a polymer discharge). So all the particles were released after an hour or so. I guess the seeding duration will not affect the trajectories for such a long simulation, right?

mevo-creator commented 4 years ago

Also, I'm using the configuration >>> o.set_config('general:coastline_action', 'previous') so that particles are not stranded; if I do not use it, many particles get stranded. I hope it is ok to use this configuration.

knutfrode commented 4 years ago

Ok, but in any case the particles will spread over a gradually larger area, and then more ocean model data must be read and interpolated. So performance will drop gradually regardless.

Seeding over 8 days is fine. However, repeating the simulation for another 5 years with another 8-day seeding interval (e.g. 1 year before or after) might give rather different results, as there are year-to-year variations in ocean currents and wind conditions. Also, the first few weeks after the release might be especially important for where the particles end up. Thus, by seeding continuously over the full period, you would cover a larger probability space. Various 8-day seeding intervals can anyway be imposed when analysing the result afterwards, by selecting various subsets of the particles.
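One way to impose an 8-day seeding window afterwards is sketched below. The per-particle seeding-time array and its values are hypothetical; only the elements= argument of opendrift.open is from this thread:

```python
from datetime import datetime, timedelta

# Hypothetical per-particle seeding times: one particle seeded every hour
seed_times = [datetime(2013, 1, 1) + timedelta(hours=i) for i in range(1000)]

# Select the indices of particles seeded within a chosen 8-day window
window_start = datetime(2013, 1, 10)
window_end = window_start + timedelta(days=8)
subset = [i for i, t in enumerate(seed_times)
          if window_start <= t < window_end]

print(len(subset))  # → 192 particles (8 days x 24 per day)

# These indices could then be loaded selectively, e.g.:
# o = opendrift.open('huge_file.nc', elements=subset)
```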

Yes, it is ok to disable stranding for this purpose.