hainegroup / oceanspy

A Python package to facilitate ocean model data analysis and visualization.
https://oceanspy.readthedocs.io
MIT License

run time for extracting data #22

Closed hooteoos-waltz closed 5 years ago

hooteoos-waltz commented 5 years ago

Hi,

I am trying to extract data for a large time range. It has been more than 10 minutes and the output still hasn't been generated. My subsampling setting is timeRange = ['2007-09-01T00', '2008-07-31T00'] (first of all, is the format right? I'm trying to get the period from September 1st, 2007 to July 30th, 2008). Do you have any idea how long it will take?

Bests,

malmans2 commented 5 years ago

Format is right. If you're using subsample.cutout, it should be immediate because you only need lazy operations. If you're using subsample.survey, then you are doing heavy interpolations on every field in the dataset. So it depends on the number of fields, the functions that you previously ran on these fields, and the size of the vertical section. You should use SciServer Jobs to run long scripts. E.g., submit a job to extract vertical sections, then load the sections and work interactively.
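To get a feel for how much data such a timeRange covers, here is a minimal sketch that parses the two strings from the question and counts the days and snapshots in between. The helper name describe_time_range and the 6-hourly output frequency are assumptions made only for illustration; the thread does not state the dataset's actual frequency, and this is not part of the OceanSpy API.

```python
from datetime import datetime

# Matches strings such as '2007-09-01T00' (year-month-day, 'T', hour).
TIME_FORMAT = "%Y-%m-%dT%H"


def describe_time_range(time_range, hours_between_snapshots):
    """Parse a [start, end] pair and estimate how many snapshots it spans.

    hours_between_snapshots is a hypothetical output frequency supplied by
    the caller; it is not read from any dataset.
    """
    start, end = (datetime.strptime(t, TIME_FORMAT) for t in time_range)
    if end < start:
        raise ValueError("timeRange end precedes start")
    span = end - start
    span_hours = span.days * 24 + span.seconds // 3600
    # Count both end points, hence the + 1.
    n_snapshots = span_hours // hours_between_snapshots + 1
    return span.days, n_snapshots


days, n_snapshots = describe_time_range(
    ["2007-09-01T00", "2008-07-31T00"],
    hours_between_snapshots=6,  # assumed value, for illustration only
)
print(days, "days,", n_snapshots, "snapshots")
```

Under that assumed frequency, every field passed to subsample.survey would be interpolated at each of those snapshots, which is why a long range is better suited to a batch job than to an interactive session.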

hooteoos-waltz commented 5 years ago

Hi,

I submitted a job through SciServer and chose Geo as the compute image. After a couple of tests, I noticed that it cannot import oceanspy.

The test job was as follows:

a=2+2
b=a+3
print(b)

This job went through and ended successfully. I then submitted a job with a single command:

import oceanspy

The job ended with an error. Exit code: "1". Error message: "". And here is the standard error file:

[NbConvertApp] Converting notebook /home/idies/workspace/Storage/asaberi2/persistent/jobs/20180811/20180811233645-11247/Job_test.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python3
[NbConvertApp] ERROR | Error while converting '/home/idies/workspace/Storage/asaberi2/persistent/jobs/20180811/20180811233645-11247/Job_test.ipynb'
Traceback (most recent call last):
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/nbconvertapp.py", line 393, in export_single_notebook
    output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/exporter.py", line 174, in from_filename
    return self.from_file(f, resources=resources, **kw)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/exporter.py", line 192, in from_file
    return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/notebook.py", line 31, in from_notebook_node
    nb_copy, resources = super(NotebookExporter, self).from_notebook_node(nb, resources, **kw)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/exporter.py", line 134, in from_notebook_node
    nb_copy, resources = self._preprocess(nb_copy, resources)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/exporter.py", line 311, in _preprocess
    nbc, resc = preprocessor(nbc, resc)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
    return self.preprocess(nb, resources)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/preprocessors/execute.py", line 262, in preprocess
    nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/preprocessors/base.py", line 69, in preprocess
    nb.cells[index], resources = self.preprocess_cell(cell, resources, index)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/preprocessors/execute.py", line 286, in preprocess_cell
    raise CellExecutionError.from_cell_and_msg(cell, out)
nbconvert.preprocessors.execute.CellExecutionError: An error occurred while executing the following cell:

#a=2+2
#b=a+3
#print(b)
import oceanspy

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
 in ()
      2 #b=a+3
      3 #print(b)
----> 4 import oceanspy

ModuleNotFoundError: No module named 'oceanspy'
ModuleNotFoundError: No module named 'oceanspy'

Best, Atousa

malmans2 commented 5 years ago

In the first cell you have to install OceanSpy and its dependencies (see the documentation at https://oceanspy.readthedocs.io).

import sys
!conda install --yes --prefix {sys.prefix} dask distributed bottleneck netCDF4
!conda install --yes --prefix {sys.prefix} -c conda-forge xarray cartopy esmpy
!conda install --yes --prefix {sys.prefix} -c pyviz hvplot geoviews
!{sys.executable} -m pip install xgcm xesmf oceanspy

Use the pip install git+ command to install the latest version, or a specific branch.
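Because a job runs the whole notebook unattended, it can help to confirm right after the installation cell that the packages are actually visible to the kernel. The following is a minimal sketch using only the standard library; the helper name missing_modules and the REQUIRED list are illustrative assumptions and should be adjusted to whatever the notebook really imports.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list, taken from the packages named in the install commands.
REQUIRED = ["xarray", "dask", "xgcm", "xesmf", "oceanspy"]

missing = missing_modules(REQUIRED)
if missing:
    print("Not installed in this kernel:", ", ".join(missing))
else:
    print("All required modules were found.")
```

Placing a check like this in the second cell makes a missing package show up as one readable line in the job output before any heavy computation starts.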

hooteoos-waltz commented 5 years ago

Hey! I added the installation commands at the top of the notebook I submitted as a job. I chose Python+R as the compute image; after 12 minutes the job failed. Check out the standard error:

[NbConvertApp] Converting notebook /home/idies/workspace/Storage/asaberi2/persistent/jobs/20180815/20180815111030-11274/Extract_VerticalSection_job.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python3
[NbConvertApp] ERROR | Error while converting '/home/idies/workspace/Storage/asaberi2/persistent/jobs/20180815/20180815111030-11274/Extract_VerticalSection_job.ipynb'
Traceback (most recent call last):
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/nbconvertapp.py", line 393, in export_single_notebook
    output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/exporter.py", line 174, in from_filename
    return self.from_file(f, resources=resources, **kw)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/exporter.py", line 192, in from_file
    return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/notebook.py", line 31, in from_notebook_node
    nb_copy, resources = super(NotebookExporter, self).from_notebook_node(nb, resources, **kw)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/exporter.py", line 134, in from_notebook_node
    nb_copy, resources = self._preprocess(nb_copy, resources)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/exporters/exporter.py", line 311, in _preprocess
    nbc, resc = preprocessor(nbc, resc)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
    return self.preprocess(nb, resources)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/preprocessors/execute.py", line 262, in preprocess
    nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/preprocessors/base.py", line 69, in preprocess
    nb.cells[index], resources = self.preprocess_cell(cell, resources, index)
  File "/home/idies/miniconda3/lib/python3.6/site-packages/nbconvert/preprocessors/execute.py", line 286, in preprocess_cell
    raise CellExecutionError.from_cell_and_msg(cell, out)
nbconvert.preprocessors.execute.CellExecutionError: An error occurred while executing the following cell:

#install the oceanspy
import sys
!conda install --yes --prefix {sys.prefix} dask distributed bottleneck netCDF4
!conda install --yes --prefix {sys.prefix} -c conda-forge xarray cartopy esmpy
!conda install --yes --prefix {sys.prefix} -c pyviz hvplot geoviews
!{sys.executable} -m pip install xgcm xesmf oceanspy

#This is to extract big chunk of data
import oceanspy as ospy
ds, info = ospy.open_dataset.exp_ASR()
ds_survey, info_survey = ospy.subsample.survey(ds,
                                               info,
                                               lat1 = 67.5293,
                                               lon1 = -23.7777,
                                               lat2 = 68.3158,
                                               lon2 = -25.4916,
                                               delta_km = 2,
                                               varList = ['U', 'V', 'Depth'],
                                               depthRange = [0, -1550],
                                               timeRange = ['2007-09-01T00', '2008-07-31T00'],
                                               deep_copy = True)
ds_rot, info_rot = ospy.compute.ort_Vel(ds_survey, info_survey, deep_copy=True)
ds_rot, info_rot = ospy.compute.tan_Vel(ds_rot, info_rot)
_ = ospy.visualize.interactive(ds_rot.ort_Vel, info_rot, hvplot_kwargs={'kind': 'contourf', 'cmap':'seismic', 'clim':(-1,1), 'levels':20})

#Atousa wants to save this file
path = './Data/Kogur1_vertVel_correspond2Hardenmooring_ASR'
ospy.utils.save_ds_info(ds_rot, info_rot, path)


---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
 in ()
      8 #This is to extract big chunk of data
      9 import oceanspy as ospy
---> 10 ds, info = ospy.open_dataset.exp_ASR()
     11 ds_survey, info_survey = ospy.subsample.survey(ds,
     12                                                info,

~/miniconda3/lib/python3.6/site-packages/oceanspy/open_dataset.py in exp_ASR(cropped, machine)
    108     else: raise RuntimeError("machine = %s not available" % machine.lower())
    109     gridset = _xr.open_dataset(gridpath,
--> 110                                drop_variables = ['XU','YU','XV','YV','RC','RF','RU','RL'])
    111     fldsset = _xr.open_mfdataset(fldspath,
    112                                  concat_dim = 'T',

~/miniconda3/lib/python3.6/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, group, decode_cf, mask_and_scale, decode_times, autoclose, concat_characters, decode_coords, engine, chunks, lock, cache, drop_variables, backend_kwargs)
    318                 group=group,
    319                 autoclose=autoclose,
--> 320                 **backend_kwargs)
    321         elif engine == 'scipy':
    322             store = backends.ScipyDataStore(filename_or_obj,

~/miniconda3/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in open(cls, filename, mode, format, group, writer, clobber, diskless, persist, autoclose, lock)
    330                 diskless=diskless, persist=persist,
    331                 format=format)
--> 332         ds = opener()
    333         return cls(ds, mode=mode, writer=writer, opener=opener,
    334                    autoclose=autoclose, lock=lock)

~/miniconda3/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in _open_netcdf4_group(filename, mode, group, **kwargs)
    229     import netCDF4 as nc4
    230
--> 231     ds = nc4.Dataset(filename, mode=mode, **kwargs)
    232
    233     with close_on_error(ds):

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.__init__()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

FileNotFoundError: [Errno 2] No such file or directory: b'/home/idies/workspace/OceanCirculation/exp_ASR/grid_glued.nc'
FileNotFoundError: [Errno 2] No such file or directory: b'/home/idies/workspace/OceanCirculation/exp_ASR/grid_glued.nc'

...... Do you have oceanspy on datascope as well? I'd like to set up a job there to extract the section I need.

Thanks, Atousa

malmans2 commented 5 years ago

The code looks fine. Your error says that OceanCirculation is not available: did you select OceanCirculation during the submission process?
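A job that was submitted without the data volume only fails once the dataset is opened, which in this case was after the installation had already run. As a rough sketch, a standard-library check at the top of the notebook can report a missing volume immediately. The path is the one from the error message above; the helper name missing_paths is made up for this example.

```python
import os


def missing_paths(paths):
    """Return the paths from the given list that do not exist on this machine."""
    return [p for p in paths if not os.path.exists(p)]


# Grid file named in the FileNotFoundError above.
GRID_FILE = "/home/idies/workspace/OceanCirculation/exp_ASR/grid_glued.nc"

missing = missing_paths([GRID_FILE])
if missing:
    print("Data volume not selected for this job? Missing:", missing)
else:
    print("Data volume is available.")
```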

hooteoos-waltz commented 5 years ago

Hey,

I selected OceanCirculation this time, and I submitted it as a small job because I assumed it shouldn't take more than an hour. But my job was terminated with a "Job timed out" error. I then tried submitting it to the large jobs domain; the job was submitted an hour ago but has not started yet.

malmans2 commented 5 years ago

Let me know if the job we submitted together got to the end, so I can close the issue.

hooteoos-waltz commented 5 years ago

Yes. The job completed and the output was saved in the right place. Thanks