destination-earth / DestinE_ESA_GFTS

Global Fish Tracking Service - DestinE DESP Use Case
https://destination-earth.github.io/DestinE_ESA_GFTS/
Apache License 2.0

Panel notebook for tag data #26

Closed: aderrien7 closed this issue 6 months ago

aderrien7 commented 6 months ago

The notebook can generate a Panel dashboard for tag data by accessing them from the S3 bucket.

annefou commented 6 months ago

I am not sure why but I am getting a permission error:

---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.11/site-packages/s3fs/core.py:113, in _error_wrapper(func, args, kwargs, retries)
    112 try:
--> 113     return await func(*args, **kwargs)
    114 except S3_RETRYABLE_ERRORS as e:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/aiobotocore/client.py:408, in AioBaseClient._make_api_call(self, operation_name, api_params)
    407     error_class = self.exceptions.from_code(error_code)
--> 408     raise error_class(parsed_response, operation_name)
    409 else:

ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

The above exception was the direct cause of the following exception:

PermissionError                           Traceback (most recent call last)
Cell In[4], line 11
      9 track_plot = pn.bind(plot_track,tag_id=tag_widget)
     10 emission_plot = pn.bind(plot_emission,tag_id=tag_widget)
---> 11 track_emission = pn.Row(time_plot,track_plot,emission_plot)
     13 #Combining plots with the widget
     14 plots = pn.Row(tag_widget,track_emission)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/layout/base.py:825, in ListPanel.__init__(self, *objects, **params)
    821     if 'objects' in params:
    822         raise ValueError("A %s's objects should be supplied either "
    823                          "as positional arguments or as a keyword, "
    824                          "not both." % type(self).__name__)
--> 825     params['objects'] = [panel(pane) for pane in objects]
    826 elif 'objects' in params:
    827     objects = params['objects']

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/layout/base.py:825, in <listcomp>(.0)
    821     if 'objects' in params:
    822         raise ValueError("A %s's objects should be supplied either "
    823                          "as positional arguments or as a keyword, "
    824                          "not both." % type(self).__name__)
--> 825     params['objects'] = [panel(pane) for pane in objects]
    826 elif 'objects' in params:
    827     objects = params['objects']

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/pane/base.py:87, in panel(obj, **kwargs)
     85 if kwargs.get('name', False) is None:
     86     kwargs.pop('name')
---> 87 pane = PaneBase.get_pane_type(obj, **kwargs)(obj, **kwargs)
     88 if len(pane.layout) == 1 and pane._unpack:
     89     return pane.layout[0]

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/param.py:815, in ParamRef.__init__(self, object, **params)
    813 self._validate_object()
    814 if not self.defer_load:
--> 815     self._replace_pane()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/param.py:883, in ParamRef._replace_pane(self, force, *args)
    881 else:
    882     try:
--> 883         new_object = self.eval(self.object)
    884         if new_object is Skip and new_object is Undefined:
    885             self._inner_layout.loading = False

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/param.py:1106, in ParamFunction.eval(self, ref)
   1104 @classmethod
   1105 def eval(self, ref):
-> 1106     return eval_function_with_deps(ref)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/param/parameterized.py:165, in eval_function_with_deps(function)
    163         args = (getattr(dep.owner, dep.name) for dep in arg_deps)
    164         kwargs = {n: getattr(dep.owner, dep.name) for n, dep in kw_deps.items()}
--> 165 return function(*args, **kwargs)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/param/depends.py:53, in depends.<locals>._depends(*args, **kw)
     51 @wraps(func)
     52 def _depends(*args, **kw):
---> 53     return func(*args, **kw)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/param/reactive.py:594, in bind.<locals>.wrapped(*wargs, **wkwargs)
    591 @depends(**dependencies, watch=watch)
    592 def wrapped(*wargs, **wkwargs):
    593     combined_args, combined_kwargs = combine_arguments(wargs, wkwargs)
--> 594     return eval_fn()(*combined_args, **combined_kwargs)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/io/cache.py:433, in cache.<locals>.wrapped_func(*args, **kwargs)
    431         func_cache[hash_value] = (ret, ts, count+1, time)
    432 else:
--> 433     ret = func(*args, **kwargs)
    434     with lock:
    435         func_cache[hash_value] = (ret, time, 0, time)

Cell In[3], line 7, in plot_time_series(plot_type, tag_id)
      2 @pn.cache
      3 
      4 # Functions to plot the different visualization for a given tag id
      5 def plot_time_series(plot_type="time series",tag_id="CB_A11071"):
      6     # load trajectories 
----> 7     trajectories = read_trajectories(track_modes,f"{scratch_root}/{tag_id}",storage_options, format="parquet")
      9     # Converting the trajectories to pandas DataFrames to access data easily
     10     mean_df = trajectories.trajectories[0].df

File ~/pangeo-fish/pangeo_fish/io.py:275, in read_trajectories(names, root, storage_options, format)
    272 if reader is None:
    273     raise ValueError(f"unknown format: {format}")
--> 275 return mpd.TrajectoryCollection([reader(root, name) for name in names])

File ~/pangeo-fish/pangeo_fish/io.py:275, in <listcomp>(.0)
    272 if reader is None:
    273     raise ValueError(f"unknown format: {format}")
--> 275 return mpd.TrajectoryCollection([reader(root, name) for name in names])

File ~/pangeo-fish/pangeo_fish/io.py:261, in read_trajectories.<locals>.read_parquet(root, name)
    258 def read_parquet(root, name):
    259     path = f"{root}/{name}.parquet"
--> 261     df = pd.read_parquet(path,
    262                          storage_options=storage_options)
    264     return mpd.Trajectory(df, name, x="longitude", y="latitude")

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pandas/io/parquet.py:667, in read_parquet(path, engine, columns, storage_options, use_nullable_dtypes, dtype_backend, filesystem, filters, **kwargs)
    664     use_nullable_dtypes = False
    665 check_dtype_backend(dtype_backend)
--> 667 return impl.read(
    668     path,
    669     columns=columns,
    670     filters=filters,
    671     storage_options=storage_options,
    672     use_nullable_dtypes=use_nullable_dtypes,
    673     dtype_backend=dtype_backend,
    674     filesystem=filesystem,
    675     **kwargs,
    676 )

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pandas/io/parquet.py:274, in PyArrowImpl.read(self, path, columns, filters, use_nullable_dtypes, dtype_backend, storage_options, filesystem, **kwargs)
    267 path_or_handle, handles, filesystem = _get_path_or_handle(
    268     path,
    269     filesystem,
    270     storage_options=storage_options,
    271     mode="rb",
    272 )
    273 try:
--> 274     pa_table = self.api.parquet.read_table(
    275         path_or_handle,
    276         columns=columns,
    277         filesystem=filesystem,
    278         filters=filters,
    279         **kwargs,
    280     )
    281     result = pa_table.to_pandas(**to_pandas_kwargs)
    283     if manager == "array":

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/parquet/core.py:1776, in read_table(source, columns, use_threads, schema, use_pandas_metadata, read_dictionary, memory_map, buffer_size, partitioning, filesystem, filters, use_legacy_dataset, ignore_prefixes, pre_buffer, coerce_int96_timestamp_unit, decryption_properties, thrift_string_size_limit, thrift_container_size_limit, page_checksum_verification)
   1770     warnings.warn(
   1771         "Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 "
   1772         "and will be removed in a future version.",
   1773         FutureWarning, stacklevel=2)
   1775 try:
-> 1776     dataset = ParquetDataset(
   1777         source,
   1778         schema=schema,
   1779         filesystem=filesystem,
   1780         partitioning=partitioning,
   1781         memory_map=memory_map,
   1782         read_dictionary=read_dictionary,
   1783         buffer_size=buffer_size,
   1784         filters=filters,
   1785         ignore_prefixes=ignore_prefixes,
   1786         pre_buffer=pre_buffer,
   1787         coerce_int96_timestamp_unit=coerce_int96_timestamp_unit,
   1788         thrift_string_size_limit=thrift_string_size_limit,
   1789         thrift_container_size_limit=thrift_container_size_limit,
   1790         page_checksum_verification=page_checksum_verification,
   1791     )
   1792 except ImportError:
   1793     # fall back on ParquetFile for simple cases when pyarrow.dataset
   1794     # module is not available
   1795     if filters is not None:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/parquet/core.py:1329, in ParquetDataset.__init__(self, path_or_paths, filesystem, schema, filters, read_dictionary, memory_map, buffer_size, partitioning, ignore_prefixes, pre_buffer, coerce_int96_timestamp_unit, decryption_properties, thrift_string_size_limit, thrift_container_size_limit, page_checksum_verification, use_legacy_dataset)
   1327     except ValueError:
   1328         filesystem = LocalFileSystem(use_mmap=memory_map)
-> 1329 finfo = filesystem.get_file_info(path_or_paths)
   1330 if finfo.is_file:
   1331     single_file = path_or_paths

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/_fs.pyx:581, in pyarrow._fs.FileSystem.get_file_info()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/error.pxi:154, in pyarrow.lib.pyarrow_internal_check_status()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/error.pxi:88, in pyarrow.lib.check_status()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/_fs.pyx:1501, in pyarrow._fs._cb_get_file_info()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/fs.py:335, in FSSpecHandler.get_file_info(self, paths)
    333 for path in paths:
    334     try:
--> 335         info = self.fs.info(path)
    336     except FileNotFoundError:
    337         infos.append(FileInfo(path, FileType.NotFound))

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:118, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    115 @functools.wraps(func)
    116 def wrapper(*args, **kwargs):
    117     self = obj or args[0]
--> 118     return sync(self.loop, func, *args, **kwargs)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:103, in sync(loop, func, timeout, *args, **kwargs)
    101     raise FSTimeoutError from return_result
    102 elif isinstance(return_result, BaseException):
--> 103     raise return_result
    104 else:
    105     return return_result

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:56, in _runner(event, coro, result, timeout)
     54     coro = asyncio.wait_for(coro, timeout=timeout)
     55 try:
---> 56     result[0] = await coro
     57 except Exception as ex:
     58     result[0] = ex

File /srv/conda/envs/notebook/lib/python3.11/site-packages/s3fs/core.py:1371, in S3FileSystem._info(self, path, bucket, key, refresh, version_id)
   1369 if key:
   1370     try:
-> 1371         out = await self._call_s3(
   1372             "head_object",
   1373             self.kwargs,
   1374             Bucket=bucket,
   1375             Key=key,
   1376             **version_id_kw(version_id),
   1377             **self.req_kw,
   1378         )
   1379         return {
   1380             "ETag": out.get("ETag", ""),
   1381             "LastModified": out.get("LastModified", ""),
   (...)
   1387             "ContentType": out.get("ContentType"),
   1388         }
   1389     except FileNotFoundError:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/s3fs/core.py:362, in S3FileSystem._call_s3(self, method, *akwarglist, **kwargs)
    360 logger.debug("CALL: %s - %s - %s", method.__name__, akwarglist, kw2)
    361 additional_kwargs = self._get_s3_method_kwargs(method, *akwarglist, **kwargs)
--> 362 return await _error_wrapper(
    363     method, kwargs=additional_kwargs, retries=self.retries
    364 )

File /srv/conda/envs/notebook/lib/python3.11/site-packages/s3fs/core.py:142, in _error_wrapper(func, args, kwargs, retries)
    140         err = e
    141 err = translate_boto_error(err)
--> 142 raise err

PermissionError: Forbidden
aderrien7 commented 6 months ago

It looks like an issue with the permissions indeed. Can you access the gfts-ifremer bucket? The tag data are stored there.

annefou commented 6 months ago

I can list the bucket, for instance:

s3.ls("gfts-ifremer/tags/tracks/AD_A11177")

So I am not sure which permission is required.
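For what it's worth, listing and reading are separate S3 permissions (roughly s3:ListBucket vs s3:GetObject), so a successful `ls` does not imply an object can be opened. A minimal probe, written against the generic fsspec interface (the helper name is ours, not part of the project):

```python
import fsspec


def probe_read(fs, path):
    """Return True if the object at `path` can actually be opened and read,
    not merely listed; with s3fs, a 403 surfaces as PermissionError."""
    try:
        with fs.open(path, "rb") as f:
            f.read(1)
        return True
    except PermissionError:
        return False


# Against the real bucket this would look like:
#   s3 = fsspec.filesystem("s3")
#   probe_read(s3, "gfts-ifremer/tags/cleaned/AD_A11177/dst.csv")
```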

aderrien7 commented 6 months ago

Could you try this to open just this one file and see if you can access it:

import pandas as pd
pd.read_csv(s3.open("gfts-ifremer/tags/cleaned/AD_A11177/dst.csv"))
annefou commented 6 months ago

> Could you try this to open just this one file and see if you can access it:
>
> import pandas as pd
> pd.read_csv(s3.open("gfts-ifremer/tags/cleaned/AD_A11177/dst.csv"))

Yes, I tried, and as I said, I can list files but I cannot read/access them (I get permission denied).

annefou commented 6 months ago

@minrk should we all be added as ifremer users to access the data shared by ifremer?

minrk commented 6 months ago

You do have permission on the bucket (try reading gfts-ifremer/test.txt), but I think these objects have been created with more restricted permissions than the default for the bucket.

I can try to change the bucket policy to fix it, but in the meantime, I think the creator of the file can change the acl on the uploaded object.
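As a sketch of that last suggestion (the helper name is ours; s3fs exposes canned-ACL changes as `chmod`, the boto3 equivalent being `put_object_acl`), the uploader could run something like:

```python
def share_object(fs, path, acl="bucket-owner-full-control"):
    """Set a canned ACL on an already-uploaded object so other users of the
    bucket can read it. `fs` is expected to be an s3fs.S3FileSystem; which
    ACL value is appropriate depends on the bucket's policy."""
    fs.chmod(path, acl=acl)
    return path


# e.g., with s3 = s3fs.S3FileSystem(anon=False), on a hypothetical path:
#   share_object(s3, "gfts-ifremer/tags/cleaned/AD_A11177/dst.csv")
```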

tinaok commented 6 months ago

> You do have permission on the bucket (try reading gfts-ifremer/test.txt), but I think these objects have been created with more restricted permissions than the default for the bucket.
>
> I can try to change the bucket policy to fix it, but in the meantime, I think the creator of the file can change the acl on the uploaded object.

We used a normal xarray to_zarr call and didn't do anything special. We do not know what option we should pass so that other people can read the data.
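One option (an assumption on our side, not something the thread confirms) is to pass `s3_additional_kwargs` through `storage_options` when writing, so that every object s3fs uploads during the to_zarr call carries a canned ACL that other bucket users can read:

```python
def writer_storage_options(acl="bucket-owner-full-control"):
    """storage_options for xarray's to_zarr (or pandas' to_parquet) that make
    s3fs attach a canned ACL to each object it uploads; which ACL value is
    appropriate depends on the bucket policy."""
    return {"anon": False, "s3_additional_kwargs": {"ACL": acl}}


# e.g. (bucket path hypothetical):
#   ds.to_zarr("s3://gfts-ifremer/tags/example.zarr",
#              storage_options=writer_storage_options())
```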

minrk commented 6 months ago

I believe I've fixed this with user policies in #30

aderrien7 commented 6 months ago

Hi @annefou, I've just returned from vacation. From what I understand, you were unable to access the files due to the permissions set when I placed the files in the bucket. Have you managed to overcome this issue since then?

annefou commented 6 months ago

> Hi @annefou, I've just returned from vacation. From what I understand, you were unable to access the files due to the permissions set when I placed the files in the bucket. Have you managed to overcome this issue since then?

Min fixed the permission issues.

I just fixed the order of the Python imports (from pre-commit).

Panel does not show up in the JupyterHub. Does it work for you?

aderrien7 commented 6 months ago

> > Hi @annefou, I've just returned from vacation. From what I understand, you were unable to access the files due to the permissions set when I placed the files in the bucket. Have you managed to overcome this issue since then?
>
> Min fixed the permission issues.
>
> I just fixed the order of the Python imports (from pre-commit).
>
> Panel does not show up in the JupyterHub. Does it work for you?

Yes, it was working using the JupyterHub here: https://gfts.minrk.net. I believe you are using it there too?

annefou commented 6 months ago

> Yes, it was working using the JupyterHub here: https://gfts.minrk.net. I believe you are using it there too?

OK. It did not work for me, but I think it is good to go.