NCAR / cesm-lens-aws

Examples of analysis of CESM LENS data publicly available on Amazon S3 (us-west-2 region) using xarray and dask
https://doi.org/10.26024/wt24-5j82
BSD 3-Clause "New" or "Revised" License

PermissionError: Access Denied when reading data from the ncar-cesm-lens s3 bucket #2

Closed andersy005 closed 5 years ago

andersy005 commented 5 years ago
In [16]: import s3fs

In [17]: fs = s3fs.S3FileSystem(anon=True)

In [18]: fs.ls('ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr')
Out[18]:
['ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/.zattrs',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/.zgroup',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/.zmetadata',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/BSW',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/DZSOI',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/HKSAT',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/RAIN',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/SUCSAT',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/WATSAT',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/ZSOI',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/area',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/date_written',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/landfrac',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/landmask',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/lat',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/levgrnd',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/levlak',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/lon',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/mcdate',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/mcsec',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/mdcur',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/member_id',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/mscur',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/nstep',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/pftmask',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/time',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/time_bounds',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/time_written',
 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/topo']

In [19]: store = s3fs.S3Map(root='ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr', s3=fs, check=False)

In [20]: import xarray as xr

In [21]: ds = xr.open_zarr(store, consolidated=True)
---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(client, bucket, key, version_id, start, end, max_attempts, req_kw)
   1154                                      Range='bytes=%i-%i' % (start, end - 1),
-> 1155                                      **kwargs)
   1156             return resp['Body'].read()

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    356             # The "self" in this scope is referring to the BaseClient.
--> 357             return self._make_api_call(operation_name, kwargs)
    358

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    660             error_class = self.exceptions.from_code(error_code)
--> 661             raise error_class(parsed_response, operation_name)
    662         else:

ClientError: An error occurred (AccessDenied) when calling the GetObject operation: Access Denied

During handling of the above exception, another exception occurred:

PermissionError                           Traceback (most recent call last)
~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/fsspec/mapping.py in __getitem__(self, key, default)
     73         try:
---> 74             result = self.fs.cat(key)
     75         except:

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/fsspec/spec.py in cat(self, path)
    543         """ Get the content of a file """
--> 544         return self.open(path, 'rb').read()
    545

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/fsspec/spec.py in read(self, length)
   1059         logger.debug("%s read: %i - %i" % (self, self.loc, self.loc + length))
-> 1060         out = self.cache._fetch(self.loc, self.loc + length)
   1061         self.loc += len(out)

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/fsspec/core.py in _fetch(self, start, end)
    483             # First read
--> 484             self.cache = self.fetcher(start, bend)
    485             self.start = start

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(self, start, end)
   1042     def _fetch_range(self, start, end):
-> 1043         return _fetch_range(self.fs.s3, self.bucket, self.key, self.version_id, start, end)
   1044

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/s3fs/core.py in _fetch_range(client, bucket, key, version_id, start, end, max_attempts, req_kw)
   1168                 return b''
-> 1169             raise translate_boto_error(e)
   1170         except Exception as e:

PermissionError: Access Denied

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-21-7ceb9e63dc77> in <module>
----> 1 ds = xr.open_zarr(store, consolidated=True)

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/xarray/backends/zarr.py in open_zarr(store, group, synchronizer, chunks, decode_cf, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, consolidated, overwrite_encoded_chunks, **kwargs)
    554     zarr_store = ZarrStore.open_group(store, mode=mode,
    555                                       synchronizer=synchronizer,
--> 556                                       group=group, consolidated=consolidated)
    557     ds = maybe_decode_store(zarr_store)
    558

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/xarray/backends/zarr.py in open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close)
    248         if consolidated:
    249             # TODO: an option to pass the metadata_key keyword
--> 250             zarr_group = zarr.open_consolidated(store, **open_kwargs)
    251         else:
    252             zarr_group = zarr.open_group(store, **open_kwargs)

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/zarr/convenience.py in open_consolidated(store, metadata_key, mode, **kwargs)
   1180
   1181     # setup metadata sotre
-> 1182     meta_store = ConsolidatedMetadataStore(store, metadata_key=metadata_key)
   1183
   1184     # pass through

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/zarr/storage.py in __init__(self, store, metadata_key)
   2455             d = store[metadata_key].decode()  # pragma: no cover
   2456         else:  # pragma: no cover
-> 2457             d = store[metadata_key]
   2458         meta = json_loads(d)
   2459

~/opt/miniconda3/envs/analysis/lib/python3.7/site-packages/fsspec/mapping.py in __getitem__(self, key, default)
     76             if default is not None:
     77                 return default
---> 78             raise KeyError(key)
     79         return result
     80

KeyError: 'ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr/.zmetadata'

In [22]: s3fs.__version__
Out[22]: '0.3.4'

In [23]: xr.__version__
Out[23]: '0.12.3'

In [24]: fs = s3fs.S3FileSystem(anon=False, profile_name='intake-esm-tester')

In [25]: store = s3fs.S3Map(root='ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr', s3=fs, check=False)

In [26]: ds = xr.open_zarr(store, consolidated=True)

In [27]: ds
Out[27]:
<xarray.Dataset>
Dimensions:       (hist_interval: 2, lat: 192, levgrnd: 15, levlak: 10, lon: 288, member_id: 40, time: 1140)
Coordinates:
  * lat           (lat) float32 -90.0 -89.057594 -88.11518 ... 89.057594 90.0
  * levgrnd       (levgrnd) float32 0.007100635 0.027925 ... 21.32647 35.17762
  * levlak        (levlak) float32 0.05 0.6 2.1 4.6 ... 18.6 25.6 34.325 44.775
  * lon           (lon) float32 0.0 1.25 2.5 3.75 ... 355.0 356.25 357.5 358.75
  * member_id     (member_id) int64 1 2 3 4 5 6 7 ... 34 35 101 102 103 104 105
  * time          (time) object 2006-02-01 00:00:00 ... 2101-01-01 00:00:00
Dimensions without coordinates: hist_interval
Data variables:
    BSW           (levgrnd, lat, lon) float32 dask.array<shape=(15, 192, 288), chunksize=(15, 192, 288)>
    DZSOI         (levgrnd, lat, lon) float32 dask.array<shape=(15, 192, 288), chunksize=(15, 192, 288)>
    HKSAT         (levgrnd, lat, lon) float32 dask.array<shape=(15, 192, 288), chunksize=(15, 192, 288)>
    RAIN          (member_id, time, lat, lon) float32 dask.array<shape=(40, 1140, 192, 288), chunksize=(40, 12, 192, 288)>
    SUCSAT        (levgrnd, lat, lon) float32 dask.array<shape=(15, 192, 288), chunksize=(15, 192, 288)>
    WATSAT        (levgrnd, lat, lon) float32 dask.array<shape=(15, 192, 288), chunksize=(15, 192, 288)>
    ZSOI          (levgrnd, lat, lon) float32 dask.array<shape=(15, 192, 288), chunksize=(15, 192, 288)>
    area          (lat, lon) float32 dask.array<shape=(192, 288), chunksize=(192, 288)>
    date_written  (time) |S8 dask.array<shape=(1140,), chunksize=(12,)>
    landfrac      (lat, lon) float32 dask.array<shape=(192, 288), chunksize=(192, 288)>
    landmask      (lat, lon) float64 dask.array<shape=(192, 288), chunksize=(192, 288)>
    mcdate        (time) int32 dask.array<shape=(1140,), chunksize=(12,)>
    mcsec         (time) int32 dask.array<shape=(1140,), chunksize=(12,)>
    mdcur         (time) int32 dask.array<shape=(1140,), chunksize=(12,)>
    mscur         (time) int32 dask.array<shape=(1140,), chunksize=(12,)>
    nstep         (time) int32 dask.array<shape=(1140,), chunksize=(12,)>
    pftmask       (lat, lon) float64 dask.array<shape=(192, 288), chunksize=(192, 288)>
    time_bounds   (time, hist_interval) object dask.array<shape=(1140, 2), chunksize=(1140, 2)>
    time_written  (time) |S8 dask.array<shape=(1140,), chunksize=(12,)>
    topo          (lat, lon) float32 dask.array<shape=(192, 288), chunksize=(192, 288)>
Attributes:
    Conventions:                          CF-1.0
    Initial_conditions_dataset:           b.e11.B20TRC5CNBDRD.f09_g16.105.clm...
    NCO:                                  4.3.4
    PFT_physiological_constants_dataset:  pft-physiology.c110425.nc
    Surface_dataset:                      surfdata_0.9x1.25_simyr1850_c110921.nc
    case_title:                           UNSET
    comment:                              NOTE: None of the variables are wei...
    history:                              2019-06-26 13:35:39.024200 xarray.o...
    hostname:                             tcs
    nco_openmp_thread_number:             1
    revision_id:                          $Id: histFileMod.F90 40539 2012-09-...
    source:                               Community Land Model CLM4.0
    title:                                CLM History file information
    username:                             mudryk
    version:                              cesm1_1_1_alpha01g

Cc @jhamman

jhamman commented 5 years ago

My leading hypothesis is that not all of the objects had their permissions set to fully open. There has also been some recent churn in s3fs, but this looks more like an object read permission error.
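
One way to check this hypothesis is to walk the store anonymously and record which keys can be listed but not read. In S3, listing a bucket and reading an object are separate permissions, which would explain why `fs.ls` succeeded above while the read of `.zmetadata` failed. The following is a minimal sketch, not something taken from this thread: `find_unreadable` is a hypothetical helper, and it assumes `fs` is an fsspec-style filesystem (such as `s3fs.S3FileSystem(anon=True)`) that exposes `find` and `open` and raises `PermissionError` on a denied read, as in the traceback.

```python
def find_unreadable(fs, root):
    """Return the keys under `root` that can be listed but not read.

    `fs` is assumed to be an fsspec-style filesystem, for example
    s3fs.S3FileSystem(anon=True). This helper is hypothetical.
    """
    denied = []
    for path in fs.find(root):  # recursive listing of every file under root
        try:
            with fs.open(path, "rb") as f:
                f.read(1)  # a single byte is enough to trigger the permission check
        except PermissionError:
            denied.append(path)
    return denied


# Possible usage (needs network access, so it is left commented out):
# import s3fs
# fs = s3fs.S3FileSystem(anon=True)
# print(find_unreadable(fs, "ncar-cesm-lens/lnd/monthly/cesmLE-RCP85-RAIN.zarr"))
```

This issues at least one request per object, so on a store with many chunks it is probably better to point it at a small prefix first.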

jeffdlb commented 5 years ago

OK, so it turns out that to grant public access I needed to not only declare the bucket public but also to make every object in the bucket publicly readable. I have launched that process and it will need to churn through everything. If you happen to be online now and want to test, try atm/monthly/FLNS because I did that one separately first.

-Jeff DLB
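
For readers who need to do the same thing, the per-object step could be scripted roughly as below. The thread does not say which tool was actually used, so this is only a sketch under assumptions: `make_prefix_public` is a hypothetical helper, `client` is expected to behave like a boto3 S3 client (`boto3.client("s3")`), and the bucket is assumed to have ACLs enabled. A bucket policy that grants `s3:GetObject` to everyone is an alternative that avoids touching each object.

```python
def make_prefix_public(client, bucket, prefix):
    """Set the canned ACL 'public-read' on every object under `prefix`.

    `client` is assumed to behave like a boto3 S3 client.
    Returns the number of objects updated. This helper is hypothetical.
    """
    updated = 0
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            client.put_object_acl(Bucket=bucket, Key=obj["Key"], ACL="public-read")
            updated += 1
    return updated


# Possible usage (needs AWS credentials with write access, so it is commented out):
# import boto3
# make_prefix_public(boto3.client("s3"), "ncar-cesm-lens", "atm/monthly/FLNS")
```

Working one prefix at a time, as with `atm/monthly/FLNS` here, keeps each run short and makes it easy to test a finished prefix before the rest is done.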

jeffdlb commented 5 years ago

In retrospect, the make-objects-public step should have been obvious.

Status update: 90 minutes later, only 14% through the process. I should have done this in smaller chunks because I'm not sure what will happen if my laptop goes to sleep...

-Jeff
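
The "smaller chunks" idea can be sketched as a loop over prefixes that records each finished prefix in a checkpoint file, so an interrupted run (a sleeping laptop, for instance) can resume where it stopped. Everything here is illustrative: `run_in_chunks`, the checkpoint file, and the `process` callback are hypothetical names, and `process` stands in for whatever per-prefix work is being done.

```python
import json
import os


def run_in_chunks(prefixes, process, checkpoint_path):
    """Call `process(prefix)` for each prefix not yet recorded as done.

    Completed prefixes are stored as a JSON list in `checkpoint_path`.
    Returns the list of completed prefixes. This helper is hypothetical.
    """
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for prefix in prefixes:
        if prefix in done:
            continue  # finished in an earlier run
        process(prefix)
        done.append(prefix)
        # Rewrite the checkpoint after every chunk so that an interruption
        # costs at most the chunk that was in progress.
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)
    return done
```

With prefixes such as `atm/daily` or `lnd/monthly`, rerunning the same call after an interruption skips whatever already completed.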

andersy005 commented 5 years ago

Thank you for taking care of this, @jeffdlb! I ran a few smoke tests, and this seems to be working.

jeffdlb commented 5 years ago

Anderson-

atm/daily and atm/hourly6-1990-2005 are still not fully public. Will be finished this morning.

-Jeff

jeffdlb commented 5 years ago

I believe all of the data objects are now public. Please let me know if you experience any difficulty.

-Jeff
