ratt-ru / dask-ms

Implementation of a dask/xarray dataset backed by a CASA MS
https://dask-ms.readthedocs.io

Configure custom storage options if url matches configured prefix #237

Closed sjperkins closed 2 years ago

sjperkins commented 2 years ago

@o-smirnov @JSKenyon @landmanbester

This PR modifies dask-ms to use donfig for configuration options.

donfig searches for yaml configuration files in the following locations.

In particular, if we create the following yaml:

storage_options:
  s3://test-bucket-ee575396:
    client_kwargs:
      endpoint_url: https://127.0.0.1:9000
      region_name: af-south-1
      verify: false
    key: XXXXXX
    secret: XXXXXX

then any URL prefixed with s3://test-bucket-ee575396, e.g. DaskMSStore("s3://test-bucket-ee575396/a/sub/directory"), will result in the DaskMSStore being assigned those storage_options.
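To make the matching rule concrete, here is a minimal sketch of prefix-based lookup in plain Python. It is not the code in this PR: the function name `storage_options_for` is hypothetical, and preferring the longest matching prefix when several match is my assumption.

```python
def storage_options_for(url, configured):
    """Return the options of the longest configured prefix that `url` starts with.

    Hypothetical helper for illustration only; not dask-ms API.
    Note that a plain startswith() test also matches sibling names such as
    "s3://test-bucket-ee575396-other".
    """
    matches = [prefix for prefix in configured if url.startswith(prefix)]
    if not matches:
        return {}
    # Assumption: the most specific (longest) prefix wins.
    return dict(configured[max(matches, key=len)])


# Mirrors the yaml above, as it would look once loaded into a dict.
configured = {
    "s3://test-bucket-ee575396": {
        "client_kwargs": {
            "endpoint_url": "https://127.0.0.1:9000",
            "region_name": "af-south-1",
            "verify": False,
        },
        "key": "XXXXXX",
        "secret": "XXXXXX",
    }
}

opts = storage_options_for("s3://test-bucket-ee575396/a/sub/directory", configured)
no_opts = storage_options_for("s3://some-other-bucket/a", configured)
```

Here `opts` carries the configured `client_kwargs`, `key` and `secret`, while `no_opts` is empty because no configured prefix matches.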

@o-smirnov I know you are concerned about credentials in the clear (https://github.com/ratt-ru/dask-ms/issues/234) but note that the AWS Command Line Interface itself reads credentials stored in ~/.aws/credentials, a file with user-only permissions. See https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html for example. In fact, the test case passes if I remove key and secret from the above yaml and place them in ~/.aws/credentials.

I'm therefore less concerned about storing credentials in a yaml file. Also note it's possible to create multiple yaml files: donfig will merge them into a unified configuration. This means that we could split sensitive and non-sensitive configuration, if necessary.
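As a rough illustration of splitting sensitive and non-sensitive settings, the sketch below merges two nested dicts the way I would expect two yaml files to combine. The `merge` function is a hypothetical stand-in written for this comment, not donfig's own merge code, and the split into `public` and `private` is just an example.

```python
def merge(*configs):
    """Recursively merge nested dicts; later configs win on conflicts.

    Hypothetical stand-in for illustration, not donfig's implementation.
    """
    merged = {}
    for config in configs:
        for key, value in config.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are mappings: merge them into a new dict.
                merged[key] = merge(merged[key], value)
            else:
                merged[key] = value
    return merged


# Non-sensitive settings, e.g. from a shared yaml file.
public = {
    "storage_options": {
        "s3://test-bucket-ee575396": {
            "client_kwargs": {
                "endpoint_url": "https://127.0.0.1:9000",
                "region_name": "af-south-1",
                "verify": False,
            }
        }
    }
}

# Sensitive settings, e.g. from a yaml file with user-only permissions.
private = {
    "storage_options": {
        "s3://test-bucket-ee575396": {"key": "XXXXXX", "secret": "XXXXXX"}
    }
}

config = merge(public, private)
```

The merged `config` holds `client_kwargs`, `key` and `secret` under the same URL prefix, and neither input dict is modified.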

sjperkins commented 2 years ago

donfig also supports setting configuration via environment variables. I'm trying to work out whether the above approach would work with them, as URLs contain characters that are invalid in bash variable names.
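The problem can be shown with a small check against the shell naming rule (letters, digits and underscores, not starting with a digit). The variable name built below is hypothetical: it assumes a dask-style convention of an upper-case prefix with double underscores for nesting, which I have not verified against donfig for this case.

```python
import re

# POSIX shell variable names: letters, digits, underscores; no leading digit.
SHELL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_shell_name(name):
    """Return True if `name` can be used as a shell variable name."""
    return SHELL_NAME.fullmatch(name) is not None


prefix = "s3://test-bucket-ee575396"

# Hypothetical variable name, assuming a dask-style naming convention.
candidate = "DASK_MS_STORAGE_OPTIONS__" + prefix

print(is_valid_shell_name("DASK_MS_STORAGE_OPTIONS"))  # True
print(is_valid_shell_name(candidate))                  # False: contains ':', '/' and '-'
```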

JSKenyon commented 2 years ago

This looks good to me @sjperkins. Those of us who have used from_url_and_kw will just need to switch over when this is merged.

o-smirnov commented 2 years ago

Cool, looks nice and flexible, and easy to integrate with stimela.

JSKenyon commented 2 years ago

I have moved QuartiCal's main branch over to this new functionality (PR pending). One thing I ran into was that the omission of the 's' in 'https' caused things to break (port-forwarding came down) - just a cautionary tale for anyone else moving over.