ai2cm / fv3config

Manipulate FV3GFS run directories
Apache License 2.0

fv3config does not work with user application credentials bind-mounted into a docker container #143

Closed nbren12 closed 2 years ago

nbren12 commented 2 years ago

I get the following error when running in Docker with ~/.config/gcloud bind-mounted into the container.

    pdb.run("write_run_directory()")
  File "/usr/lib/python3.6/pdb.py", line 1572, in run
    Pdb().run(statement, globals, locals)
  File "/usr/lib/python3.6/bdb.py", line 434, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/fv3config/cli.py", line 47, in write_run_directory
    fv3config.write_run_directory(config, args.rundir)
  File "/usr/local/lib/python3.6/dist-packages/fv3config/config/rundir.py", line 22, in write_run_directory
    write_assets_to_directory(config, target_directory)
  File "/usr/local/lib/python3.6/dist-packages/fv3config/_asset_list.py", line 255, in write_assets_to_directory
    asset_list = config_to_asset_list(config)
  File "/usr/local/lib/python3.6/dist-packages/fv3config/_asset_list.py", line 283, in config_to_asset_list
    asset_list += get_initial_conditions_asset_list(config)
  File "/usr/local/lib/python3.6/dist-packages/fv3config/_asset_list.py", line 64, in get_initial_conditions_asset_list
    source_directory = get_initial_conditions_directory(config)
  File "/usr/local/lib/python3.6/dist-packages/fv3config/_datastore.py", line 82, in get_initial_conditions_directory
    ensure_exists(config["initial_conditions"], "initial_conditions")
  File "/usr/local/lib/python3.6/dist-packages/fv3config/_datastore.py", line 88, in ensure_exists
    raise ConfigError(f"{location_name} location {location} does not exist")
fv3config._exceptions.ConfigError: initial_conditions location gs://vcm-fv3config/data/initial_conditions/gfs_c12_example/v1.0 does not exist

This error occurs because of the requester_pays flag here: https://github.com/ai2cm/fv3config/blob/544eaf1bc6f1c4617cd8ee6bd3298136ed180f4c/fv3config/filesystem.py#L25

See this debugger output:

(Pdb) p fsspec.filesystem('gs', requester_pays=True).exists(location)
False
(Pdb) p fsspec.filesystem('gs').exists(location)
True
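The same comparison can be reproduced outside the debugger. The following is a minimal sketch: `compare_exists` and `gcs_factory` are hypothetical helper names, not part of fv3config, and `gcs_factory` assumes fsspec and gcsfs are installed and credentials are available.

```python
from typing import Callable, Dict


def compare_exists(location: str, make_fs: Callable[..., object]) -> Dict[str, bool]:
    """Check whether `location` exists with and without requester_pays=True.

    `make_fs` is any callable that accepts filesystem keyword arguments and
    returns an object with an `exists(path)` method.
    """
    return {
        "requester_pays=True": make_fs(requester_pays=True).exists(location),
        "default": make_fs().exists(location),
    }


def gcs_factory(**kwargs):
    """Create a GCS filesystem (hypothetical helper; needs fsspec and gcsfs)."""
    import fsspec  # imported lazily so compare_exists itself has no dependencies

    return fsspec.filesystem("gs", **kwargs)
```

Calling `compare_exists(location, gcs_factory)` inside the container should show whether the two settings disagree for a given path.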

I suggest removing this flag. It is not applicable in all environments and can be controlled by setting the environment variable

export FSSPEC_GS_REQUESTER_PAYS=on

if needed.
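One way the suggested change could look, as a rough sketch rather than the actual fv3config code: `gcs_filesystem_kwargs` and `get_gcs_filesystem` are hypothetical names. The idea is to pass `requester_pays` only when the caller asks for it explicitly, so that fsspec's own configuration (such as the environment variable above) applies otherwise.

```python
from typing import Any, Dict, Optional


def gcs_filesystem_kwargs(requester_pays: Optional[bool] = None) -> Dict[str, Any]:
    """Build keyword arguments for fsspec.filesystem("gs", ...).

    With requester_pays=None nothing is passed, leaving the setting to
    fsspec's configuration (for example FSSPEC_GS_REQUESTER_PAYS).
    """
    kwargs: Dict[str, Any] = {}
    if requester_pays is not None:
        kwargs["requester_pays"] = requester_pays
    return kwargs


def get_gcs_filesystem(requester_pays: Optional[bool] = None):
    """Hypothetical replacement for a hard-coded requester_pays=True call."""
    import fsspec  # imported lazily; requires fsspec and gcsfs

    return fsspec.filesystem("gs", **gcs_filesystem_kwargs(requester_pays))
```

Keeping the keyword out of the call entirely, instead of passing `requester_pays=False`, is what lets an environment variable take effect.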

nbren12 commented 2 years ago

I think I'm running into this problem again.

nbren12 commented 2 years ago

I'm motivated to fix this since it cost me 45 minutes this morning. The current workaround is to set

FSSPEC_GS_PROJECT=vcm-ml
# or
GOOGLE_CLOUD_PROJECT=vcm-ml

in the image.
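When debugging inside the container, it may help to confirm which of these variables are actually set. A small sketch using only the standard library follows; `report_gcs_env` is a hypothetical helper, and the list of variables is limited to the ones mentioned in this thread.

```python
import os
from typing import Dict, Mapping, Optional

# Variables discussed in this issue; other settings may also matter.
RELEVANT_VARS = (
    "FSSPEC_GS_PROJECT",
    "GOOGLE_CLOUD_PROJECT",
    "FSSPEC_GS_REQUESTER_PAYS",
)


def report_gcs_env(environ: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Return the value of each relevant variable, or None if it is unset."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in RELEVANT_VARS}


if __name__ == "__main__":
    for name, value in report_gcs_env().items():
        print(f"{name}={value!r}")
```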

I think a couple options are to

Any preference @oliverwm1 @mcgibbon? I'd vote for removing the requester_pays flag for simplicity. I think this would seamlessly support all the auth patterns we use on the team. Any external users could then enable requester pays by using the FSSPEC_GS_ env vars.

oliverwm1 commented 2 years ago

I am +1 on removing the requester_pays=True default.

oliverwm1 commented 2 years ago

Maybe worth verifying that the FSSPEC_GS_REQUESTER_PAYS env var configuration works as expected? I don't think I've ever used it.

nbren12 commented 2 years ago

> Maybe worth verifying that the FSSPEC_GS_REQUESTER_PAYS env var configuration works as expected? I don't think I've ever used it.

Looks like it works:

$ FSSPEC_GS_REQUESTER_PAYS=True python3 -c 'import fsspec; print( fsspec.filesystem("gs").requester_pays)'
True
$ FSSPEC_GS_REQUESTER_PAYS=False python3 -c 'import fsspec; print( fsspec.filesystem("gs").requester_pays)'
False