ai2cm / fv3config

Manipulate FV3GFS run directories
Apache License 2.0

`write_run_directory` is broken in development version of fv3config #149

Closed spencerkclark closed 2 years ago

spencerkclark commented 2 years ago

When I try to create a run directory using this config with the development version of fv3config, I get this error message:

>>> import fv3config
>>> import yaml
>>> config = yaml.safe_load(open("default.yml", "r"))
>>> fv3config.write_run_directory(config, "rundir")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/spencerc/fv3config/fv3config/config/rundir.py", line 22, in write_run_directory
    write_assets_to_directory(config, target_directory)
  File "/home/spencerc/fv3config/fv3config/_asset_list.py", line 256, in write_assets_to_directory
    write_asset_list(asset_list, target_directory)
  File "/home/spencerc/fv3config/fv3config/_asset_list.py", line 262, in write_asset_list
    write_asset(asset, target_directory)
  File "/home/spencerc/fv3config/fv3config/_asset_list.py", line 227, in write_asset
    copy_file_asset(asset, target_path)
  File "/home/spencerc/fv3config/fv3config/_asset_list.py", line 244, in copy_file_asset
    filesystem.get_file(source_path, target_path)
  File "/home/spencerc/fv3config/fv3config/filesystem.py", line 119, in get_file
    _get_file_cached(source_filename, dest_filename)
  File "/home/spencerc/fv3config/fv3config/filesystem.py", line 133, in _get_file_cached
    os.makedirs(os.path.dirname(cache_location), exist_ok=True)
  File "/home/spencerc/miniconda3/envs/fv3net-testing/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/home/spencerc/.cache/fv3gfs/fv3config-cache/rel/gs/vcm-fv3config/data/initial_conditions/gfs_c12_example/v1.0'

In the past, writing a run directory with this config would work.

nbren12 commented 2 years ago

A couple of things. 1) fv3config.load should be used instead of yaml.safe_load, although I doubt this is causing your problem. 2) I'll try to reproduce the problem. In the meantime, did you try rm-ing that directory and trying again?

nbren12 commented 2 years ago

I was able to reproduce with fsspec==2022.5.0 and using write_run_directory (which uses fv3config.load under the hood), so that rules out point 1) above.

spencerkclark commented 2 years ago

Sorry, I'm stuck in my old ways... I agree, though I'm using the default diag_table, which is not a dictionary, so fv3config.load should be equivalent to yaml.safe_load.

I did try removing the directory and ended up with the same result. I was also surprised that the location of the cache directory changed -- I normally have it set to a different location using the FV3CONFIG_CACHE_DIR environment variable.

nbren12 commented 2 years ago

Ok. I'll look into it. I was just modifying this code. Are you saying that FV3CONFIG_CACHE_DIR is not being used?

@mcgibbon What do you think about using https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.implementations.cached.SimpleCacheFileSystem instead of rolling our own caching layer? Their solution may be more robust and integrate better with fsspec. Would be a one-liner to use it and then delete all our own caching.
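For context, here is a minimal sketch of what reading through fsspec's `simplecache` layer looks like. The in-memory target filesystem, the `/demo/example.txt` path, and the temporary cache directory are placeholders chosen so the snippet runs offline; in this discussion the target would presumably be `gs` and the cache directory whatever fv3config is configured to use. `read_through_cache` is a hypothetical helper, not part of fv3config.

```python
import tempfile

import fsspec


def read_through_cache(path="/demo/example.txt", payload=b"hello"):
    """Read a 'remote' file via fsspec's simplecache layer (illustrative only)."""
    # Stand-in for the remote store: fsspec's in-memory filesystem.
    remote = fsspec.filesystem("memory")
    remote.pipe(path, payload)

    # Placeholder cache location; a real setup would point at a persistent dir.
    cache_dir = tempfile.mkdtemp()

    # SimpleCacheFileSystem copies the whole file into cache_storage on first
    # open and serves later reads from that local copy.
    cached = fsspec.filesystem(
        "simplecache",
        target_protocol="memory",
        cache_storage=cache_dir,
    )
    with cached.open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    print(read_through_cache())
```

Whether this could replace the existing caching code as-is is an open question here; the sketch only shows the shape of the fsspec API.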

spencerkclark commented 2 years ago

Thanks!

> Are you saying that FV3CONFIG_CACHE_DIR is not being used?

Yes, that is what it seems like.
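One quick way to sanity-check this kind of behavior is to compare the path actually used against what the environment variable says. Below is a hedged sketch of the lookup one would expect; `resolve_cache_dir` and `DEFAULT_CACHE_DIR` are hypothetical names (the default is only a guess based on the path in the traceback), and this is not fv3config's actual implementation.

```python
import os

ENV_VAR = "FV3CONFIG_CACHE_DIR"
# Hypothetical fallback, guessed from the path shown in the traceback.
DEFAULT_CACHE_DIR = os.path.join(os.path.expanduser("~"), ".cache", "fv3gfs")


def resolve_cache_dir(environ=None, default=DEFAULT_CACHE_DIR):
    """Return the cache dir from the environment variable, else the default."""
    if environ is None:
        environ = os.environ
    # An unset or empty variable falls back to the default location.
    return environ.get(ENV_VAR) or default


if __name__ == "__main__":
    print("expected cache directory:", resolve_cache_dir())
```

If the printed directory differs from the one in the error message, the environment variable is being ignored somewhere along the way.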

nbren12 commented 2 years ago

OK, that seems like an orthogonal bug. As for the original bug, it seems like some of the parent directories are being created as files rather than directories:

$ file /home/noahb/.cache/fv3gfs/fv3config-cache/rel/gs/vcm-fv3config/data/initial_conditions/gfs_c12_example/v1.0
/home/noahb/.cache/fv3gfs/fv3config-cache/rel/gs/vcm-fv3config/data/initial_conditions/gfs_c12_example/v1.0: empty
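That would explain the traceback: `os.makedirs(..., exist_ok=True)` only tolerates an existing directory, so it still raises `FileExistsError` when the path exists as a regular file. A small standalone reproduction, using a temporary directory and a placeholder file name rather than the real cache path:

```python
import os
import tempfile


def reproduce():
    """Show that makedirs(exist_ok=True) fails when the path is a file."""
    with tempfile.TemporaryDirectory() as root:
        blocker = os.path.join(root, "v1.0")
        # Create an empty regular file where a directory is expected.
        open(blocker, "w").close()
        try:
            os.makedirs(blocker, exist_ok=True)
        except FileExistsError as err:
            return type(err).__name__
    return None


if __name__ == "__main__":
    print(reproduce())
```

This also fits with removing the directory not helping: if the empty file is recreated on every run, the error comes back each time.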

nbren12 commented 2 years ago

More debugging: there appear to be "assets" without a source_name:

(Pdb) p asset_list[0]
{'source_location': 'gs://vcm-fv3config/data/initial_conditions/gfs_c12_example/v1.0', 'source_name': '', 'target_location': 'INPUT/', 'target_name': '', 'copy_method': 'copy'}
(Pdb) p asset_list[1]
{'source_location': 'gs://vcm-fv3config/data/initial_conditions/gfs_c12_example/v1.0', 'source_name': 'gfs_ctrl.nc', 'target_location': 'INPUT/', 'target_name': 'gfs_ctrl.nc', 'copy_method': 'copy'}
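To make the difference between those two entries easier to spot in a long asset list, here is a rough sketch that separates assets naming a file from those with an empty `source_name`. The `split_assets` helper is hypothetical (not part of fv3config), and it only illustrates the diagnosis; it is not meant as the fix.

```python
def split_assets(asset_list):
    """Split assets into (those naming a file, those with empty source_name)."""
    file_assets, nameless = [], []
    for asset in asset_list:
        if asset["source_name"]:
            file_assets.append(asset)
        else:
            nameless.append(asset)
    return file_assets, nameless


# The two entries printed in the pdb session above.
location = "gs://vcm-fv3config/data/initial_conditions/gfs_c12_example/v1.0"
asset_list = [
    {"source_location": location, "source_name": "",
     "target_location": "INPUT/", "target_name": "", "copy_method": "copy"},
    {"source_location": location, "source_name": "gfs_ctrl.nc",
     "target_location": "INPUT/", "target_name": "gfs_ctrl.nc", "copy_method": "copy"},
]

file_assets, nameless = split_assets(asset_list)

if __name__ == "__main__":
    print(len(file_assets), "file asset(s),", len(nameless), "nameless asset(s)")
```

The nameless entry points at the `v1.0` directory itself, which matches the path that shows up as an empty file in the cache.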