Open nbren12 opened 4 years ago
Fix will be pretty simple, just have to change cache directory from /inputdata to something any user should be able to access, like /tmp/inputdata.
Edit: better yet, /var/cache/inputdata.
Edit2: or even symlink /inputdata to /var/cache/inputdata.
Note this bug won't actually occur if you build an image from the latest fv3gfs-python master branch, because the new fv3config changes have not yet been checked out in fv3gfs-python. But this has to be fixed before then.
Sorry this caught you @nbren12. I pushed an fv3gfs-python image that was built from my fix/oversubscribe fv3config branch, which included the new caching changes. What's the best approach in terms of tagging/versioning images?
I just looked up the exact "digest" like this:
gcloud container images list-tags --format='json' us.gcr.io/vcm-ml/fv3gfs-python
And then used an image name like this:
docker run us.gcr.io/vcm-ml/fv3gfs-python@<digest>
That might be a good pattern going forward.
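The digest lookup above could also be scripted. This is only a minimal sketch: it assumes the JSON printed by the list-tags command is a list of objects with "digest" and "tags" fields, and the helper name pinned_reference and the sample data are made up for illustration, not part of gcloud or fv3gfs-python.

```python
import json


def pinned_reference(repository, list_tags_json, tag):
    """Return 'repository@digest' for the entry that carries the given tag.

    Assumption: each entry in the JSON list has 'digest' and 'tags' keys.
    """
    for entry in json.loads(list_tags_json):
        if tag in entry.get("tags", []):
            return f"{repository}@{entry['digest']}"
    raise KeyError(f"no image tagged {tag!r} in {repository}")


# Made-up sample data, for illustration only.
sample = json.dumps([
    {"digest": "sha256:aaaa", "tags": ["latest"]},
    {"digest": "sha256:bbbb", "tags": []},
])
print(pinned_reference("us.gcr.io/vcm-ml/fv3gfs-python", sample, "latest"))
```

The returned string can be passed to docker run in place of a tag, so the workflow keeps using the same image even if the tag is later moved.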
@mcgibbon Didn't the data used to go in appdirs? That might be a more robust cross-platform solution than using folders like the ones you mention.
It still uses appdirs. The docker image just uses an environment variable to override the directory. The appdirs default is the user cache directory, so it would cause fv3run to fail (that was actually when we made the environment variable override).
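A rough sketch of the override logic described here, not the actual fv3config code. The environment variable name FV3CONFIG_CACHE_DIR and the function resolve_cache_dir are assumptions for illustration, and the ~/.cache fallback is a standard-library stand-in for the appdirs user cache directory.

```python
import os


def resolve_cache_dir(environ=None, env_var="FV3CONFIG_CACHE_DIR"):
    """Pick the cache directory: environment override first, user cache otherwise.

    env_var is a hypothetical name; the fallback path only approximates
    what appdirs would return for the user cache directory on Linux.
    """
    environ = os.environ if environ is None else environ
    override = environ.get(env_var)
    if override:
        # e.g. set in the Docker image so every user reads the same cache
        return override
    return os.path.join(os.path.expanduser("~"), ".cache", "fv3config")


print(resolve_cache_dir({"FV3CONFIG_CACHE_DIR": "/var/cache/inputdata"}))
```

With an override like this, the image only works for every user if the directory it points to is accessible to all of them, which is why /var/cache/inputdata or /tmp/inputdata is a safer target than /inputdata.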
I think this is part of a larger issue that we should start versioning our products in general. At least, fv3config and fv3gfs-python should probably have proper major.minor.bugfix versions. Then fv3gfs-python images pushed by circleci would be named e.g. us.gcr.io/vcm-ml/fv3gfs-python:0.1.2. When we make images manually we would push with some other tag (e.g. us.gcr.io/vcm-ml/fv3gfs-python:0.1.2-oliwm).
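The naming scheme proposed above could be captured in a small helper. This is a sketch only: image_name and its suffix argument are hypothetical, and the version check accepts just plain major.minor.bugfix numbers.

```python
import re

# Plain major.minor.bugfix, e.g. "0.1.2"
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def image_name(repository, version, suffix=None):
    """Build 'repository:version', or 'repository:version-suffix' for manual pushes."""
    if not VERSION_PATTERN.fullmatch(version):
        raise ValueError(f"expected major.minor.bugfix, got {version!r}")
    tag = version if suffix is None else f"{version}-{suffix}"
    return f"{repository}:{tag}"


repo = "us.gcr.io/vcm-ml/fv3gfs-python"
print(image_name(repo, "0.1.2"))           # name for a CI-pushed image
print(image_name(repo, "0.1.2", "oliwm"))  # name for a manually pushed image
```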
If you want to pin a workflow to a particular image, using the digest is a good way to go.
After some changes @oliverwm1 pushed today, the GCR image is now broken. Here is a minimum working example:
@mcgibbon explained the problem on slack in this way