Open thodson-usgs opened 1 year ago
Hmm, I was following the instructions from the README, but perhaps I should have created the issue before the PR
I've so far been unable to run this using pangeo-forge-runner --prune with Direct Runner.
The run spits out lots of output ending with grpc.FutureTimeoutError.
Not sure if this is a problem with the recipe, my environment, or my hardware.
Hey @thodson-usgs, taking a look at running your recipe locally. Are you on the ESIP slack?
@norlandrhagen, Yes, I'll follow up with you there.
For the record, you identified an error in my recipe; however, my run crashes before that point so I probably need to take a closer look at my configuration file.
Thanks!
Nice fix! I'm now running into:
AttributeError: 'ZipExtFile' object has no attribute 'size' [while running 'Create|OpenURLWithFSSpec|Preprocess|StoreToZarr/OpenURLWithFSSpec/MapWithConcurrencyLimit/open_url (max_concurrency=1)']
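For anyone hitting the same message: it can be reproduced with the standard library alone. The sketch below is not the pangeo-forge code path; it only shows that the object returned when opening a member of a zip archive is a `ZipExtFile` with no `size` attribute, and that the uncompressed size is available from the matching `ZipInfo` entry. Which part of the pipeline reads `.size` is an assumption here, and the archive and member name are made up for illustration.

```python
import io
import zipfile

# Build a small in-memory zip so the example needs no network access.
# "example.tif" and its contents are placeholders, not real data.
payload = b"fake tif bytes"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example.tif", payload)

with zipfile.ZipFile(buffer) as archive:
    info = archive.getinfo("example.tif")
    with archive.open("example.tif") as member:
        member_type = type(member).__name__   # class of the opened member
        has_size = hasattr(member, "size")    # the attribute the error complains about
        data = member.read()
    member_size = info.file_size              # uncompressed size from the zip metadata

print(member_type, has_size, member_size)
```

If some caller needs a size, reading it from `ZipInfo.file_size` (or reading the member fully into memory first) avoids touching the missing attribute.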
Interesting, I'd been getting an error about opening the zip, but not that one. In general, I've been testing on the several environments on hand: Ubuntu on WSL2, ESIP-nebari, and HPC. Each one gives a unique error...smells like an environment issue.
Next steps: try max_concurrency=1, and if that fails, avoid fsspec entirely and open the zip URL directly with rioxarray.
One question: what type of system are you testing with?
And thank you again, @norlandrhagen
Ah strange! Happy to help further. I'm on an m1 mac. I'm creating a conda/mamba env and installing pangeo-forge-recipes there + rioxarray.
Progress,
I don't understand why OpenURLWithFSSpec failed (this all worked fine when I tested with fsspec), but I can open the zipped TIFs directly from rioxarray.
Now I get
AttributeError: 'Dataset' object has no attribute 'encode' [while running 'Create|Preprocess|StoreToZarr/Preprocess/Map(_preproc)']
Maybe it's time to wade a bit deeper into Beam...
I changed one line and now the recipe runs without error.
def _preproc(item: Indexed[T]) -> Indexed[T]:
to
def _preproc(item: Indexed[T]) -> Indexed[xr.Dataset]:
At the next pangeo-forge meeting I'll follow up on why fsspec didn't work.
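To make the one-line `_preproc` change above concrete, here is a small sketch of what differs between the two signatures as seen by anything that inspects annotations. Beam can derive type hints from function annotations, so this may be why the change mattered, but that link is a guess and is not confirmed in this thread. `Index`, `Dataset`, and `Indexed` below are stand-ins defined only for this example, not the real pangeo-forge-recipes or xarray types.

```python
from typing import Tuple, TypeVar, get_args, get_type_hints

T = TypeVar("T")


class Index(dict):
    """Hypothetical stand-in for the recipe's index type."""


class Dataset:
    """Hypothetical stand-in for xr.Dataset."""


# Assumed shape of the alias: an (index, payload) pair.
Indexed = Tuple[Index, T]


def preproc_generic(item: Indexed[T]) -> Indexed[T]:
    # Original signature: the payload type of the return value is left open.
    return item


def preproc_concrete(item: Indexed[T]) -> Indexed[Dataset]:
    # Changed signature: the payload type of the return value is spelled out.
    return item


generic_return = get_type_hints(preproc_generic)["return"]
concrete_return = get_type_hints(preproc_concrete)["return"]

# Second element of the tuple annotation is the payload type.
generic_payload = get_args(generic_return)[1]
concrete_payload = get_args(concrete_return)[1]

print(generic_payload, concrete_payload)
```

With the generic signature the declared payload is an unresolved type variable; with the concrete one it is a named class, which gives an annotation-driven framework something definite to work with.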
sorry about that title change foobar ☝️ @thodson-usgs 😆 I am going to try to run this on my cluster as a data point and was creating a ticket of a similar name in a different tab
@ranchodeluxe, this recipe was a bit of a test point for us as well. USGS has a legacy of zipping tiffs, and I was demonstrating that pangeo-forge could handle that pattern. We did get it working, but it might have exposed another bug (https://github.com/pangeo-forge/pangeo-forge-recipes/issues/659). And then I got sidetracked working on the flink runner. Feel free to close this.
name: Recipe
about: Demonstrating pangeo forge pipeline to USGS.
title: SSEBOP
Dataset
SSEBOP is an evapotranspiration dataset covering CONUS at 1 km² daily resolution.