leap-stc / data-management

Collection of code to manually populate the persistent cloud bucket with data
https://catalog.leap.columbia.edu/
Apache License 2.0

Add eNATL recipe #75

Open jbusecke opened 9 months ago

jbusecke commented 9 months ago

Towards #73

DO NOT MERGE AS IS. HIGHLY EXPERIMENTAL

jbusecke commented 9 months ago

pre-commit.ci autofix

jbusecke commented 9 months ago

@cisaacstern this failed with the same error I encountered earlier. Could you take a look at this?

jbusecke commented 9 months ago

Trying out https://github.com/pangeo-forge/deploy-recipe-action/pull/27. There is probably a better way to manage this, but let's see.

jbusecke commented 9 months ago

Ok, I was able to deploy this using the super hacky changes made in https://github.com/pangeo-forge/deploy-recipe-action/pull/27.

I will check later whether the Dataflow job ran successfully.

But perhaps more importantly, we need to see how the discussion over at https://github.com/pangeo-forge/deploy-recipe-action/pull/27 goes before moving forward here. Sorry for the delay.

jbusecke commented 9 months ago

pre-commit.ci autofix

jbusecke commented 9 months ago

Yay, this worked!

[screenshot]

@auraoupa do you have access to the leap hub? You can inspect the dataset with the following snippet:

import xarray as xr
path = 'gs://leap-persistent-ro/data-library/enatl60-blbt02-595733423-7175544257-1/eNATL60_BLBT02.zarr'
ds = xr.open_dataset(path, engine='zarr', chunks={})
ds

Two things I noticed:

jbusecke commented 9 months ago

There is something even weirder going on around the halo land values:

ds = ds.where(abs(ds)<1e20)
ds['vosaline'].isel(time_counter=0)

gives me:

[screenshot of the plotted field]

What is the best way to get rid of those low values on land in a reliable way?
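
One possible stopgap is to mask both sentinel values at once. The helper below is a hypothetical sketch, not code from this recipe; note that filtering exact zeros would also clobber any legitimate zero values, which is why masking with the model's official mask files is the more reliable option.

```python
import xarray as xr

# Sketch of a combined land mask (hypothetical helper): treat both the
# 1e20 fill value and exact zeros as land. Caution: this also masks any
# legitimate zero values in the ocean, so applying the model's official
# tmask/umask/vmask is preferable when those files are available.
def mask_land(da):
    return da.where((abs(da) < 1e20) & (da != 0))
```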

jbusecke commented 9 months ago

Finally, we should think about the chunking of the final product. These are all things we can and should discuss before we have the deployment figured out.

auraoupa commented 9 months ago

Thank you @jbusecke for advancing so quickly on this! About your remarks:

jbusecke commented 9 months ago

Thanks for the quick response @auraoupa.

Let's deal with the most challenging issue first:

there are two masking values inside the variable: one for land when the computing processor was all land (1e+20) and one for land when the computing processor had both ocean and land (0). We handle this by using the official mask from this file, which is tmask for T variables, umask and vmask for U, V, etc. Maybe I should have processed the data in advance so it is already masked with NaN in the proper areas; can that be done via a pangeo-forge recipe? Or do we upload the mask and grid files alongside the data?

I think ideally each file would contain the masks as coordinates; then we could apply the masking on each file and also retain the masks in the final output (this might be very important for budget analysis etc.).
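
As a sketch of that idea (the variable names follow NEMO conventions, but this helper and the exact mask layout are assumptions, not part of the recipe):

```python
import xarray as xr

def apply_tmask(ds, mask):
    """Attach the static land mask as a coordinate and mask T-point variables.

    `mask` is a dataset holding 'tmask' (1 = ocean, 0 = land). Hypothetical
    sketch: in practice the mask would come from the official grid files.
    """
    tmask = mask["tmask"].squeeze(drop=True)  # drop singleton dims if present
    ds = ds.assign_coords(tmask=tmask)
    for name, da in ds.data_vars.items():
        if set(tmask.dims) <= set(da.dims):  # only variables on the T grid
            ds[name] = da.where(ds["tmask"] == 1)
    return ds
```

Keeping `tmask` as a coordinate means it survives into the Zarr store alongside the data variables.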

I have raised https://github.com/pangeo-forge/pangeo-forge-recipes/issues/663 to discuss this more broadly. Just as a heads up, this will probably not move before next week earliest, since folks are at AGU.

about the time index name, time_counter or time, both work for me

I already renamed to 'time' hehe.

about the chunking, I usually do something like : {'time_counter':1, 'x':1000,'y':1000} but you can adjust it if you feel like it has to be bigger or smaller

That seems fairly small to me. I would aim for chunk sizes in the 100-200 MB range, but this is a detail we can discuss at the end.
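
For reference, a float32 chunk holds 4 bytes per value, so `{'time_counter': 1, 'x': 1000, 'y': 1000}` is only about 4 MB. A rough back-of-the-envelope helper (illustrative only; the larger chunk shape below is just an example, not a recommendation):

```python
# Rough chunk-size arithmetic for float32 data (4 bytes per value).
def chunk_mb(**sizes):
    n = 1
    for s in sizes.values():
        n *= s
    return n * 4 / 1e6  # size in MB

chunk_mb(time_counter=1, x=1000, y=1000)   # ~4 MB, fairly small
chunk_mb(time_counter=24, x=1200, y=1200)  # ~138 MB, in the 100-200 MB range
```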

auraoupa commented 8 months ago

Hi @jbusecke, I hope you had a nice end of 2023, and I wish you the best for 2024! I reprocessed my dataset so that the land-mask values are uniform, and corrected the nav_lat and nav_lon coordinates at the same time (there was missing data on land). Maybe it would be faster to process these files instead of trying to do the masking with pangeo-forge? I just added the new name and Zenodo record in the eNATL60 feedstock.

jbusecke commented 8 months ago

Thanks for doing this @auraoupa. This might unblock us here. I will keep track of it over at pgf.

jbusecke commented 8 months ago

Seems like we are getting an error that time_counter is not found. I suspect that was renamed, @auraoupa? I'll check on that quickly.

jbusecke commented 8 months ago

Ok, I can confirm the dimension is now named 'time':

[screenshot]

jbusecke commented 8 months ago

@auraoupa, should 't_mask' be dependent on time? That seems like an error to me. It's easily fixable though!
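
If it does turn out to be an error, one possible fix in the preprocessing step is to keep a single snapshot of the mask and drop the dimension. This is a hypothetical sketch, assuming the mask really is static and the dimension is named 'time':

```python
import xarray as xr

# Hypothetical helper: drop a spurious time dimension from a static mask
# variable by keeping the first snapshot.
def make_mask_static(ds, mask_var="t_mask", time_dim="time"):
    if time_dim in ds[mask_var].dims:
        ds[mask_var] = ds[mask_var].isel({time_dim: 0}, drop=True)
    return ds
```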

jbusecke commented 8 months ago

pre-commit.ci autofix

jbusecke commented 8 months ago

Well, that's a new one (cc @cisaacstern):

Error message from worker: Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1435, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 851, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 997, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/transforms/core.py", line 1961, in <lambda>
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/pangeo_forge_recipes/aggregation.py", line 285, in schema_to_zarr
    ds.to_zarr(target_store, mode="w", compute=False)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/core/dataset.py", line 2521, in to_zarr
    return to_zarr(  # type: ignore[call-overload,misc]
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/api.py", line 1832, in to_zarr
    dump_to_store(dataset, zstore, writer, encoding=encoding)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/api.py", line 1362, in dump_to_store
    store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/zarr.py", line 657, in store
    self.set_variables(
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/zarr.py", line 779, in set_variables
    writer.add(v.data, zarr_array, region)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/common.py", line 241, in add
    target[region] = source
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/core.py", line 1495, in __setitem__
    self.set_orthogonal_selection(pure_selection, value, fields=fields)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/core.py", line 1682, in set_orthogonal_selection
    indexer = OrthogonalIndexer(selection, self)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/indexing.py", line 620, in __init__
    dim_indexer = SliceDimIndexer(dim_sel, dim_len, dim_chunk_len)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/indexing.py", line 182, in __init__
    self.nchunks = ceildiv(self.dim_len, self.dim_chunk_len)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/indexing.py", line 167, in ceildiv
    return math.ceil(a / b)
ZeroDivisionError: division by zero

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 300, in _execute
    response = task()
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 375, in <lambda>
    lambda: self.create_worker().do_instruction(request), request)
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 639, in do_instruction
    return getattr(self, request_type)(
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 677, in process_bundle
    bundle_processor.process_bundle(instruction_id))
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1113, in process_bundle
    input_op_by_transform_id[element.transform_id].process_encoded(
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/bundle_processor.py", line 237, in process_encoded
    self.output(decoded_value)
  File "apache_beam/runners/worker/operations.py", line 570, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 572, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 263, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 266, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 953, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 954, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1437, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1526, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1435, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 636, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "apache_beam/runners/common.py", line 1621, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "apache_beam/runners/common.py", line 1734, in apache_beam.runners.common._OutputHandler._write_value_to_tag
  File "apache_beam/runners/worker/operations.py", line 266, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 953, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 954, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1437, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1526, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1435, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 636, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "apache_beam/runners/common.py", line 1621, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "apache_beam/runners/common.py", line 1734, in apache_beam.runners.common._OutputHandler._write_value_to_tag
  File "apache_beam/runners/worker/operations.py", line 266, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 953, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 954, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1437, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1526, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1435, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 851, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 995, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "apache_beam/runners/common.py", line 1621, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "apache_beam/runners/common.py", line 1734, in apache_beam.runners.common._OutputHandler._write_value_to_tag
  File "apache_beam/runners/worker/operations.py", line 352, in apache_beam.runners.worker.operations.GeneralPurposeConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 951, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 953, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 954, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1437, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1547, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1435, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 851, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 997, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/transforms/core.py", line 1961, in <lambda>
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/pangeo_forge_recipes/aggregation.py", line 285, in schema_to_zarr
    ds.to_zarr(target_store, mode="w", compute=False)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/core/dataset.py", line 2521, in to_zarr
    return to_zarr(  # type: ignore[call-overload,misc]
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/api.py", line 1832, in to_zarr
    dump_to_store(dataset, zstore, writer, encoding=encoding)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/api.py", line 1362, in dump_to_store
    store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/zarr.py", line 657, in store
    self.set_variables(
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/zarr.py", line 779, in set_variables
    writer.add(v.data, zarr_array, region)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/xarray/backends/common.py", line 241, in add
    target[region] = source
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/core.py", line 1495, in __setitem__
    self.set_orthogonal_selection(pure_selection, value, fields=fields)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/core.py", line 1682, in set_orthogonal_selection
    indexer = OrthogonalIndexer(selection, self)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/indexing.py", line 620, in __init__
    dim_indexer = SliceDimIndexer(dim_sel, dim_len, dim_chunk_len)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/indexing.py", line 182, in __init__
    self.nchunks = ceildiv(self.dim_len, self.dim_chunk_len)
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/zarr/indexing.py", line 167, in ceildiv
    return math.ceil(a / b)
ZeroDivisionError: division by zero [while running 'Create|OpenURLWithFSSpec|OpenWithXarray|Preprocess|StoreToZarr/StoreToZarr/PrepareZarrTarget/Map(schema_to_zarr)-ptransform-42']

Let me change the target_chunks to see if this goes away.
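
One plausible cause of a zero chunk length in `ceildiv` is a target_chunks key that no longer matches any dimension of the dataset (e.g. 'time_counter' after the rename to 'time'). A small pre-deployment sanity check could catch this early; the helper below is hypothetical and not part of pangeo-forge-recipes:

```python
import xarray as xr

# Hypothetical sanity check: fail fast if target_chunks references a
# dimension the dataset lacks, or requests a non-positive chunk length.
def validate_target_chunks(ds, target_chunks):
    for dim, size in target_chunks.items():
        if dim not in ds.dims:
            raise KeyError(
                f"target_chunks key {dim!r} is not a dimension of the dataset"
            )
        if size <= 0:
            raise ValueError(
                f"chunk size for {dim!r} must be positive, got {size}"
            )
```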

jbusecke commented 8 months ago

Ok, now I am getting yet another error that I cannot quite grok:

Error message from worker: Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1435, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 851, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 995, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "apache_beam/runners/common.py", line 1611, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/pangeo_forge_recipes/rechunking.py", line 74, in split_fragment
    raise ValueError("A dimsize of 0 means that this fragment has not been properly indexed.")
ValueError: A dimsize of 0 means that this fragment has not been properly indexed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 300, in _execute
    response = task()
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 375, in <lambda>
    lambda: self.create_worker().do_instruction(request), request)
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 639, in do_instruction
    return getattr(self, request_type)(
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 677, in process_bundle
    bundle_processor.process_bundle(instruction_id))
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1113, in process_bundle
    input_op_by_transform_id[element.transform_id].process_encoded(
  File "/usr/local/lib/python3.10/site-packages/apache_beam/runners/worker/bundle_processor.py", line 237, in process_encoded
    self.output(decoded_value)
  File "apache_beam/runners/worker/operations.py", line 570, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 572, in apache_beam.runners.worker.operations.Operation.output
  File "apache_beam/runners/worker/operations.py", line 263, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 266, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 953, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 954, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1437, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1526, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1435, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 851, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 995, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "apache_beam/runners/common.py", line 1621, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "apache_beam/runners/common.py", line 1734, in apache_beam.runners.common._OutputHandler._write_value_to_tag
  File "apache_beam/runners/worker/operations.py", line 266, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive
  File "apache_beam/runners/worker/operations.py", line 953, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/worker/operations.py", line 954, in apache_beam.runners.worker.operations.DoOperation.process
  File "apache_beam/runners/common.py", line 1437, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 1547, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1435, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 851, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 995, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "apache_beam/runners/common.py", line 1611, in apache_beam.runners.common._OutputHandler.handle_process_outputs
  File "/opt/apache/beam-venv/beam-venv-worker-sdk-0-0/lib/python3.10/site-packages/pangeo_forge_recipes/rechunking.py", line 74, in split_fragment
    raise ValueError("A dimsize of 0 means that this fragment has not been properly indexed.")
ValueError: A dimsize of 0 means that this fragment has not been properly indexed. [while running 'Create|OpenURLWithFSSpec|OpenWithXarray|Preprocess|StoreToZarr/StoreToZarr/Rechunk/FlatMap(split_fragment)-ptransform-35']

@cisaacstern, could we dig into this in the coming days? Sorry, this will still be blocked for now, @auraoupa.

auraoupa commented 8 months ago

Seems like we are getting an error that time_counter is not found. I suspect that was renamed @auraoupa? Ill check on that quickly.

Yes, I forgot that; it is 'time' now.

@auraoupa should 't_mask' be dependent on time? That seems like an error to me. Its easily fixable though!

No, it is indeed not dependent on time; sorry I missed that.

Thanks, and good luck with the unusual errors...

SammyAgrawal commented 3 months ago

Fixing:

jbusecke commented 3 months ago

@SammyAgrawal can you move any further discussion to https://github.com/leap-stc/eNATL_feedstock and close this?