Open norlandrhagen opened 6 days ago
Ohhhh that looks really cool! @dhruvbalwada might be interested in this? I guess this will work, but might not be as fast as on gcs. Wondering if we should have a badge for the 'cloud'? But either way, this would be dope to link in.
Would be great to link to this!
Great! @dhruvbalwada if you have some background on this dataset, do you have any interest in doing a bit of exploring to see which of these Zarr stores would be useful? Seems like there are Zarr stores per variable as well as Lagrangian vs. Eulerian versions.
@norlandrhagen I think all of these will potentially be useful. (This dataset is very complementary to the LLC4320 data that was made available through Pangeo and has been used by many.)
Is the discussion here just about providing a link to these datasets? Or is this something that will cost LEAP, so that we have some resource constraints?
@dhruvbalwada the former. It will be very beneficial to get an idea of how to present these stores in the catalog in a meaningful way.
Happy to help with that, let me know what you would like me to actually do.
Awesome! Thanks for the expertise @dhruvbalwada.
I think a good start would be to see if you can access / catalog these Zarr stores.
I think the data is here, but I haven't explored it yet.
Also might be some clues here.
The data producer / speaker, Shane Elipot, seems super nice and was eager to have people using his data. I bet you/we could reach out to him with questions.
I think ideally we have a table of Zarr stores we want to add to the catalog + some metadata.
ex:
| dataset_name_variable | zarr store link |
|---|---|
| lagrangian_HYCOM_u_component | s3://../../lagrangian_HYCOM_u_component.zarr |
| lagrangian_HYCOM_v_component | s3://../../lagrangian_HYCOM_v_component.zarr |
Just played around with the data a bit and wanted to note some points. Opening a store through an anonymous s3fs mapper works:
import s3fs
import xarray as xr
fs = s3fs.S3FileSystem(anon=True)
mapper = fs.get_mapper("s3://hycom-global-drifters/lagrangian/global_hycom_0m_step_1.zarr")
xr.open_dataset(mapper, engine='zarr')
but this doesn't:
xr.open_dataset("s3://hycom-global-drifters/lagrangian/global_hycom_0m_step_1.zarr", engine='zarr')
We might need a way for the catalog to add custom kwargs to the snippet due to this!
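One way the catalog could carry custom kwargs is sketched below. Everything here is hypothetical: `CatalogEntry`, its `open_kwargs` field, and `render_snippet` are made-up names, not part of any existing catalog code. The `storage_options={"anon": True}` value is also an assumption; it is the usual way to ask xarray's zarr backend for anonymous S3 access, but it has not been tested against this bucket.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record: a store URL plus optional open kwargs."""

    name: str
    url: str
    open_kwargs: dict = field(default_factory=dict)


def render_snippet(entry: CatalogEntry) -> str:
    """Build the copy-paste xarray snippet text for one catalog entry."""
    args = [repr(entry.url), "engine='zarr'"]
    # Append any per-entry kwargs, e.g. storage_options for anonymous access.
    args += [f"{key}={value!r}" for key, value in entry.open_kwargs.items()]
    return "import xarray as xr\n\nds = xr.open_dataset(" + ", ".join(args) + ")"


entry = CatalogEntry(
    name="global_hycom_0m_step_1",
    url="s3://hycom-global-drifters/lagrangian/global_hycom_0m_step_1.zarr",
    # Assumption: anonymous access via storage_options; untested on this bucket.
    open_kwargs={"storage_options": {"anon": True}},
)
snippet = render_snippet(entry)
print(snippet)
```

Entries that need no extra arguments would leave `open_kwargs` empty and get the plain `xr.open_dataset(url, engine='zarr')` snippet.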
'hycom-global-drifters/lagrangian/',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_1.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_10.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_11.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_2.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_3.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_4.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_5.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_6.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_7.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_8.zarr',
'hycom-global-drifters/lagrangian/global_hycom_0m_step_9.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_1.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_10.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_11.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_2.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_3.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_4.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_5.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_6.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_7.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_8.zarr',
'hycom-global-drifters/lagrangian/global_hycom_15m_step_9.zarr'
* This dataset has a lot of different 'steps'. I'm not sure whether we could virtually concatenate these.
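Before any concatenation, the stores would need to be grouped and put in numeric step order, since the listing is sorted lexicographically (`step_10` comes before `step_2`). A minimal sketch using only the standard library follows. `group_stores` is a hypothetical helper, and reading the `0m` / `15m` part of the name as a depth is an assumption based on the file names alone.

```python
import re
from collections import defaultdict

# A few keys copied from the listing (the full listing has 22 stores).
keys = [
    "hycom-global-drifters/lagrangian/",
    "hycom-global-drifters/lagrangian/global_hycom_0m_step_1.zarr",
    "hycom-global-drifters/lagrangian/global_hycom_0m_step_10.zarr",
    "hycom-global-drifters/lagrangian/global_hycom_0m_step_2.zarr",
    "hycom-global-drifters/lagrangian/global_hycom_15m_step_1.zarr",
    "hycom-global-drifters/lagrangian/global_hycom_15m_step_10.zarr",
    "hycom-global-drifters/lagrangian/global_hycom_15m_step_2.zarr",
]

STORE_PATTERN = re.compile(r"global_hycom_(\d+)m_step_(\d+)\.zarr$")


def group_stores(paths):
    """Group store paths by the '<N>m' label and sort each group by step number."""
    groups = defaultdict(list)
    for path in paths:
        match = STORE_PATTERN.search(path)
        if match is None:
            continue  # skip entries that are not stores, e.g. the bare prefix
        level, step = int(match.group(1)), int(match.group(2))
        groups[level].append((step, path))
    # Sort levels, then sort each level's stores numerically by step.
    return {
        level: [path for _, path in sorted(items)]
        for level, items in sorted(groups.items())
    }


grouped = group_stores(keys)
for level, stores in grouped.items():
    print(level, [s.rsplit("/", 1)[-1] for s in stores])
```

Each ordered list could then be handed to whatever does the combining, one group at a time.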
This seems like a cool use case! Maybe we open up an issue in VirtualiZarr. It seems possible to merge the virtual Zarr stores.
I'm at the pangeo showcase talk. Shane Elipot has a massive public ocean model Zarr output on the AWS public data program. I think it's split into 12 separate Zarr stores.
https://github.com/selipot/hycom-oceantrack?tab=readme-ov-file
Wondering if LEAP folks would find this useful? @jbusecke