numfocus / project-summit-2024-unconference

This repository will be used to gather unconference topics and facilitate voting on topics for the NumFOCUS 2024 Project Summit.

large data via chunking/sharding/cloud #20

Open vjcitn opened 2 months ago

vjcitn commented 2 months ago

Open software for large data analysis tasks leads to problems of compute strategy and large data conveyance. How does your project deal with this?

Session Notes

TomNicholas commented 2 months ago

This is literally what Zarr, a NumFOCUS-sponsored project, is made for! I'm happy to talk about how we use Zarr for climate science and oceanography data, and point you towards others who use Zarr in biomedical applications.

EDIT, regarding "compute strategy":

If you're also wondering about parallel execution frameworks, I run a Pangeo working group on this and would be happy to talk about all our experiences doing very large (TB- and PB-scale) array computations in the cloud.
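For readers new to the idea, here is a minimal sketch of what chunked computation means: the array is split on a regular chunk grid and a reduction is accumulated one chunk at a time, so only a single chunk needs to be in memory at once. It uses only the Python standard library. The names `chunk_slices`, `chunked_sum`, and `read_chunk` are hypothetical, and the nested list stands in for a real chunked store; this is not Zarr's API or any Pangeo tool.

```python
from itertools import product


def chunk_slices(shape, chunks):
    """Yield one tuple of slices per chunk of a regular chunk grid."""
    per_axis = [
        [slice(start, min(start + step, size)) for start in range(0, size, step)]
        for size, step in zip(shape, chunks)
    ]
    yield from product(*per_axis)


def chunked_sum(read_chunk, shape, chunks):
    """Sum an array by visiting one chunk at a time."""
    total = 0
    for region in chunk_slices(shape, chunks):
        # Only the values of the current chunk are held in memory here.
        total += sum(read_chunk(region))
    return total


# Stand-in for a chunked store: a small 4 x 6 in-memory array (assumption).
data = [[row * 6 + col for col in range(6)] for row in range(4)]


def read_chunk(region):
    """Return the values inside one (row_slice, col_slice) region as a flat list."""
    rows, cols = region
    return [value for row in data[rows] for value in row[cols]]


result = chunked_sum(read_chunk, shape=(4, 6), chunks=(3, 4))
```

Because each chunk is read and reduced independently, the same loop could in principle be distributed across workers, which is where a parallel execution framework would come in.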

vjcitn commented 2 months ago

great!

InessaPawson commented 2 months ago

Thanks for proposing the topic, @vjcitn!

This unconference session is scheduled for 5 pm ET in the Clara Barton Room.

For full schedule, visit: https://www.nfsummit24.com/schedule.