HotspotStoplight / Climate


Parallelize operations with Apache Beam #32

Closed: nlebovits closed this issue 1 week ago

nlebovits commented 7 months ago

Parallelizing our operations with Apache Beam should let us process imagery more efficiently.

This may involve the Earth Engine high-volume endpoint; I'm not sure.

Note that our images are currently stored as COGs (Cloud Optimized GeoTIFFs). Maybe we should switch to Zarr? Not sure: https://zarr.readthedocs.io/en/stable/

See also: https://developers.google.com/earth-engine/guides/data_extraction
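
A rough sketch of what the Beam side could look like, assuming we split the area of interest into tiles and hand each tile to a worker. `split_bbox`, `run_tiles`, and `process_tile` are hypothetical names, not anything in the repo yet, and the pipeline uses Beam's default local runner rather than Dataflow:

```python
from typing import Callable, Iterable, List, Tuple

# (west, south, east, north) in degrees; an assumed convention for this sketch.
BBox = Tuple[float, float, float, float]


def split_bbox(bbox: BBox, nx: int, ny: int) -> List[BBox]:
    """Split a bounding box into an nx-by-ny grid of equal tiles."""
    west, south, east, north = bbox
    dx = (east - west) / nx
    dy = (north - south) / ny
    return [
        (west + i * dx, south + j * dy, west + (i + 1) * dx, south + (j + 1) * dy)
        for j in range(ny)  # rows, south to north
        for i in range(nx)  # columns, west to east
    ]


def run_tiles(tiles: Iterable[BBox], process_tile: Callable[[BBox], object]) -> None:
    """Apply process_tile to every tile inside a Beam pipeline."""
    # Imported lazily so split_bbox can be used without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Create tiles" >> beam.Create(list(tiles))
            | "Process tile" >> beam.Map(process_tile)
        )
```

Something like `run_tiles(split_bbox(aoi, 4, 4), process_tile)` would then fan the work out over 16 tiles, where `aoi` and `process_tile` are whatever we end up defining.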

nlebovits commented 7 months ago

Yep, we'll have to follow the computePixels process here: https://developers.google.com/earth-engine/guides/data_extraction. The output can be written as a GeoTIFF.
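
For reference, a minimal sketch of a computePixels request that returns GeoTIFF bytes, based on my reading of that guide. `make_grid` and `fetch_geotiff` are hypothetical helpers, the caller is assumed to have already initialized `ee` (pointing it at the high-volume endpoint if we go that route), and the request layout should be double-checked against the docs:

```python
from typing import Any, Dict

# High-volume endpoint URL; pass it to ee.Initialize when initializing the client.
HIGH_VOLUME_URL = "https://earthengine-highvolume.googleapis.com"


def make_grid(
    west: float,
    north: float,
    width: int,
    height: int,
    scale: float,
    crs: str = "EPSG:4326",
) -> Dict[str, Any]:
    """Build the pixel grid for a patch whose top-left corner is (west, north)."""
    return {
        "dimensions": {"width": width, "height": height},
        "affineTransform": {
            "scaleX": scale,
            "shearX": 0,
            "translateX": west,
            "shearY": 0,
            "scaleY": -scale,  # negative: rows run from north to south
            "translateY": north,
        },
        "crsCode": crs,
    }


def fetch_geotiff(image: Any, grid: Dict[str, Any]) -> bytes:
    """Request one patch of an ee.Image as GeoTIFF bytes (assumes ee is initialized)."""
    import ee  # imported lazily so make_grid works without earthengine-api

    return ee.data.computePixels(
        {"expression": image, "fileFormat": "GEO_TIFF", "grid": grid}
    )
```

The returned bytes could be written straight to a `.tif` file or a bucket. Each Beam worker would call `fetch_geotiff` for its own tile, so patch sizes need to stay within the per-request limits in the guide.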

Here's an example notebook: https://colab.research.google.com/github/google/earthengine-community/blob/master/guides/linked/Earth_Engine_training_patches_computePixels.ipynb

And here's integration with Dataflow and Vertex AI: https://www.youtube.com/watch?v=2iiC1p69-EY

nlebovits commented 7 months ago

Here's the crucial notebook: https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/land-cover-classification/cloud-tensorflow.ipynb

nlebovits commented 7 months ago

Related slides: https://docs.google.com/presentation/d/1t-O15HdVeJITh9ejBqLwrHna_XCZSOgX13T9cYhacSA/edit#slide=id.p