pangeo-forge / pangeo-forge-cloud-federation

Infrastructure for running pangeo-forge across multiple bakeries
Apache License 2.0

Need a persistent efs volume for storing Flink metadata #17

Closed thodson-usgs closed 6 months ago

thodson-usgs commented 7 months ago

To recover from job-manager failures, Flink needs a persistent file system for storing job metadata. This could be S3 or EFS. Of the two, I have a slight preference for EFS, because USGS uses sessions, so the cluster might lose S3 access during the course of a long job.
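For illustration, the Flink settings involved could be sketched as follows. This is a minimal sketch, not the configuration this repository uses: the mount path `/opt/flink/volume` and the helper names `flink_ha_config` and `render` are hypothetical, and the sketch assumes the persistent volume is already mounted into the job-manager pod. Older Flink releases spell the first key `high-availability` rather than `high-availability.type`.

```python
def flink_ha_config(mount_path: str) -> dict[str, str]:
    """Build Flink options that keep recovery metadata on a mounted volume.

    mount_path is a hypothetical mount point for the persistent volume
    inside the Flink pods; it is an assumption, not a value from this repo.
    """
    base = "file://" + mount_path.rstrip("/")
    return {
        # Use Flink's Kubernetes high-availability services.
        # (Older Flink releases use the key "high-availability" instead.)
        "high-availability.type": "kubernetes",
        # Where the job manager persists the metadata it needs to recover.
        "high-availability.storageDir": f"{base}/ha",
        # Checkpoints and savepoints must also outlive a single pod.
        "state.checkpoints.dir": f"{base}/checkpoints",
        "state.savepoints.dir": f"{base}/savepoints",
    }


def render(conf: dict[str, str]) -> str:
    """Render the options as `key: value` lines for a Flink config file."""
    return "\n".join(f"{key}: {value}" for key, value in conf.items())


if __name__ == "__main__":
    print(render(flink_ha_config("/opt/flink/volume")))
```

Pointing the same keys at an `s3://` URI instead would give the S3 variant, subject to the credential concern above.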

I'm looking into how to add a persistent volume to our k8s deployment, but I keep finding myself down Medium rabbit holes.
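As a rough sketch of what the Kubernetes side might look like, the snippet below builds a PersistentVolumeClaim manifest that requests a shared volume. Everything named here is an assumption for illustration: the claim name `flink-metadata`, the storage class `efs-sc`, and the helper `efs_pvc_manifest` are hypothetical, and the sketch presumes an EFS-backed storage class already exists in the cluster.

```python
import json


def efs_pvc_manifest(name: str, storage_class: str, namespace: str = "default") -> dict:
    """Build a PersistentVolumeClaim manifest for a shared (EFS-style) volume.

    name and storage_class are hypothetical; the storage class must already
    exist in the cluster and be backed by a shared file system.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteMany lets job-manager and task-manager pods on
            # different nodes mount the same volume simultaneously.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            # Kubernetes requires a size request even when the backing
            # file system grows elastically.
            "resources": {"requests": {"storage": "1Gi"}},
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(efs_pvc_manifest("flink-metadata", "efs-sc"), indent=2))
```

The printed JSON could be piped to `kubectl apply -f -`, after which the claim would be referenced from the Flink pod template as a volume.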

yuvipanda commented 7 months ago

@thodson-usgs https://github.com/pangeo-forge/pangeo-forge-cloud-federation/pull/6 sets up an EFS that can be consumed this way! Maybe you can work with @ranchodeluxe on that?

thodson-usgs commented 7 months ago

Sure, I'll start rebasing that branch onto main.

ranchodeluxe commented 6 months ago

@thodson-usgs: since that other PR merged, do we still need this one? Or maybe a version of it?

thodson-usgs commented 6 months ago

Sure. I think we can close this.