Open mike-gangl opened 8 months ago
The testing numbers we're looking at are as follows (from @hookhua):
SBG TIR
• forward processing
- ~6.2K PGE jobs per day
- ~44.6TB per day of L0 to L4 products
- ~148 parallel compute instances
• bulk processing
- ~18.7K PGE jobs per day
- ~134TB per day of L0 to L4 products
- ~444 parallel compute instances
• low-latency processing
- ~2.9K PGE jobs per day
- ~35.4TB per day of L0 to L4 products
- ~80 parallel compute instances

SBG VSWIR
• forward processing
- ~21.9K PGE jobs per day
- ~56.5TB per day of L0 to L2 products
- ~577 parallel compute instances
• bulk processing
- ~65.6K PGE jobs per day
- ~169.5TB per day of L0 to L4 products
- ~1,731 parallel compute instances
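To make these targets easier to compare, the sketch below derives two rough rates from the figures listed above: PGE jobs per compute instance per day, and the sustained data rate. It assumes 1TB = 1000GB and a load spread evenly over 24 hours, which real processing will not match exactly. The names `SCENARIOS` and `derived_rates` are illustrative only.

```python
# Figures copied from the list above: (PGE jobs/day, TB/day, parallel instances).
SCENARIOS = {
    ("TIR", "forward"): (6_200, 44.6, 148),
    ("TIR", "bulk"): (18_700, 134.0, 444),
    ("TIR", "low-latency"): (2_900, 35.4, 80),
    ("VSWIR", "forward"): (21_900, 56.5, 577),
    ("VSWIR", "bulk"): (65_600, 169.5, 1_731),
}

SECONDS_PER_DAY = 24 * 60 * 60


def derived_rates(jobs_per_day, tb_per_day, instances):
    """Return (jobs per instance per day, sustained GB/s).

    Assumes 1 TB = 1000 GB and a perfectly even load across the day.
    """
    jobs_per_instance = jobs_per_day / instances
    gb_per_second = tb_per_day * 1000 / SECONDS_PER_DAY
    return jobs_per_instance, gb_per_second


for (instrument, mode), figures in SCENARIOS.items():
    jobs_per_instance, gb_per_second = derived_rates(*figures)
    print(
        f"{instrument} {mode}: {jobs_per_instance:.1f} jobs/instance/day, "
        f"{gb_per_second:.2f} GB/s sustained"
    )
```

Because the averages hide bursts, these rates are best read as a lower bound on what the system has to sustain.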
For a single SBG workflow as built in our system (L0 -> L2A):
5 + 1 PGEs
Data:
If we include the 'optional' veg-biochem workflow as well, we'd add one more PGE.
So for any product, we could run 5 PGEs and move ~19GB through the system. This equates to:
- TIR: 1,240 workflows, 23,560GB of data
- VSWIR: 4,380 workflows, 83,220GB of data
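The workflow counts appear to follow from the daily forward-processing job counts (~6.2K for TIR, ~21.9K for VSWIR) divided by 5 PGEs per workflow, with ~19GB moved per workflow. That reading is an inference from the numbers rather than something stated outright. A small sketch of the arithmetic, with illustrative names:

```python
PGES_PER_WORKFLOW = 5   # from the text; the optional veg-biochem PGE is excluded
GB_PER_WORKFLOW = 19    # from the text: ~19GB moved per workflow

# Forward-processing PGE jobs per day, taken from the testing numbers.
FORWARD_JOBS_PER_DAY = {"TIR": 6_200, "VSWIR": 21_900}


def daily_workflow_load(jobs_per_day):
    """Return (workflows per day, GB moved per day) for a daily PGE job count."""
    workflows = jobs_per_day // PGES_PER_WORKFLOW
    return workflows, workflows * GB_PER_WORKFLOW


for instrument, jobs in FORWARD_JOBS_PER_DAY.items():
    workflows, gigabytes = daily_workflow_load(jobs)
    print(f"{instrument}: {workflows} workflows, {gigabytes} GB of data")
```

Including the optional veg-biochem PGE would change `PGES_PER_WORKFLOW` to 6, and the data volume per workflow would need to be re-measured.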
Airflow Scaling improvements
Two major use cases to achieve here:
Essentially, an idle system should have no workers running.
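The idle-means-zero-workers requirement can be stated as a small scaling rule. The sketch below is not Airflow's or any autoscaler's actual API. The function `desired_workers` and its parameters `tasks_per_worker` and `max_workers` are hypothetical, chosen only to illustrate the behaviour we want: zero workers when nothing is queued or running, and a capped, proportional count otherwise.

```python
import math


def desired_workers(active_tasks, tasks_per_worker=16, max_workers=100):
    """Return how many workers should be running for the current load.

    active_tasks: number of queued plus running tasks (hypothetical input).
    tasks_per_worker and max_workers are placeholder values, not tuned settings.
    """
    if active_tasks < 0:
        raise ValueError("active_tasks must be non-negative")
    if active_tasks == 0:
        # Idle system: scale all the way down to zero workers.
        return 0
    # Enough workers to cover the load, never more than the cap.
    return min(max_workers, math.ceil(active_tasks / tasks_per_worker))


print(desired_workers(0))       # idle system
print(desired_workers(17))      # just over one worker's capacity
print(desired_workers(10_000))  # capped at max_workers
```

Whatever mechanism ends up implementing this, the first branch is the part this epic cares most about: the minimum replica count has to be 0, not 1.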
Acceptance Criteria
Acceptance criteria required to implement the epic
Work Tickets
Link to work tickets required to implement the epic
Dependencies
Other epics or outside tickets required for this to work
Associated Risks
Links to risk issues associated with this epic