NCAR / DART

Data Assimilation Research Testbed
https://dart.ucar.edu/
Apache License 2.0

feat req: Scalable obs sequences #745

Open hkershaw-brown opened 1 month ago

hkershaw-brown commented 1 month ago

Currently every processor in DART reads the entire observation sequence into memory.

total memory = obs_seq_size * num_procs
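To make the formula above concrete, here is a minimal arithmetic sketch. The function name and the example sizes (a 10 GiB observation sequence on 1024 processes) are hypothetical, chosen only for illustration; they are not measurements from DART.

```python
def replicated_obs_seq_memory(obs_seq_bytes: int, num_procs: int) -> int:
    """Aggregate memory when every process holds a full copy of the
    observation sequence: obs_seq_size * num_procs."""
    return obs_seq_bytes * num_procs


GIB = 2**30
TIB = 2**40

# Hypothetical sizes, for illustration only.
obs_seq_bytes = 10 * GIB
num_procs = 1024

total = replicated_obs_seq_memory(obs_seq_bytes, num_procs)
print(f"replicated total: {total / TIB:.1f} TiB")
```

Because the total grows linearly with the process count, adding processes increases the aggregate footprint while the per-process footprint stays the same.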

[Figure from Kamil Yousuf: screenshot, 2024-09-30]

In addition, obs sequence reads and writes are done by a single processor, which anti-scales.

This is no longer sufficient.

Kamil Yousuf (Rhodes College, SiParCS) worked on reading obs sequences for multiple time windows: ~1/2 billion observations read and distributed. Kamil also has a parallel sort and is working on parallel writes. Kamil is assuming that the observation length is calculable (predictable), which is not currently guaranteed in general (but can be). Kamil's fork, which also has an obs seq test harness: https://github.com/tyiop794/DART
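The reason a calculable observation length matters can be sketched as follows: if every record has a known size, each process can compute its own byte range and read it independently, without scanning the file. This is only an illustration under that assumption. It is not Kamil's implementation and does not reflect the actual obs_seq file layout; `block_range`, `byte_range`, `header_bytes`, and `record_bytes` are hypothetical names.

```python
def block_range(num_obs: int, num_procs: int, rank: int) -> tuple[int, int]:
    """Return (first_obs_index, obs_count) for this rank using a
    contiguous block distribution. The first `num_obs % num_procs`
    ranks each take one extra observation."""
    base, extra = divmod(num_obs, num_procs)
    start = rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return start, count


def byte_range(num_obs: int, num_procs: int, rank: int,
               header_bytes: int, record_bytes: int) -> tuple[int, int]:
    """Return (file_offset, length_in_bytes) for this rank.

    Assumes a fixed-size header followed by fixed-size records,
    which is a hypothetical layout used only for this sketch."""
    start, count = block_range(num_obs, num_procs, rank)
    return header_bytes + start * record_bytes, count * record_bytes


# Example: 10 observations over 3 ranks, 100-byte header, 8-byte records.
for rank in range(3):
    print(rank, block_range(10, 3, rank), byte_range(10, 3, rank, 100, 8))
```

With variable-length records, the offsets could not be computed this way; some form of index or prefix sum over record lengths would be needed first.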

Folder in Specs for Obs_Seq_IO