Currently every processor in DART reads the entire observation sequence into memory.
total memory = obs_seq_size * num_procs
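As a rough illustration of the formula above, here is a minimal sketch of the arithmetic. The 100 bytes per observation and the 1024 processes are assumptions made up for the example, not DART measurements; the half-billion observation count is taken from the work described on this page.

```python
def replicated_memory_bytes(obs_seq_bytes: int, num_procs: int) -> int:
    """Total memory when every process holds the full observation sequence."""
    return obs_seq_bytes * num_procs


def per_proc_bytes_if_distributed(obs_seq_bytes: int, num_procs: int) -> int:
    """Per-process memory if the sequence were split evenly (ceiling division).

    Ignores any bookkeeping overhead; this is an idealised lower bound.
    """
    return -(-obs_seq_bytes // num_procs)


# Hypothetical inputs: bytes_per_obs and num_procs are assumptions.
num_obs = 500_000_000      # ~1/2 billion observations
bytes_per_obs = 100        # assumed average size of one stored observation
num_procs = 1024           # assumed core count

obs_seq_bytes = num_obs * bytes_per_obs
total = replicated_memory_bytes(obs_seq_bytes, num_procs)
share = per_proc_bytes_if_distributed(obs_seq_bytes, num_procs)

print(f"every process reads everything: {total / 1e12:.1f} TB in total")
print(f"evenly distributed: {share / 1e6:.1f} MB per process")
```

Under these assumed numbers the replicated layout costs the full sequence size on every process, while an even split keeps the total at one copy of the sequence.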
Fig from Kamil Yousuf:
In addition, obs sequence reads and writes are done by a single processor, which anti-scales: the IO time does not shrink as cores are added, and the other cores wait.
This is no longer sufficient, given:
increasing numbers of observations (e.g. satellite obs)
high-resolution DA (large core counts, wasting cycles with single-core IO)
particularly when obs sequences contain external forward operators (more per-core memory). Side note: the obs sequence is maybe not the place to read/write external FOs, but that is the current design.
AI models (may want these to be subroutine-callable and run many windows in one filter run)
Kamil Yousuf (Rhodes College, SiParCS) worked on reading obs sequences for multiple time windows: ~1/2 billion observations read and distributed. Kamil also has a parallel sort, and is working on parallel writes.
Kamil is assuming that the observation length is calculable (predictable), which is not currently guaranteed in general (but can be).
Kamil's fork: https://github.com/tyiop794/DART (also has an obs seq test harness)
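Why a calculable observation length matters for parallel reads: if every record has a known size, each process can compute its own byte range without scanning the file. The following is a minimal sketch of that idea only, not Kamil's implementation and not the DART file format. The function names, `record_bytes`, and `header_bytes` are hypothetical, and a fixed record length is assumed.

```python
def rank_record_range(num_obs: int, num_ranks: int, rank: int) -> tuple[int, int]:
    """Return (first_record, record_count) for one rank.

    Records are split into contiguous blocks; when num_obs is not a
    multiple of num_ranks, the lowest-numbered ranks get one extra record.
    """
    base, extra = divmod(num_obs, num_ranks)
    count = base + (1 if rank < extra else 0)
    first = rank * base + min(rank, extra)
    return first, count


def rank_byte_range(num_obs: int, num_ranks: int, rank: int,
                    record_bytes: int, header_bytes: int = 0) -> tuple[int, int]:
    """Return (byte_offset, byte_length) of this rank's block in the file.

    Only valid under the assumption that every record is exactly
    record_bytes long and follows a header of header_bytes.
    """
    first, count = rank_record_range(num_obs, num_ranks, rank)
    return header_bytes + first * record_bytes, count * record_bytes


# Hypothetical usage: 10 records of 100 bytes after a 16-byte header, 4 ranks.
for r in range(4):
    print(r, rank_record_range(10, 4, r), rank_byte_range(10, 4, r, 100, 16))
```

With offsets like these, each process could seek directly to its own block (for example with MPI-IO), which is the step that variable-length records would otherwise prevent.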
Folder in Specs for Obs_Seq_IO