Closed by Lut99 2 weeks ago
Commit ba37b51 should have fixed a significant part of this, but more work is needed. The next step would be to alias container spawning and dataset management, but we'd have to find an elegant way of doing that.
Closing this even though the fix is imperfect. We reached OK performance for the problematic use-case.
Even though Brane is a distributed platform and data transfers are expected to introduce some extra overhead, they are currently unreasonably slow. Even small datasets (like the weights in Saba's use-case) can take minutes per transfer.
This definitely has something to do with compression, which might not be parallelized and/or might simply be slow in the implementation we're using. Another quick fix might be to rework the VM a little to allow preprocessing to happen in parallel, or, better, to investigate why this is happening in the first place.
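To illustrate what "parallelized compression" could look like, here is a minimal sketch. It is not Brane's implementation: it uses Python's standard `zlib` as a stand-in for whatever compressor Brane actually uses, and the names `compress_chunks`, `decompress_chunks` and `CHUNK_SIZE` are hypothetical. The idea is just to split the dataset into independent chunks and compress them concurrently.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size (1 MiB); the right value would need measuring.
CHUNK_SIZE = 1 << 20


def compress_chunks(data: bytes, workers: int = 4) -> list[bytes]:
    """Split `data` into fixed-size chunks and compress them concurrently."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    # CPython's zlib releases the GIL while compressing, so threads can
    # work on separate chunks at the same time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so chunk order is preserved.
        return list(pool.map(zlib.compress, chunks))


def decompress_chunks(chunks: list[bytes]) -> bytes:
    """Reverse of compress_chunks: decompress each chunk and concatenate."""
    return b"".join(zlib.decompress(chunk) for chunk in chunks)
```

Compressing chunks independently usually costs a bit of compression ratio compared to one stream, so whether this helps would have to be checked against the actual transfer times.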