subatomicBMAN opened this issue 4 years ago
Interesting... I think it would be worth a shot to implement this.
If the resulting processing speed is faster than running it entirely in system RAM, then I think this is definitely worth it. This looks to be exactly what I was asking for in #121.
This looks like it would require an upgrade to TensorFlow 2.
As far as I can tell it certainly would need an upgrade. I'm sure it would take at least some time to do right, but the TensorFlow team has published tooling to ease migration from 1.14+ to 2.0.
See the following: https://www.tensorflow.org/guide/upgrade. Applying the auto-upgrade script might be enough to at least test the merit of this as an option before digging in and fully implementing it.
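For reference, the linked guide documents an automatic conversion script, `tf_upgrade_v2`, that ships with the TensorFlow 2 pip package. The paths below are placeholders for this repository's layout, not actual paths from the codebase:

```shell
# tf_upgrade_v2 is installed alongside the TensorFlow 2 pip package.
pip install tensorflow==2.0.0

# Convert a whole source tree in place-adjacent fashion and write a report
# of every change and every spot that needs manual attention.
tf_upgrade_v2 --intree ./src --outtree ./src_v2 --reportfile upgrade_report.txt
```

The report file is worth reading in full; the script flags call sites it could not convert automatically, which is where the custom-training concerns below would surface.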
It still might be worthwhile to build a standalone benchmark workload in 2.0 to test the capabilities and performance tax of UVM of course.
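A first read on that performance tax wouldn't even need TensorFlow: a raw CUDA micro-benchmark comparing a device-only allocation against a managed (UVM) one would show the migration overhead directly. A rough sketch, with a placeholder kernel and arbitrary sizes, and error checking elided for brevity:

```cuda
// Sketch: time one kernel launch over cudaMalloc buffers vs.
// cudaMallocManaged (UVM) buffers. The first managed run includes
// on-demand page migration, which is exactly the cost in question.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void saxpy(float a, const float *x, float *y, size_t n) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

static float timed_run(const float *x, float *y, size_t n) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    saxpy<<<(unsigned)((n + 255) / 256), 256>>>(2.0f, x, y, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const size_t n = 1 << 26;  // ~256 MB per float buffer
    float *dx, *dy, *mx, *my;

    // Device-resident baseline.
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    printf("device-resident: %.3f ms\n", timed_run(dx, dy, n));

    // Managed (UVM) buffers: pages migrate to the GPU on first touch.
    cudaMallocManaged(&mx, n * sizeof(float));
    cudaMallocManaged(&my, n * sizeof(float));
    printf("managed (UVM):   %.3f ms\n", timed_run(mx, my, n));

    cudaFree(dx); cudaFree(dy); cudaFree(mx); cudaFree(my);
    return 0;
}
```

Running the managed case a second time (after pages have migrated) would separate the one-time fault cost from steady-state throughput.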
I haven't had much of a chance to delve into the codebase at large yet, but if lower-level TensorFlow functionality has been used to write custom training loops then there could be considerably more work. The deeper differences and suggested updates are discussed here: https://www.tensorflow.org/guide/migrate
This would be amazing. Thanks for looking into this.
As per these two commits to tensorflow:
https://github.com/tensorflow/tensorflow/commit/cd4f5840
https://github.com/tensorflow/tensorflow/commit/b1139814
Support has now been extended for CUDA Unified Virtual Memory. If the intent behind this repository is to allow consumer-grade hardware to run this in a closer-to-deployed state, running on the GPU would be IDEAL. Unified Virtual Memory allows system RAM to be consumed alongside GPU VRAM in order to accommodate larger in-memory constructs. This of course comes at a cost (system RAM is typically MUCH slower than VRAM); however, with a small benchmark that cost could be weighed against the benefits of running with CUDA parallelization.
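The mechanism itself is easy to demonstrate at the CUDA level: a managed allocation can exceed the card's total VRAM, which a plain device allocation cannot. A minimal sketch (oversubscription requires a Pascal-or-newer GPU on Linux; error handling elided):

```cuda
// Sketch: deliberately request more memory than the GPU has. With
// cudaMallocManaged, the excess can spill into system RAM and pages
// migrate on demand; plain cudaMalloc of the same size would fail
// with cudaErrorMemoryAllocation.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t free_b = 0, total_b = 0;
    cudaMemGetInfo(&free_b, &total_b);

    // Oversubscribe: ask for 1.5x the card's total VRAM.
    size_t bytes = total_b + total_b / 2;
    float *buf = nullptr;
    cudaError_t err = cudaMallocManaged(&buf, bytes);
    printf("requested %zu bytes: %s\n", bytes, cudaGetErrorString(err));
    if (err == cudaSuccess) cudaFree(buf);
    return 0;
}
```

Whether TensorFlow's allocator actually routes through managed memory is what the two linked commits change; this only shows the underlying capability.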
I am of the opinion that this would allow a much larger share of the target demographic to run this repository at more reasonable processing speeds.