IO with parallel lmdbs (so they can be prepared in parallel); a sketch of parallel preparation is shown below
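For illustration, here is a minimal sketch of writing several lmdb shards in parallel with Python's multiprocessing; the `write_shard` helper, the integer-index key scheme, and the file names are hypothetical, not the repo's actual API:

```python
import pickle
from multiprocessing import Pool

import lmdb

def write_shard(args):
    # Hypothetical helper: write one shard's samples into its own lmdb file.
    shard_path, samples = args
    env = lmdb.open(shard_path, map_size=1 << 40)  # generous map size (1 TB)
    with env.begin(write=True) as txn:
        for i, sample in enumerate(samples):
            txn.put(str(i).encode(), pickle.dumps(sample))  # key = sample index
    env.close()

def prepare_parallel_lmdbs(all_samples, num_shards=8):
    # Split the data into num_shards stripes and write each lmdb in its own process.
    chunks = [all_samples[i::num_shards] for i in range(num_shards)]
    jobs = [(f"data_shard_{i}.lmdb", chunk) for i, chunk in enumerate(chunks)]
    with Pool(num_shards) as pool:
        pool.map(write_shard, jobs)

if __name__ == "__main__":
    prepare_parallel_lmdbs(list(range(100)))  # toy example: 100 dummy samples
```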
Different caching strategies when training with lmdb (a dataset sketch follows the list below):
  no-caching: load directly from disk; very slow and inefficient, but supports an unlimited amount of training data (not recommended)
  full-caching: load the lmdbs into memory first; very fast IO and training, though the amount of training data is limited by CPU RAM (recommended when possible; ~100x faster than no-caching)
  partial-caching: partially load the lmdb into memory during training; close to full-caching speed while still supporting an unlimited amount of training data (~2-3x slower than full-caching, but ~30-50x faster than no-caching)
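A rough sketch of how the three strategies could look in a PyTorch-style dataset; the class name, the `cache` argument, and the integer-index key scheme are illustrative assumptions, not the repo's real interface:

```python
import pickle

import lmdb
from torch.utils.data import Dataset

class LMDBDataset(Dataset):
    # Illustrative dataset supporting the three caching strategies above.
    def __init__(self, path, cache="none", partial_cache_size=10000):
        self.env = lmdb.open(path, readonly=True, lock=False)
        with self.env.begin() as txn:
            self.length = txn.stat()["entries"]
        self.cache_mode = cache
        if cache == "full":
            # full-caching: read every record into RAM up front
            with self.env.begin() as txn:
                self.data = [pickle.loads(txn.get(str(i).encode()))
                             for i in range(self.length)]
        elif cache == "partial":
            # partial-caching: bounded in-memory cache, falls back to disk
            self.partial = {}
            self.partial_cache_size = partial_cache_size

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        if self.cache_mode == "full":
            return self.data[idx]
        if self.cache_mode == "partial" and idx in self.partial:
            return self.partial[idx]
        with self.env.begin() as txn:  # no-caching path: hit the disk every time
            item = pickle.loads(txn.get(str(idx).encode()))
        if self.cache_mode == "partial" and len(self.partial) < self.partial_cache_size:
            self.partial[idx] = item  # keep the item for future epochs
        return item
```

In a setup like this, partial-caching pays the disk cost only the first time an item is seen and serves later reads from RAM, which is consistent with the "close to full-caching speed" behavior described above.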
Relevant example files for preparing lmdb files and for training with them
Note: I have commented out all the PCA dim-reduction code (running PCA after the fps are computed, to reduce dimensionality) for now, but kept it in place in case we want it in the future.
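For reference, the commented-out step amounts to something like the sketch below, using scikit-learn's PCA; treating `fps` as an (n_samples, n_features) feature matrix and choosing 64 components are assumptions for illustration only:

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder stand-in for the real fps matrix, shape (n_samples, n_features).
fps = np.random.rand(1000, 2048)

pca = PCA(n_components=64)            # 64 is an arbitrary illustrative choice
fps_reduced = pca.fit_transform(fps)  # -> shape (1000, 64)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```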
Some major updates that we have: