henrysky / astroNN

Deep Learning for Astronomers with Keras
http://astronn.readthedocs.io/
MIT License

Current .h5 dataset loading mechanism is problematic #3

Open henrysky opened 6 years ago

henrysky commented 6 years ago

Currently, this is viewed as a low-priority, performance-related issue. It probably won't be fixed in the near future.

System information

Describe the problem

The current .h5 dataset loading mechanism is problematic because astroNN loads the whole dataset into memory regardless of its size. This will eventually become a serious problem when a dataset is too large for the available memory. It is already a minor problem when loading the APOGEE training data (~12GB) on my 16GB RAM laptop and desktop.

Source code / logs

Irrelevant

Suggestion

The neural network / data generator should talk to H5Loader directly, instead of H5Loader loading the whole dataset into memory and then handing it to the neural network / data generator.
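
A rough sketch of what "talking to the file directly" could look like, assuming the file is read with h5py and that slicing an h5py dataset reads only the requested rows from disk. This is not astroNN's actual H5Loader API: the function names `iter_batches` and `iter_h5_batches`, and the dataset names `spectra` and `labels`, are hypothetical and only serve the illustration.

```python
def iter_batches(features, labels, batch_size):
    """Yield (x, y) batches by slicing, so only one batch is held at a time.

    `features` and `labels` can be any objects that support len() and
    slicing, for example h5py datasets or plain lists.
    """
    n = len(features)
    if len(labels) != n:
        raise ValueError("features and labels must have the same length")
    for start in range(0, n, batch_size):
        stop = min(start + batch_size, n)
        # With an h5py dataset, this slice reads only rows [start, stop)
        # from disk instead of the whole array.
        yield features[start:stop], labels[start:stop]


def iter_h5_batches(path, x_name="spectra", y_name="labels", batch_size=64):
    """Hypothetical wrapper: stream batches straight from an .h5 file.

    The dataset names are assumptions, not names astroNN is known to use.
    """
    import h5py  # third-party; imported here so iter_batches has no dependencies

    # The file stays open only while the generator is being consumed.
    with h5py.File(path, "r") as f:
        yield from iter_batches(f[x_name], f[y_name], batch_size)
```

With something along these lines, peak memory use would scale with `batch_size` rather than with the size of the dataset, at the cost of a disk read per batch. Shuffling would need extra care, since reading rows in random order from an .h5 file is usually slower than reading contiguous slices.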