andrewtarzia closed 2 years ago
Hi @andrewtarzia
For the simple convolutional neural network defined in cnn_config.yaml in the recognition module, training on the RGB representation takes about 50 minutes on average on a MacBook Pro 16 (2019), using a dataset of 37,000 files with 500 data points (scaled down to 50x50 RGB plots). This does not include RGB dataset generation, which takes about 52 minutes on average for 40,000 files. Note that the TensorFlow version used for this package at the time of submission did not support GPUs on Mac, so training and evaluation relied on CPU cores.
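As a rough guide for other dataset sizes, the timings above can be scaled per file. This is only a sketch: it assumes both steps scale linearly with the number of files on comparable CPU hardware, which was not measured here, and the function name `estimate_minutes` and the example dataset size are hypothetical.

```python
# Figures reported above (MacBook Pro 16 (2019), CPU only).
TRAIN_MINUTES = 50.0   # average CNN training time on the RGB representation
TRAIN_FILES = 37_000
GEN_MINUTES = 52.0     # average RGB dataset generation time
GEN_FILES = 40_000


def estimate_minutes(n_files: int) -> dict:
    """Scale the reported timings to n_files.

    Assumption (not from the measurements): time grows linearly with
    the number of files.
    """
    generation = GEN_MINUTES * n_files / GEN_FILES
    training = TRAIN_MINUTES * n_files / TRAIN_FILES
    return {
        "generation": generation,
        "training": training,
        "total": generation + training,
    }


if __name__ == "__main__":
    # 10,000 files is a hypothetical dataset size, used only for illustration.
    for step, minutes in estimate_minutes(10_000).items():
        print(f"{step}: {minutes:.1f} min")
```

Treat the result as an order-of-magnitude estimate; a different CPU, network configuration, or number of epochs will change it.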
For hyperparameter tuning, we used the NVIDIA Tesla P100 GPU available at the HYAK supercomputing facility at the University of Washington (UW). The average time for training and validation with the RGB representation was 109.5 minutes per transformation listed in scattering_tform_config.yaml in the arbitrage module. The tuner configuration used for this search is available as tuner_config.yaml in the recognition module. In comparison, the average time for the cartesian coordinate plot was 130.5 minutes for the same configuration.
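To put the per-transformation averages in context, the short calculation below totals them over the 16 transformations mentioned later in this reply. It is a sketch under two assumptions that were not stated above: the transformations are tuned one after another, and the same 16 transformations apply to both representations. The helper name `total_hours` is hypothetical.

```python
N_TRANSFORMATIONS = 16        # number of transformations in the search
RGB_MIN_PER_TFORM = 109.5     # reported average, RGB representation
CART_MIN_PER_TFORM = 130.5    # reported average, cartesian coordinate plot


def total_hours(minutes_per_tform: float, n: int = N_TRANSFORMATIONS) -> float:
    """Total wall time in hours, assuming transformations run sequentially."""
    return minutes_per_tform * n / 60.0


if __name__ == "__main__":
    rgb = total_hours(RGB_MIN_PER_TFORM)
    cart = total_hours(CART_MIN_PER_TFORM)
    print(f"RGB total: {rgb:.1f} h")
    print(f"Cartesian total: {cart:.1f} h")
    print(f"Cartesian / RGB ratio: {CART_MIN_PER_TFORM / RGB_MIN_PER_TFORM:.2f}")
```

Under these assumptions the full search comes to roughly 29 hours for RGB and 35 hours for cartesian, so the cartesian representation is about 19% slower per transformation.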
Note that Bayesian search was unavailable in Keras Tuner at the time of submission; it would likely speed up the process. Memory was another factor that affected the hyperparameter search. The maximum memory available on a single node on HYAK was 120 GB. For RGB, the 16 transformations were split into 2 sets to avoid out-of-memory errors. For the cartesian representation, however, the 16 transformations had to be divided into 3 sets at the same scale factor. This suggests that the RGB representation is memory efficient as well as time efficient.
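Splitting the transformations into sets that each fit in node memory can be sketched as follows. This only illustrates the idea: how the sets were actually composed is not stated above, and the function `split_into_sets` and the placeholder names `tform_0` to `tform_15` are hypothetical.

```python
def split_into_sets(items: list, n_sets: int) -> list:
    """Split items into n_sets contiguous groups of near-equal size."""
    if n_sets < 1:
        raise ValueError("n_sets must be at least 1")
    base, extra = divmod(len(items), n_sets)
    sets, start = [], 0
    for i in range(n_sets):
        # The first `extra` groups take one additional item each.
        end = start + base + (1 if i < extra else 0)
        sets.append(items[start:end])
        start = end
    return sets


if __name__ == "__main__":
    # Placeholder names; the real entries live in scattering_tform_config.yaml.
    transformations = [f"tform_{i}" for i in range(16)]
    rgb_sets = split_into_sets(transformations, 2)        # RGB: 2 sets
    cartesian_sets = split_into_sets(transformations, 3)  # cartesian: 3 sets
    print([len(s) for s in rgb_sets])
    print([len(s) for s in cartesian_sets])
```

Each set would then be tuned in a separate job, so that only one set's data needs to be held in memory at a time.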
I hope this information addresses your queries.
Thank you for this!
Your Use-Case
Hello, I am reviewing this package for JOSS and can also imagine a use-case in our lab (at Imperial College): classifying different spectra in high-throughput automated synthesis.
This discussion, or issue, is just to get a guideline on the compute time and requirements for an experimental lab. Specifically, how long did the training take for the set in the JOSS paper?