Closed tacazares closed 2 years ago
@FaizRizvi worked on speeding up the training ordered enqueuer and enabling multiprocessing. We were able to speed up training times, as reported in pull request #106. We also made general updates to the training parser and ROIgenerators, and added documentation for the training functions.
We have had some long training times with maxATAC v1. I think we are not utilizing multiprocessing as effectively as we could. We switched our training approach after several issues with older versions of TensorFlow. Related issues: #28, #47
Currently, training times are approaching 1 hour per epoch, where historically they have been around 20 minutes or less. This was using 16 cores and 64 GB of memory.
I think that we need to remove the `OrderedEnqueuer`, increase the number of workers, and integrate the data generator into the `SeqDataGenerator` object.
I tested a version that implements the above and achieved ~13 minutes per epoch. This method still needs to be validated.
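To illustrate the idea (this is not the maxATAC code), here is a minimal standard-library sketch of an index-addressable batch generator. Because every batch can be built from its index alone, several workers can prepare batches at once without a separate enqueuer object. `ToySeqDataGenerator`, `load_epoch`, and the fake region tuples are hypothetical names made up for this example, and threads are used instead of processes only to keep the sketch self-contained. In TensorFlow 2.x Keras, a class with the same `__len__`/`__getitem__` shape would subclass `tf.keras.utils.Sequence` and be passed to `model.fit` with the `workers` and `use_multiprocessing` arguments.

```python
import random
from concurrent.futures import ThreadPoolExecutor


class ToySeqDataGenerator:
    """Hypothetical Sequence-style generator; NOT the maxATAC SeqDataGenerator."""

    def __init__(self, regions, batch_size, seed=0):
        self.regions = list(regions)
        self.batch_size = batch_size
        self.seed = seed

    def __len__(self):
        # Number of batches per epoch (ceiling division).
        return (len(self.regions) + self.batch_size - 1) // self.batch_size

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        start = idx * self.batch_size
        batch = self.regions[start:start + self.batch_size]
        # Per-batch RNG: the result depends only on idx, not on worker timing.
        rng = random.Random(self.seed + idx)
        inputs = [[rng.random() for _ in range(4)] for _ in batch]  # fake features
        targets = [end - begin for (_chrom, begin, end) in batch]   # fake labels
        return inputs, targets


def load_epoch(generator, workers=4):
    """Build every batch of one epoch using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in index order, so no ordered enqueuer is needed.
        return list(pool.map(generator.__getitem__, range(len(generator))))


if __name__ == "__main__":
    regions = [("chr1", i * 100, i * 100 + 50) for i in range(10)]
    gen = ToySeqDataGenerator(regions, batch_size=3)
    batches = load_epoch(gen, workers=4)
    print(len(batches), "batches;", "last batch size:", len(batches[-1][0]))
```

Whether this pattern reproduces the ~13 minutes per epoch reported above depends on the real data loading cost and on using processes rather than threads, so it should be treated as a starting point for validation only.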