PENGLU-WashU / IMC_Denoise

IMC_Denoise: a software package to enhance Imaging Mass Cytometry - Nature Communications

Edge case in training batch generation #32

Open jiwen90 opened 5 months ago

jiwen90 commented 5 months ago

Currently, the number of steps per epoch is defined by https://github.com/PENGLU-WashU/IMC_Denoise/blob/bdf6ae07568b752c15d2fbe28e71bc3bff268cba/IMC_Denoise/IMC_Denoise_main/DeepSNiF.py#L217

However, there is an edge case when the number of training samples is exactly divisible by 128 (the batch size): because of the floor-then-add-one computation, `steps_per_epoch` overshoots by one, and training stops with `WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches.`

I wonder whether it is necessary to include the last, partial batch of ≤128 samples at all: if a very small final batch could adversely affect the loss for that step, it would be appropriate to simply use floor and drop it. Otherwise, ceiling division achieves the intended "include the partial batch" logic without the edge case.
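To make the off-by-one concrete, here is a minimal sketch of the two step-count formulas. The `steps_floor_plus_one` helper paraphrases the logic described above (the exact expression in `DeepSNiF.py` may differ); `steps_ceil` is the proposed ceiling-division fix. The sample counts below are illustrative, not taken from the package.

```python
import math

def steps_floor_plus_one(n_samples, batch_size=128):
    # Current logic as described in the issue: floor division
    # plus one extra step for the partial batch.
    return n_samples // batch_size + 1

def steps_ceil(n_samples, batch_size=128):
    # Proposed fix: ceiling division includes the partial batch
    # without over-counting when n_samples is an exact multiple
    # of batch_size.
    return math.ceil(n_samples / batch_size)

# 1000 samples: both formulas give 8 steps (7 full + 1 partial batch).
# 1024 samples (divisible by 128): floor+1 requests 9 batches, but the
# generator can only yield 8, triggering the "ran out of data" warning;
# ceiling correctly gives 8.
```

With `n_samples` not divisible by the batch size the two formulas agree, so switching to ceiling only changes behavior in the problematic case.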

PENGLU-WashU commented 4 months ago

Thanks for this feedback. We will modify this later, or if you are interested, you are welcome to open a pull request to correct it.