hnguyentt closed 4 years ago
Thanks for reading my code so carefully. The sampling stride differed across data types in our experiments in order to keep the number of samples in each class within the same order of magnitude. As you can see in the paper, we had very little influenza data but many COVID-19 and CAP cases. The second reason is a practical consideration: using all slices in training took too much time, and it affected the results only slightly, since the training slices we sampled were already enough for the network to fit, or almost overfit.
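The per-class stride idea described above can be sketched roughly as follows. This is only an illustration, not the code in `data/get_train_jpgs.py`: the function names, the class labels, and the stride values in `CLASS_STRIDES` are all hypothetical and chosen just to show how a smaller stride for a rare class and a larger stride for common classes pulls the slice counts closer together.

```python
from typing import Dict, List, Sequence


def sample_slices(slice_paths: Sequence[str], stride: int) -> List[str]:
    """Keep every `stride`-th slice of one scan, starting from the first."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return list(slice_paths[::stride])


# Hypothetical strides: a small stride for the rare class, larger ones
# for the common classes. These values are NOT taken from the repository.
CLASS_STRIDES: Dict[str, int] = {"influenza": 2, "covid19": 10, "cap": 10}


def sample_dataset(
    scans_by_class: Dict[str, List[List[str]]],
    strides: Dict[str, int],
) -> Dict[str, List[str]]:
    """Apply the class-specific stride to every scan of every class.

    `scans_by_class` maps a class label to a list of scans, where each
    scan is an ordered list of slice file paths.
    """
    sampled: Dict[str, List[str]] = {}
    for label, scans in scans_by_class.items():
        stride = strides[label]
        sampled[label] = [
            path for scan in scans for path in sample_slices(scan, stride)
        ]
    return sampled


if __name__ == "__main__":
    # Toy data: one short influenza scan and one long COVID-19 scan.
    toy = {
        "influenza": [[f"flu_{i}.jpg" for i in range(20)]],
        "covid19": [[f"cov_{i}.jpg" for i in range(100)]],
    }
    counts = {k: len(v) for k, v in sample_dataset(toy, CLASS_STRIDES).items()}
    print(counts)
```

With the toy data, the 20-slice scan and the 100-slice scan contribute the same number of training slices, which is the balancing effect the reply describes.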
Do you think that applying stride = 10 might skip slices in the scans that are important for diagnosis?
Sorry for not replying immediately! Different types of data were sampled with different strides, and so were data from different centers. As a result, I can hardly remember why I used 10, but the question is really worth considering. Since we collected so much data, the influence is not critical, and the whole training process was similar to multi-instance learning with weak supervision.
Thank you for your answer.
In the file `data/get_train_jpgs.py`, why did you use only one of every 10 slices in the CT scans for training rather than all slices?