paulhager / MMCL-Tabular-Imaging


Questions on data preprocessing and batch size #4

Closed 201younghanlee closed 1 year ago

201younghanlee commented 1 year ago

Thanks for uploading the Data folder. I have a few more questions, and I would be very grateful if you could help me reproduce the paper.

1) What was the purpose of the "check_or_save" function? (Could you release the code for this function?)
2) With nclasses = batch size, should there only be a single sample from a given class in a batch?
3) If so, how could the batch size be larger than the number of classes? (e.g., DVM has 286 classes and was trained with batch size 512)
4) What is the purpose of "indices" in the clip loss?

Thanks for your help!

paulhager commented 1 year ago
  1. It just double-checks that the data is unchanged before saving, to make sure the data splits aren't accidentally modified during development.
  2. `nclasses` is only used for calculating the accuracy of the projection matching task.
  3. It has nothing to do with the downstream task, so a batch size larger than the number of downstream classes shouldn't cause any issues.
  4. The indices were used for alternative loss functions and aren't important.
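
For anyone reading along, here is a rough, dependency-free sketch of points 1 to 3. It is not the repository's code. `check_or_save` below is a hypothetical stand-in that assumes pickle files and plain Python objects, and `projection_matching_accuracy` is an illustrative name. The real training code presumably works on PyTorch tensors instead of lists. The point of the second function is that each sample's "label" in the matching task is its own position in the batch, which is why the number of classes equals the batch size regardless of how many downstream classes the dataset has.

```python
import math
import pickle
import tempfile
from pathlib import Path


def check_or_save(obj, path):
    """Hypothetical stand-in for the repo's helper (assumes pickle storage).

    Saves `obj` on the first call; on later calls, verifies that the file
    on disk still holds identical data and raises if it does not.
    """
    path = Path(path)
    if path.exists():
        with path.open("rb") as f:
            saved = pickle.load(f)
        if saved != obj:
            raise ValueError(f"{path} differs from the data being saved")
    else:
        with path.open("wb") as f:
            pickle.dump(obj, f)


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def projection_matching_accuracy(image_emb, tabular_emb):
    """Top-1 accuracy of matching each image to its own tabular row.

    The label of sample i is simply i, so the number of "classes" equals
    the batch size and is unrelated to the downstream class count.
    """
    image_emb = [normalize(v) for v in image_emb]
    tabular_emb = [normalize(v) for v in tabular_emb]
    correct = 0
    for i, img in enumerate(image_emb):
        # Cosine similarity between image i and every tabular embedding.
        sims = [sum(a * b for a, b in zip(img, tab)) for tab in tabular_emb]
        predicted = max(range(len(sims)), key=sims.__getitem__)
        correct += predicted == i
    return correct / len(image_emb)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        split_path = Path(tmp) / "train_ids.pkl"
        check_or_save([1, 2, 3], split_path)  # first call writes the file
        check_or_save([1, 2, 3], split_path)  # same data: passes silently

    # A toy batch of 3 paired embeddings; two samples could share a
    # downstream class and the matching labels would still be 0, 1, 2.
    images = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    tables = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.7]]
    print(projection_matching_accuracy(images, tables))
```

With this framing, a batch of 512 on DVM just means 512 matching targets per step, and several samples in the batch can belong to the same one of the 286 downstream classes.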