MSKCC-Computational-Pathology / MIL-nature-medicine-2019


Problem about dataset dimensions #12

Closed franciszchen closed 4 years ago

franciszchen commented 4 years ago

In RNN_train.py, assuming s is 10, the output of __getitem__() in rnndata should be [10, 3, 224, 224] for each WSI. In the dataloader, with a batch size of 128, the inputs in each loop of train_single() should then be [128, 10, 3, 224, 224].

However, according to the code, batch_size is 10 (it should be 128) and len(inputs) is 128 (it should be 10). These two values appear to be swapped. I just want to verify whether this is the case.

franciszchen commented 4 years ago

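I see it now: this is because of the DataLoader's collate_fn mechanism. When __getitem__() returns a Python list of tensors rather than a single stacked tensor, the default collate_fn batches each list position separately, so inputs is a list of 10 tensors of shape [128, 3, 224, 224] instead of one [128, 10, 3, 224, 224] tensor. With that layout, batch_size is 128 and len(inputs) is 10, as expected. The reference can be found at https://stackoverflow.com/questions/52818145/why-pytorch-dataloader-behaves-differently-on-numpy-array-and-list.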
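
For anyone who runs into the same confusion, here is a minimal sketch of the collate behaviour. It is not the repository's code: ListDataset and StackedDataset are hypothetical stand-ins, the tiles are zeros, and the tile size is shrunk to 3x8x8 (instead of 3x224x224) to keep the demo light. It only assumes the dataset returns a Python list of s tile tensors per slide.

```python
import torch
from torch.utils.data import Dataset, DataLoader

S, BATCH = 10, 128   # tiles per slide, slides per batch
C, H, W = 3, 8, 8    # tiny tiles (assumption) instead of 3x224x224


class ListDataset(Dataset):
    """Hypothetical stand-in: each item is a Python list of S tile tensors."""

    def __len__(self):
        return BATCH

    def __getitem__(self, idx):
        tiles = [torch.zeros(C, H, W) for _ in range(S)]
        return tiles, idx % 2  # (list of tiles, dummy target)


class StackedDataset(ListDataset):
    """Same data, but the tiles are stacked into one [S, C, H, W] tensor."""

    def __getitem__(self, idx):
        tiles, target = super().__getitem__(idx)
        return torch.stack(tiles), target


# The default collate_fn is used in both cases.
list_inputs, _ = next(iter(DataLoader(ListDataset(), batch_size=BATCH)))
stacked_inputs, _ = next(iter(DataLoader(StackedDataset(), batch_size=BATCH)))

# List items: a list of S tensors, each [BATCH, C, H, W].
list_shape = (len(list_inputs), tuple(list_inputs[0].shape))
# Stacked items: a single tensor [BATCH, S, C, H, W].
stacked_shape = tuple(stacked_inputs.shape)

print(list_shape)
print(stacked_shape)
```

With the list layout, inputs[0].size(0) gives the batch size and len(inputs) gives s, so a loop over range(len(inputs)) walks through the tile positions rather than the slides.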