ToniChopp / ECAMP

The official implementation of "ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training"
MIT License

Some questions about the data processing #1

Closed: Eldo-rado closed this issue 9 months ago

Eldo-rado commented 9 months ago

👋 Hi! Thank you for your contribution, it is really great work. I have two questions regarding data processing:

  1. RSNA Pneumonia: In the paper, it is mentioned, "The official data split is followed, with the training/validation/test set consisting of 25,184/1,500/3,000 images, respectively." I checked the RSNA Pneumonia dataset's official website and found that only 25,184+1,500 images have ground truth, and the remaining 3,000 images do not. Where can I find the ground truth for the test set?

  2. SIIM-ACR Pneumothorax: When fine-tuning, is the approach the same as with MRM? Do you still include around 5,000 images without pneumothorax lesions for training, or is the final dataset for training reduced to only around 7,000 images (similar to MGCA), excluding those without lesions?

ToniChopp commented 9 months ago

Hi, thanks for your attention to our work.

  1. RSNA Pneumonia: We follow MRM to obtain the ground-truth labels for the test set. The corresponding labels can be accessed here.

  2. SIIM-ACR Pneumothorax: We do not follow MRM's approach, which uses mmsegmentation. Our implementation of the segmentation fine-tuning framework will be released soon. In detail, we keep only positive samples for segmentation (similar to MGCA).
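For readers who want to try the "positive samples only" filtering before the official code is released, here is a minimal sketch. It is not the ECAMP or MGCA implementation. It assumes annotations are available as records with an `EncodedPixels` field, and that a value of `-1` marks an image without a pneumothorax mask; the function name `keep_positive` and the sample records are hypothetical.

```python
def keep_positive(records, negative_marker="-1"):
    """Return only the records that carry a real mask annotation.

    Assumption: a record whose EncodedPixels field equals the negative
    marker (after stripping whitespace) has no pneumothorax lesion.
    """
    return [
        rec for rec in records
        if rec["EncodedPixels"].strip() != negative_marker
    ]


# Hypothetical records, for illustration only.
sample_records = [
    {"ImageId": "img_a", "EncodedPixels": " -1"},
    {"ImageId": "img_b", "EncodedPixels": "100 5 200 7"},
    {"ImageId": "img_c", "EncodedPixels": "-1"},
    {"ImageId": "img_d", "EncodedPixels": "42 3"},
]

positive_records = keep_positive(sample_records)
positive_ids = [rec["ImageId"] for rec in positive_records]
```

With the hypothetical records above, only `img_b` and `img_d` remain in `positive_ids`.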

I hope the above info is helpful!

Eldo-rado commented 9 months ago

Thank you for your prompt response! I can obtain the classification labels for the RSNA test set here, but where should I get the corresponding segmentation labels/masks?

ToniChopp commented 9 months ago

No official segmentation labels have been released for the RSNA test set. Therefore, we follow the MGCA split and randomly split the original training set into 16,010/5,337/5,337 images for training/validation/testing.
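A random split with these sizes could be sketched as follows. This is only an illustration, not the MGCA or ECAMP split code: the seed, the function name `split_indices`, and the use of integer indices in place of image IDs are all assumptions, so it will not reproduce the exact split used in the paper. The total of 26,684 is simply the sum of the three counts quoted above.

```python
import random


def split_indices(n_total, n_train, n_val, n_test, seed=0):
    """Shuffle range(n_total) with a fixed seed and cut it into three parts."""
    if n_train + n_val + n_test != n_total:
        raise ValueError("split sizes must add up to n_total")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# 16,010 + 5,337 + 5,337 = 26,684 images in total.
train_ids, val_ids, test_ids = split_indices(26684, 16010, 5337, 5337)
```

Fixing the seed keeps the three subsets identical across runs, which matters when comparing fine-tuning results between methods.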

Eldo-rado commented 9 months ago

Thank you, I have a general understanding.

ToniChopp commented 9 months ago

Thank you for your understanding!

Eldo-rado commented 9 months ago

Got it, thank u ;-)

Eldo-rado commented 9 months ago

Hi, sorry to bother you again. I would like to confirm whether the DQN module was frozen when reproducing KAD in Table 2. Specifically, was only the last linear layer trainable, or was only the backbone (ResNet-50) frozen while the other parts remained trainable?

ToniChopp commented 9 months ago

We conduct linear classification with only the last linear layer trainable. We load only the vision backbone parameters of the officially released KAD model, then freeze the backbone.
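The linear-probing setup described here can be sketched in PyTorch roughly as below. This is a hedged illustration, not the authors' code: the class name `LinearProbe` and the tiny stand-in backbone are hypothetical, and loading the KAD checkpoint is omitted because its key layout is not given in this thread. In practice the stand-in would be replaced by a ResNet-50 initialised from the released backbone weights.

```python
import torch
from torch import nn


class LinearProbe(nn.Module):
    """Frozen feature extractor followed by one trainable linear layer."""

    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone
        for param in self.backbone.parameters():
            param.requires_grad = False  # exclude backbone from training
        self.head = nn.Linear(feat_dim, num_classes)

    def train(self, mode=True):
        super().train(mode)
        self.backbone.eval()  # keep BatchNorm statistics fixed too
        return self

    def forward(self, x):
        with torch.no_grad():  # no gradients through the frozen backbone
            feats = self.backbone(x)
        return self.head(feats)


# Hypothetical stand-in backbone producing 8-dimensional features.
toy_backbone = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3),
    nn.BatchNorm2d(8),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

model = LinearProbe(toy_backbone, feat_dim=8, num_classes=2)
model.train()

# Only the head's parameters are handed to the optimizer.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
optimizer = torch.optim.SGD(model.head.parameters(), lr=0.1)
logits = model(torch.zeros(2, 1, 16, 16))
```

Overriding `train()` is a design choice in this sketch: without it, calling `model.train()` would switch the backbone's BatchNorm layers back to updating their running statistics even though their weights are frozen.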