Closed devamsheth21 closed 4 months ago
Hi @devamsheth21, thanks for using our repo. You can generate templated text from any arbitrary image+label dataset, so yes, you can use the RSNA dataset for this, with the following caveats:
With these points, here are the resources to follow to generate reports from labels and include them in the pretraining:
Image-label dataset
in our readme. Let me know if you have any further queries.
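For readers following along, here is a minimal sketch of what generating templated text from an image+label CSV could look like. It is not the repo's preprocessing: the column names (`image_id`, `laterality`, `view`, `density`, `cancer`), the sample rows, and the sentence templates are all hypothetical and chosen only for illustration, so check them against your actual RSNA csv and the templates described in the readme.

```python
import csv
import io

# Hypothetical sample rows; real column names and values may differ.
SAMPLE_CSV = """image_id,laterality,view,density,cancer
101,L,CC,B,0
102,R,MLO,,1
"""

SIDE = {"L": "left", "R": "right"}


def row_to_report(row):
    """Turn one label row into a short templated report (illustrative templates)."""
    side = SIDE.get(row["laterality"], "unspecified")
    sentences = [f"{row['view']} view of the {side} breast."]
    # Skip attributes that are missing for this image instead of inventing a value.
    if row["density"]:
        sentences.append(f"Breast density category {row['density']}.")
    if row["cancer"] == "1":
        sentences.append("Finding labeled as malignant.")
    else:
        sentences.append("No malignancy labeled.")
    return " ".join(sentences)


reports = {
    row["image_id"]: row_to_report(row)
    for row in csv.DictReader(io.StringIO(SAMPLE_CSV))
}

for image_id, text in reports.items():
    print(image_id, "->", text)
```

The design choice worth noting is that a missing attribute produces no sentence at all, so the generated text never claims something the labels do not support.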
Hi @shantanu-ai ,
I have a question regarding the classification task and Table 1 from the paper:
In the last row of this table, Mammo-CLIP is pre-trained on the UPMC and VinDr datasets, and both finetuning and linear probing are done on the VinDr dataset. From the dataset section, I understand that VinDr has two splits, train and test. Which split is used for pretraining? Is the same split used for finetuning and linear probing? And on which split are the reported AUC and accuracy computed?
Also, please clarify whether I understood your method correctly: for linear probing and finetuning, you add a linear layer on top of the pre-trained vision encoder and train this architecture with cross-entropy loss on the labels. You then evaluate on a different split of the data or a held-out test set, right?
Hi @devamsheth21, for pretraining with UPMC+VinDr, we use the official VinDr training set. We also split off 10% of that training set for validation. The original test set was completely held out during pretraining.
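As a rough illustration of the 90/10 split described above, a seeded split could look like the sketch below. The identifiers, the seed, and the choice to split by patient ID are assumptions made for this example; the reply does not say at which level (image, study, or patient) the actual split was made.

```python
import random


def split_train_val(ids, val_fraction=0.1, seed=0):
    """Split unique IDs into (train, val) lists, reproducibly for a given seed."""
    unique_ids = sorted(set(ids))  # sort first so the result does not depend on input order
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_val = max(1, round(len(unique_ids) * val_fraction))
    return unique_ids[n_val:], unique_ids[:n_val]


# Hypothetical IDs standing in for the official training set.
patient_ids = [f"patient_{i:03d}" for i in range(50)]
train_ids, val_ids = split_train_val(patient_ids)
print(len(train_ids), len(val_ids))
```

The official test set is never passed to this function, which mirrors the point that it stays fully held out.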
For the downstream tasks (both linear probing and finetuning), we use the same VinDr training set that was used in pretraining. The numbers in the table are computed on the official VinDr test set.
For both linear probing and finetuning, we attach a linear layer on top of the vision encoder. For linear probing, the backbone vision encoder is frozen. For finetuning, the vision encoder is trained as well. Yes, we use cross-entropy loss for training. All evaluation is done on a held-out test set.
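The difference between the two setups can be sketched in PyTorch as follows. This is a minimal sketch under stated assumptions, not the repo's training code: the tiny stand-in encoder, the feature dimension, the batch, and the helper name `build_classifier` are all hypothetical.

```python
import torch
from torch import nn


def build_classifier(encoder, feat_dim, num_classes, linear_probe):
    """Attach a linear head; freeze the encoder when linear probing."""
    if linear_probe:
        for param in encoder.parameters():
            param.requires_grad = False  # backbone stays fixed
    head = nn.Linear(feat_dim, num_classes)
    return nn.Sequential(encoder, head)


# Stand-in for the pre-trained vision encoder (hypothetical, for illustration only).
encoder = nn.Sequential(nn.Flatten(), nn.Linear(16, 8), nn.ReLU())
model = build_classifier(encoder, feat_dim=8, num_classes=2, linear_probe=True)

# Optimize only the parameters that still require gradients.
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)
criterion = nn.CrossEntropyLoss()

# One training step on a random batch of four 1x4x4 "images".
images = torch.randn(4, 1, 4, 4)
labels = torch.tensor([0, 1, 0, 1])
loss = criterion(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Passing `linear_probe=False` leaves every encoder parameter trainable, which corresponds to the finetuning setup. With a real backbone that contains normalization or dropout layers, you may also want to keep the frozen encoder in `eval()` mode during linear probing.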
Hi, I want to pre-train the same model on the RSNA dataset. However, since RSNA doesn't have text reports, can we generate templated text reports from the RSNA dataset attributes using the preprocessing you used for the VinDr dataset? If so, what modifications would you recommend to the RSNA csv file?
Thank you