Shantanu Ghosh1
, Clare B. Poynton2
, Shyam Visweswaran3,
Kayhan Batmanghelich1
1BU ECE, 2 BUMC, 3 Pitt DBMI
After going through the instruction, it is recommended to visit this link for any further clarification on pretraining. If we hear more queries, we may add a separate FAQs in the future.
Use environment.yml to setup the environment.
conda env create --name Mammo-CLIP -f environment.yml
conda activate Mammo-CLIP
Mammo-CLIP is implemented with following specification:
Download the original versions VinDr and RSNA from the links for downstream evaluations:
For the PNG images converted from the original Dicom images, as mentioned in the preprocessing steps in the paper, refer to the following links:
To preprocess the dicom images directly, follow the instructions in the next section. If you downloaded the PNG images, skip the preprocessing steps.
python ./src/preprocessing/preprocess_image_to_png_kaggle.py \
--phase="test" \
--base_folder="/ocean/projects/asc170022p/shg121/PhD/RSNA_Breast_Imaging/Dataset/RSNA_Cancer_Detection"
python ./src/preprocessing/preprocess_image_to_png_vindr.py \
--phase="test" \
--base_folder="/ocean/projects/asc170022p/shg121/PhD/RSNA_Breast_Imaging/Dataset/External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0"
Our image-text dataset is an in-house dataset from UPMC. The sample csv: upmc_dicom_consolidated_final_folds_BIRADS_num_1_report.csv
Note the HISTORY
, FINDINGS
, and IMPRESSION
columns in the csv file. The FINDINGS
and IMPRESSION
columns are
used to generate the text for the image. The HISTORY
, FINDINGS
and IMPRESSION
columns contains templated text
due to privacy.
Next run the following command to augment the text with upmc_dicom_consolidated_final_folds_BIRADS_num_1_report.csv
file:
# input: upmc_dicom_consolidated_final_folds_BIRADS_num_1_report.csv
# output: clip_pretrain_100.csv
python ./src/codebase/augment_text.py \
--dataset-path="/ocean/projects/asc170022p/shg121/PhD/Mammo-CLIP/src/codebase/data_csv" \
--csv-path="upmc_dicom_consolidated_final_folds_BIRADS_num_1_report.csv" \
--dataset="upmc"
index | patient_id | laterality | image | view | CC | MLO | text | text_augment |
---|---|---|---|---|---|---|---|---|
0 | patient_id | laterality ('R' or 'L') | List of all image_paths for patient_id-laterality combo | List of views for patient_id-laterality combo (only 'CC' and 'MLO' are used) | List of image paths for CC view for patient_id-laterality combo | List of image paths for MLO view for patient_id-laterality combo | List of [findings, impression] | List of [augmented findings, augmented impression] |
step3
is
here: clip_pretrain_100.csvWe use VinDr dataset as image-label dataset. So if you are planning to use it in the pre-training setup, use the following notebook to preprocess the VinDr dataset:
.src/codebase/notebooks/preprocess-clip/VinDr.ipynb
When you download the VinDr dataset, you will get these two csv
files: breast-level_annotations.csv
and finding_annotations.csv
. We preprocess the finding_annotations.csv
file to
get vindr_detection_v1_folds.csv
. VinDr.ipynb
notebook requires vindr_detection_v1_folds.csv file as input and
generate clip_vindr_final.csv
file.
Use the following csv files as metadata for the downstream tasks (classification, detection, zero-shot):
Dataset | CSV |
---|---|
VinDr | vindr_detection_v1_folds.csv |
RSNA | train_folds.csv |
For detection/localization tasks, we have included the coordinates of the resized bounding boxes of VinDr in the above csv file. Somebody interested in resizing the bounding boxes by themselves, refer to this code.
Following are the pre-training checkpoints of Mammo-CLIP: | Model architecture | Checkpoints (Google drive) | Checkpoints (Hugging Face) |
---|---|---|---|
Best performance | Efficient-Net B5 | Efficient-Net B5 | |
Lightweight | Efficient-Net B2 | Efficient-Net B2 |
We have also uploaded the downstream checkpoints for classification and localization (both linear probe and finetuning) with the image encoder of Efficient-Net B5 Mammo-CLIP for fold 0 here.
Look for /ocean/projects/asc170022p/shg121/PhD
and replace it with your own path.
python ./Mammo-CLIP/src/codebase/train.py --config-name pre_train_b5_clip.yaml
BreastClipClassifier
in breast-clip-classifier.py
file.RetinaNet_efficientnet
in detector_mode.py
file.FOLD=0
CKPT="b2-model-best-epoch-10.tar"
DIR="./Mammo-CLIP/src/codebase/outputs/upmc_clip/b2_detector_period_n"
FULL_CKPT="$DIR/checkpoints/fold_$FOLD/$CKPT"
python ./src/codebase/eval_zero_shot_clip.py \
--config-name zs_clip.yaml hydra.run.dir=$DIR model.clip_check_point=$FULL_CKPT
Adjust the CKPT
and DIR
variables according to your setup.
python ./src/codebase/train_classifier.py \
--data-dir '/ocean/projects/asc170022p/shg121/PhD/RSNA_Breast_Imaging/Dataset' \
--img-dir 'External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0/images_png' \
--csv-file 'External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0/vindr_detection_v1_folds.csv' \
--clip_chk_pt_path "/ocean/projects/asc170022p/shg121/PhD/Mammo-CLIP/src/codebase/outputs/upmc_clip/b5_detector_period_n/checkpoints/fold_0/b5-model-best-epoch-7.tar" \
--data_frac 1.0 \
--dataset 'ViNDr' \
--arch 'upmc_breast_clip_det_b5_period_n_lp' \
--label "Mass" \
--epochs 30 \
--batch-size 8 \
--num-workers 0 \
--print-freq 10000 \
--log-freq 500 \
--running-interactive 'n' \
--n_folds 1 \
--lr 5.0e-5 \
--weighted-BCE 'y' \
--balanced-dataloader 'n'
data-dir
: root directory of the datasetimg-dir
: directory containing images, absolute path: data-dir/img-dir
csv-file
: csv file containing image paths and labels, absolute path: data-dir/csv-file
clip_chk_pt_path
: path to the checkpoint of the pre-trained Mammo-CLIP modeldataset
: dataset name, e.g., ViNDr
or RSNA
data_frac
: fraction of the dataset to use for training, e.g., 1.0
, 0.5
etcarch
: architecture of the model, e.g., upmc_breast_clip_det_b5_period_n_lp
for B5
or upmc_breast_clip_det_b2_period_n_lp
for B2label
: target label for classification, e.g., Mass
, Suspicious_Calcification
or density
for ViNDr
dataset; cancer
for RSNA datasetrunning-interactive
: running on interactive mode. In this mode,the training will be done using 100 samples for
sanity checkpython ./src/codebase/train_classifier.py \
--data-dir '/ocean/projects/asc170022p/shg121/PhD/RSNA_Breast_Imaging/Dataset' \
--img-dir 'External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0/images_png' \
--csv-file 'External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0/vindr_detection_v1_folds.csv' \
--clip_chk_pt_path "/ocean/projects/asc170022p/shg121/PhD/Mammo-CLIP/src/codebase/outputs/upmc_clip/b5_detector_period_n/checkpoints/fold_0/b5-model-best-epoch-7.tar" \
--data_frac 1.0 \
--dataset 'ViNDr' \
--arch 'upmc_breast_clip_det_b5_period_n_ft' \
--label "Mass" \
--epochs 30 \
--batch-size 8 \
--num-workers 0 \
--print-freq 10000 \
--log-freq 500 \
--running-interactive 'n' \
--n_folds 1 \
--lr 5.0e-5 \
--weighted-BCE 'y' \
--balanced-dataloader 'n'
data-dir
: root directory of the datasetimg-dir
: directory containing images, absolute path: data-dir/img-dir
csv-file
: csv file containing image paths and labels, absolute path: data-dir/csv-file
clip_chk_pt_path
: path to the checkpoint of the pre-trained Mammo-CLIP modeldataset
: dataset name, e.g., ViNDr
or RSNA
data_frac
: fraction of the dataset to use for training, e.g., 1.0
, 0.5
etcarch
: architecture of the model, e.g., upmc_breast_clip_det_b5_period_n_ft
for B5
or upmc_breast_clip_det_b2_period_n_ft
for B2label
: target label for classification, e.g., Mass
, Suspicious_Calcification
or density
for ViNDr
dataset; cancer
for RSNA datasetrunning-interactive
: running on interactive mode. In this mode,the training will be done using 100 samples for
sanity checkpython ./src/codebase/train_detector.py \
--data-dir '/ocean/projects/asc170022p/shg121/PhD/RSNA_Breast_Imaging/Dataset' \
--img-dir 'External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0/images_png' \
--csv-file 'External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0/vindr_detection_v1_folds.csv' \
--clip_chk_pt_path "/ocean/projects/asc170022p/shg121/PhD/Mammo-CLIP/src/codebase/outputs/upmc_clip/b5_detector_period_n/checkpoints/fold_0/b5-model-best-epoch-7.tar" \
--dataset 'ViNDr' \
--arch 'clip_b5_upmc' \
--epochs 120 \
--batch-size 7 \
--freeze_backbone "y" \
--data_frac 1.0 \
--concepts 'Mass' \
--print-freq 5000 \
--log-freq 300 \
--running-interactive 'n' \
--focal-alpha 0.25 \
--focal-gamma 2.0 \
--score-threshold 0.2
data-dir
: root directory of the datasetimg-dir
: directory containing images, absolute path: data-dir/img-dir
csv-file
: csv file containing image paths and labels, absolute path: data-dir/csv-file
clip_chk_pt_path
: path to the checkpoint of the pre-trained Mammo-CLIP modeldataset
: dataset name, e.g., ViNDr
data_frac
: fraction of the dataset to use for training, e.g., 1.0
, 0.5
etcarch
: architecture of the model, e.g., clip_b5_upmc
for B5 or clip_b2_upmc
for B2concepts
: target label for classification, e.g., Mass
, Suspicious Calcification
for ViNDr datasetrunning-interactive
: running on interactive mode. In this mode,the training will be done using 100 samples for
sanity checkfreeze_backbone
: freeze the backbone of the model, for linear probe, set to y
python ./src/codebase/train_detector.py \
--data-dir '/ocean/projects/asc170022p/shg121/PhD/RSNA_Breast_Imaging/Dataset' \
--img-dir 'External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0/images_png' \
--csv-file 'External/Vindr/vindr-mammo-a-large-scale-benchmark-dataset-for-computer-aided-detection-and-diagnosis-in-full-field-digital-mammography-1.0.0/vindr_detection_v1_folds.csv' \
--clip_chk_pt_path "/ocean/projects/asc170022p/shg121/PhD/Mammo-CLIP/src/codebase/outputs/upmc_clip/b5_detector_period_n/checkpoints/fold_0/b5-model-best-epoch-7.tar" \
--dataset 'ViNDr' \
--arch 'clip_b5_upmc' \
--epochs 120 \
--batch-size 7 \
--freeze_backbone "n" \
--data_frac 1.0 \
--concepts 'Mass' \
--print-freq 5000 \
--log-freq 300 \
--running-interactive 'n' \
--focal-alpha 0.25 \
--focal-gamma 2.0 \
--score-threshold 0.2
data-dir
: root directory of the datasetimg-dir
: directory containing images, absolute path: data-dir/img-dir
csv-file
: csv file containing image paths and labels, absolute path: data-dir/csv-file
clip_chk_pt_path
: path to the checkpoint of the pre-trained Mammo-CLIP modeldataset
: dataset name, e.g., ViNDr
data_frac
: fraction of the dataset to use for training, e.g., 1.0
, 0.5
etcarch
: architecture of the model, e.g., clip_b5_upmc
for B5 or clip_b2_upmc
for B2concepts
: target label for classification, e.g., Mass
, Suspicious Calcification
for ViNDr datasetrunning-interactive
: running on interactive mode. In this mode,the training will be done using 100 samples for
sanity checkfreeze_backbone
: freeze the backbone of the model, for finetune, set to n
For all the training scripts, we add them in the scripts directory:
Scripts | Purpose |
---|---|
pretrain_mammo_clip_b5.sh | Pretrain Mammo-CLIP b5 |
pretrain_mammo_clip_b2.sh | Pretrain Mammo-CLIP b2 |
classifier_fine_tune_b5.sh | Evaluate Mammo-CLIP b5 on fine tuning tasks for classification |
classifier_fine_tune_b2.sh | Evaluate Mammo-CLIP b2 on fine tuning tasks for classification |
classifier_linear_probe_b5.sh | Evaluate Mammo-CLIP b5 on linear probing tasks for classification |
classifier_linear_probe_b2.sh | Evaluate Mammo-CLIP b2 on linear probing tasks for classification |
detector_fine_tune_b5.sh | Evaluate Mammo-CLIP b5 on fine tuning tasks for detection |
detector_fine_tune_b2.sh | Evaluate Mammo-CLIP b2 on fine tuning tasks for detection |
detector_linear_probe_b5.sh | Evaluate Mammo-CLIP b5 on linear probing tasks for detection |
detector_linear_probe_b2.sh | Evaluate Mammo-CLIP b2 on linear probing tasks for detection |
@article{ghosh2024mammo,
title={Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography},
author={Ghosh, Shantanu and Poynton, Clare B and Visweswaran, Shyam and Batmanghelich, Kayhan},
journal={arXiv preprint arXiv:2405.12255},
year={2024}
}
Licensed under the Creative Commons Attribution 4.0 International
Copyright © Batman Lab, 2024
For any queries, contact: shawn24@bu.edu