xiaoman-zhang / KAD

MIT License

ValueError: Expected input batch_size (656) to match target batch_size (640). #21

Closed geoexploring closed 9 months ago

geoexploring commented 9 months ago

@xiaoman-zhang, @chaoyi-wu, thank you for your wonderful work.

I encountered an issue while running the file A3_CLIP/main.py.

The KAD/A3_CLIP/configs/Res_train.yaml is as follows:

train_entity_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/MIMIC-CXR/data_file/radgraph_umls.json'
train_entity_graph_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/MIMIC-CXR/data_file/radgraph_entity_graphs_new.json'
train_query_file:  '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/MIMIC-CXR/data_file/radgraph_metric.csv'
train_fg_query_file:  '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/MIMIC-CXR/data_file/fg_radgraph_metric.csv'
chestxray_train_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/ChestXray14/official_train.csv'
chestxray_valid_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/ChestXray14/official_valid.csv'
chestxray_test_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/ChestXray14/official_test.csv'
chexpert_train_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/CheXpert/train.csv'
chexpert_valid_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/CheXpert/valid.csv'
chexpert_test_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/CheXpert/test.csv'

padchest_all_test_file: '/home/user/Desktop/KAD/KAD_DATA/A1_DATA/PadChest/Physician_label193_all.csv'
mrsty_file: '/home/user/Desktop/KAD/KAD/A1_DATA/UMLS/MRSTY.csv'
img_res: 512
batch_size: 16
test_batch_size: 16
num_classes: 2
temp: 0.07
mlm_probability: 0.15
queue_size: 8192
momentum: 0.995
alpha: 0.4
optimizer: {opt: adamW, lr: 5e-5, weight_decay: 0.02}
schedular: {sched: cosine, lr: 5e-5, epochs: 100, min_lr: 1e-6, decay_rate: 1, warmup_lr: 1e-6, warmup_epochs: 20, cooldown_epochs: 0}

The error is as follows:

Traceback (most recent call last):
  File "/home/user/Desktop/KAD/KAD/A3_CLIP/main.py", line 310, in <module>
    main(args, config)
  File "/home/user/Desktop/KAD/KAD/A3_CLIP/main.py", line 166, in main
    train_stats = train(model, image_encoder, text_encoder, tokenizer, train_dataloader, optimizer, epoch, warmup_steps, device, lr_scheduler, args,config,writer) 
  File "/home/user/Desktop/KAD/KAD/A3_CLIP/engine/train_fg.py", line 116, in train
    loss_ce_image = ce_loss(pred_class_image.view(-1,2),label.view(-1)) 
  File "/home/user/Desktop/KAD/KADEnv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/Desktop/KAD/KADEnv/lib/python3.9/site-packages/torch/nn/modules/loss.py", line 1174, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/home/user/Desktop/KAD/KADEnv/lib/python3.9/site-packages/torch/nn/functional.py", line 3026, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
ValueError: Expected input batch_size (656) to match target batch_size (640).

The failing line is loss_ce_image = ce_loss(pred_class_image.view(-1,2),label.view(-1)). I printed the shapes of pred_class_image and label, both after and before the view calls:

Shape of pred_class_image: torch.Size([656, 2])
Shape of label: torch.Size([640])
Shape of pred_class_image before view: torch.Size([16, 41, 2])
Shape of label before view: torch.Size([16, 40])

Before the view, pred_class_image has shape torch.Size([16, 41, 2]) and label has shape torch.Size([16, 40]). Could you please help me understand what the numbers 40 and 41 represent in this context, and how I might resolve the mismatch error?
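For reference, the two batch sizes in the error message follow directly from the shapes printed above: flattening [16, 41, 2] with view(-1, 2) gives 16 × 41 = 656 rows, while flattening [16, 40] with view(-1) gives 16 × 40 = 640 elements. The short sketch below only reproduces that arithmetic in plain Python; flattened_rows is a hypothetical helper, not part of KAD or PyTorch. The 41 matches the length of text_list reported later in this thread; reading 40 as the number of label columns per sample is an assumption.

```python
from math import prod


def flattened_rows(shape, trailing_dims=0):
    """Rows left after a view(-1, ...) that keeps the last `trailing_dims` dims.

    Hypothetical helper for illustration only.
    """
    keep = len(shape) - trailing_dims
    return prod(shape[:keep])


# Shapes copied from the printed output above.
pred_shape = (16, 41, 2)   # (batch, queries, classes)
label_shape = (16, 40)     # (batch, labels); "labels" is an assumed meaning

pred_rows = flattened_rows(pred_shape, trailing_dims=1)   # view(-1, 2)
label_rows = flattened_rows(label_shape)                  # view(-1)

print(pred_rows, label_rows)
# The batch dimension (16) agrees, so the gap comes from 41 vs 40.
print(pred_rows - label_rows)
```

The difference of 16 is exactly one extra entry per sample in the batch, which points at the second dimension (41 vs 40) rather than at batch_size.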

Thank you for your assistance.

xiaoman-zhang commented 9 months ago

Thank you for the feedback. Could you confirm whether you followed the steps outlined in ./A1_DATA/MIMIC-CXR/data_preprocess/run_preprocess.sh to generate the file fg_radgraph_metric.csv? I have updated the processed file on both Google Drive and Baidu Cloud; you can check whether it is the same as the one you used.

geoexploring commented 9 months ago

@xiaoman-zhang, thank you very much for your response.

I apologize for any confusion my initial description may have caused. I have updated the question for clarity.

I generated the fg_radgraph_metric.csv file following the process outlined in ./A1_DATA/MIMIC-CXR/data_preprocess/run_preprocess.sh, but used only a small subset of the original fg_radgraph_metric.csv, as detailed in the uploaded file fg_radgraph_metric_test.csv.

Fortunately, I have resolved the issue by removing one element, such as 'pleural effusion', from text_list in the code, reducing its length from 41 to 40, after which the program ran normally. My goal was to understand how KAD works in practice rather than to reproduce the results of the paper, so I did not investigate further.
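A check like the following, run before training, would surface this kind of mismatch earlier than the loss call does. It is only a sketch in plain Python: check_query_label_alignment is a hypothetical helper, and the placeholder entries stand in for the real text_list and for one row of labels, whose actual contents are not shown in this thread.

```python
def check_query_label_alignment(text_list, label_row):
    """Raise early if the number of query texts differs from the number of labels.

    Hypothetical helper; not part of the KAD code base.
    """
    n_queries = len(text_list)
    n_labels = len(label_row)
    if n_queries != n_labels:
        raise ValueError(
            f"text_list has {n_queries} entries but each sample has "
            f"{n_labels} labels"
        )
    return n_queries


# Placeholder data with the sizes reported in this issue (41 queries, 40 labels).
text_list = [f"entity_{i}" for i in range(41)]
label_row = [0] * 40

try:
    check_query_label_alignment(text_list, label_row)
    message = "aligned"
except ValueError as err:
    message = str(err)

print(message)
```

Note that making the two counts equal only removes the shape error; it does not by itself guarantee that each remaining query text lines up with the intended label column.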

Please feel free to contact me if there are any questions.

Thanks.