MathieuNlp / Sam_LoRA

Segment Your Ring (SYR) - Segment Anything model adapted with LoRA to segment rings.
MIT License

Two questions I would like to ask #5

Open luobendewugong opened 2 months ago

luobendewugong commented 2 months ago

Thank you very much for sharing your work. I have two questions I would like to ask you.

1. If I wanted to incorporate a validation set into your code, how should the code be modified?

2. When I change BATCH_SIZE to 2 or 3, I encounter the following error. How should I modify the code to avoid it?

```
Traceback (most recent call last):
  File "c:/Users/PC/Documents/Code/Sam_LoRA-main/train.py", line 59, in <module>
    stk_gt, stk_out = utils.stacking_batch(batch, outputs)
  File "c:\Users\PC\Documents\Code\Sam_LoRA-main\src\utils.py", line 101, in stacking_batch
    stk_gt = torch.stack([b["ground_truth_mask"] for b in batch], dim=0)
RuntimeError: stack expects each tensor to be equal size, but got [300, 450] at entry 0 and [500, 600] at entry 1
```

Thank you very much!

MathieuNlp commented 2 months ago

Hi, happy that you are interested.

  1. One approach is to split the training dataset into a train set and a validation set. You could then create a function that evaluates on the validation set, like the one in ./inference_eval.py (lines 35 to 51), with whatever score you want. Finally, print/plot the train loss and validation loss and save the best checkpoint; see the first sketch below.

  2. You will need to add padding (zero padding will work). I used a batch size of 1, so stacking the tensors wasn't a problem; with a larger batch size, however, stacking requires the height and width dimensions to be equal. The labels in the dataset are not all the same size, so you will need to either crop the original images or add a transformation in the dataloading step that crops/pads all the images to a common size so they can simply be stacked. You can add that transformation to the DatasetSegmentation class in src/dataloader.py; see the second sketch below.
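
A minimal sketch of the split and validation loop for point 1, assuming the DatasetSegmentation class from src/dataloader.py and the utils.stacking_batch helper from the traceback above; the constructor arguments, forward call, loss function, and checkpoint path are placeholders you would adapt to train.py:

```python
import torch
from torch.utils.data import DataLoader, random_split

# Split the existing dataset 80/20.
full_dataset = DatasetSegmentation(...)  # built exactly as in train.py
n_val = int(0.2 * len(full_dataset))
train_set, val_set = random_split(
    full_dataset,
    [len(full_dataset) - n_val, n_val],
    generator=torch.Generator().manual_seed(42),  # reproducible split
)
# Reuse the same DataLoader arguments (batch_size, collate_fn, ...) as train.py.
train_loader = DataLoader(train_set, batch_size=1, shuffle=True)
val_loader = DataLoader(val_set, batch_size=1, shuffle=False)

@torch.no_grad()
def validate(model, loader, loss_fn, device):
    """Average loss over the validation set, mirroring one training step."""
    model.eval()
    total = 0.0
    for batch in loader:
        outputs = model(batch)  # placeholder: use the same forward call as train.py
        stk_gt, stk_out = utils.stacking_batch(batch, outputs)
        total += loss_fn(stk_out, stk_gt.to(device)).item()
    model.train()
    return total / len(loader)

# In the epoch loop, keep the checkpoint with the lowest validation loss:
# val_loss = validate(model, val_loader, seg_loss, device)
# if val_loss < best_val:
#     best_val = val_loss
#     torch.save(model.state_dict(), "best_checkpoint.pth")  # placeholder path
```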
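
And a minimal sketch of the zero-padding transform for point 2, assuming each dataset item is a dict with a "ground_truth_mask" tensor (as in the traceback); the target size and key names are assumptions to adapt:

```python
import torch
import torch.nn.functional as F

TARGET_H, TARGET_W = 512, 640  # must be >= the largest label size in the dataset

def pad_to(t: torch.Tensor, target_h: int, target_w: int) -> torch.Tensor:
    """Zero-pad the last two (H, W) dimensions up to (target_h, target_w)."""
    h, w = t.shape[-2], t.shape[-1]
    # F.pad's tuple is (left, right, top, bottom) for the last two dims
    return F.pad(t, (0, target_w - w, 0, target_h - h), mode="constant", value=0)

# In DatasetSegmentation.__getitem__ (src/dataloader.py), before returning:
# item["ground_truth_mask"] = pad_to(item["ground_truth_mask"], TARGET_H, TARGET_W)
# Pad (or crop) the image the same way so image and mask stay aligned.
```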

Hope it helps.

luobendewugong commented 1 month ago

Thank you very much for your reply; I will try it. Additionally, I would like to ask: have you tried SAM's sam_vit_l_0b3195 or sam_vit_h_4b8939 checkpoints? I tried to train with these two checkpoints and got a shape error. Is there a part of the code that needs to be modified?

MathieuNlp commented 1 month ago

Hello,

Could you show me the shape error you got? I have answered a question regarding the loading of other ViT sizes here: https://github.com/MathieuNlp/Sam_LoRA/issues/7
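
In case it is the same problem: a common cause of shape errors with the larger checkpoints is building the model for one ViT size while loading weights for another, so the state_dict shapes don't line up. A minimal sketch, assuming the standard segment_anything loading API (the filenames are the official checkpoint names):

```python
from segment_anything import sam_model_registry

# The registry key must match the checkpoint file:
#   "vit_b" -> sam_vit_b_01ec64.pth
#   "vit_l" -> sam_vit_l_0b3195.pth
#   "vit_h" -> sam_vit_h_4b8939.pth
sam = sam_model_registry["vit_l"](checkpoint="sam_vit_l_0b3195.pth")
```

If the model type is hard-coded to vit_b anywhere in the config or build code, it would need to be changed to match as well.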