JJGO / UniverSeg

UniverSeg: Universal Medical Image Segmentation
Apache License 2.0

a question about training #24

Open sharonlee12 opened 3 months ago

sharonlee12 commented 3 months ago

Hello! I encountered an issue while reproducing the training code from your paper: I trained on multiple tasks, but in the end the model only seems to learn the segmentation of task A; the segmentation results on the other tasks are also A's segmentation. This may be because the model can only learn A, but the difference in data volume between tasks is not large, and I also applied data augmentation. My question is: how do you implement multi-task learning in the code? Since the UniverSeg architecture mainly targets single tasks with single labels, I would like to know how you handle this.

VictorButoi commented 3 months ago

For our training procedure, we randomly sample tasks (from our large collection of possible task tuples) during training (for simplicity, let's say one task at a time), so that we don't perfectly fit any one task.
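As a rough sketch of that idea (the task tuples and helper below are hypothetical illustrations, not the actual UniverSeg training code), per-iteration task sampling might look like:

```python
import random

# Hypothetical (dataset, label) task tuples; the real UniverSeg collection
# spans dozens of datasets, each contributing one tuple per label.
TASKS = [
    ("LITS", "liver"),
    ("LITS", "liver_tumor"),
    ("MSD_Pancreas", "pancreas"),
    ("MSD_Pancreas", "pancreas_tumor"),
]

def sample_task(rng=random):
    """Pick one (dataset, label) task for the current training iteration."""
    return rng.choice(TASKS)

# Each iteration builds its support set and target from this one task only,
# so the network never settles into any single task.
dataset, label = sample_task()
```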

I'm curious what you mean by only being able to achieve the segmentation of task A in the end?

sharonlee12 commented 3 months ago

My datasets include LITS and MSD Pancreas. During training, the loss decreased normally, while metrics such as Dice, recall, and precision increased. However, during testing I found that for Pancreas, the segmentation results were actually the liver from LITS, like this: [image]

while the ground truth is: [image] I saved the support labels during testing, and they look normal, so I suspect the issue is in my training code.

In addition, I also discovered one thing:

Since UniverSeg mainly targets single-label tasks: LITS has two labels, liver and liver tumor, while MSD Pancreas also has two, pancreas and pancreas tumor. I found that the inference results for the pancreas task of MSD Pancreas are close to the liver, and those for the pancreas tumor task are close to the liver tumor.


VictorButoi commented 3 months ago

> During testing, I found that for Pancreas, the segmentation results obtained during testing were actually the liver part of LITS, just like this.

I think this points to your trained UniverSeg network overfitting to the training tasks. For UniverSeg, the way we avoid overfitting to individual labels or particular datasets is to train on a large number of datasets (about 47 at our scale); just a few may well be insufficient.

tubixiansheng commented 3 months ago

> Hello! I encountered an issue while reproducing the training code from your paper: I trained on multiple tasks, but in the end the model only seems to learn the segmentation of task A; the segmentation results on the other tasks are also A's segmentation. My question is: how do you implement multi-task learning in the code? Since the UniverSeg architecture mainly targets single tasks with single labels, I would like to know how you handle this.

Hello, may I have a look at your training code? I am very much looking forward to your reply.

sharonlee12 commented 3 months ago

> During testing, I found that for Pancreas, the segmentation results obtained during testing were actually the liver part of LITS, just like this.
>
> I think that this points to the fact that your trained UniverSeg network is overfitting to the training tasks. For UniverSeg, the way that we avoid overfitting on individual labels or to particular datasets is to train on a large number of datasets, in our scale about 47, but just a few might be insufficient.

Hello, I am sorry to bother you, but I want to know how you pairwise concatenate the images and the labels along the channel dimension? I suspect my code for this is not right. I am looking forward to your reply!

VictorButoi commented 3 months ago

Here is how we do it: (available in the universeg/model.py file)

```python
def forward(self, target_image, support_images, support_labels):
    target = E.rearrange(target_image, "B 1 H W -> B 1 1 H W")
    support = torch.cat([support_images, support_labels], dim=2)

    pass_through = []
```
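To illustrate the shapes involved, here is a small sketch with made-up sizes (the `target_image[:, None]` indexing mirrors the `E.rearrange` call without requiring einops; the dimensions are hypothetical):

```python
import torch

B, S, H, W = 2, 8, 128, 128  # batch, support-set size, spatial dims (hypothetical)
target_image = torch.randn(B, 1, H, W)
support_images = torch.randn(B, S, 1, H, W)
support_labels = torch.randn(B, S, 1, H, W)

# Add a support-set axis to the target: (B, 1, H, W) -> (B, 1, 1, H, W)
target = target_image[:, None]

# Pairwise concatenate each support image with its label along the
# channel axis (dim=2): two (B, S, 1, H, W) tensors -> (B, S, 2, H, W)
support = torch.cat([support_images, support_labels], dim=2)
```

Each support entry thus becomes a 2-channel map (image + label), which is what lets the network condition on the support set.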

sharonlee12 commented 3 months ago

> Here is how we do it: (available in the universeg/model.py file)
>
> ```python
> def forward(self, target_image, support_images, support_labels):
>     target = E.rearrange(target_image, "B 1 H W -> B 1 1 H W")
>     support = torch.cat([support_images, support_labels], dim=2)
>
>     pass_through = []
> ```

Thank you for your reply! I found it in model.py! But I have another question: how do you perform loss.backward() and optimizer.step()? For example, for a multi-label task, say a dataset with background 0, label 1, and label 2: do you compute the loss for each label and call loss.backward() and optimizer.step() per label, or do you sum the losses for label 1 and label 2 and then call loss.backward() and optimizer.step() once? I adopted the latter and found that, because label 1 is relatively easy to learn and label 2 is hard, the model seems unable to learn label 2.

adalca commented 3 months ago

At each iteration we sample one of the labels (for both the support set and the target segmentation) and pretend it's a single-label task (for that iteration).
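A minimal sketch of that idea (the helper below is hypothetical, not the actual training loop): sample one foreground label per iteration and binarize the segmentation map to it, so each backward pass sees a single-label task:

```python
import random
import torch

def sample_single_label_task(seg, label_ids, rng=random):
    """Pick one foreground label and binarize the map to it (0.0 / 1.0)."""
    lab = rng.choice(label_ids)
    return lab, (seg == lab).float()

# Toy segmentation map with background 0 and foreground labels 1 and 2
seg = torch.tensor([[0, 1, 2],
                    [2, 1, 0]])
lab, binary = sample_single_label_task(seg, [1, 2])
# One loss / loss.backward() / optimizer.step() per iteration, computed
# against this single binarized target -- never a sum over labels.
```

Because each label is sampled on its own, a hard label like your label 2 still gets dedicated gradient steps instead of being drowned out by the easier label's loss.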