Cannot reproduce - Githubissues

Wxsxulin commented 3 months ago

Hello author, I am unable to reproduce your experimental results with your code. During training, the loss suddenly starts behaving strangely. Could you please check whether there are any issues with the dataset settings or the code? Thanks in advance!

dengxl0520 commented 3 months ago

Please use the latest dev branch, which was refactored from the main branch.

Wxsxulin commented 3 months ago

Thank you for your reply! I switched to the latest version of the code after your update, but unfortunately I still cannot reproduce the results or obtain a correct train_loss. I printed the pred and target tensors but did not find any problems, so I suspect the problem is in the criterion function. In addition, a warning about an empty array appears when the mean Dice is calculated. I would really appreciate your help. Thanks in advance!

Zong-Liang commented 3 months ago

Same problem here. The error "RuntimeError: GET was unable to find an engine to execute this computation" is raised at the loss.backward() line.

dengxl0520 commented 3 months ago

Based on your error message, it doesn't seem to be caused by our code.

dengxl0520 commented 3 months ago

I'm not sure what's happening. You could check the values in pred and gt.
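A minimal sketch of the kind of check suggested here, not code from the repository. It assumes NumPy arrays (a PyTorch tensor would first need something like `.detach().cpu().numpy()`); the helper name `summarize` and the example arrays are hypothetical.

```python
import numpy as np

def summarize(name, arr):
    """Return basic statistics that reveal empty or out-of-range arrays."""
    arr = np.asarray(arr)
    return {
        "name": name,
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        # min/max are undefined for an empty array, so report None instead
        "min": arr.min().item() if arr.size else None,
        "max": arr.max().item() if arr.size else None,
        # first few distinct values; a binary mask should show only 0 and 1
        "unique": np.unique(arr).tolist()[:10],
    }

# Hypothetical example data, for illustration only.
gt = np.array([[0, 255], [255, 0]], dtype=np.uint8)
pred = np.array([[0.1, 0.8], [0.7, 0.2]], dtype=np.float32)

print(summarize("pred", pred))
print(summarize("gt", gt))
```

For a binary segmentation target, any distinct value other than 0 and 1 in the `unique` field is worth investigating.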

Wxsxulin commented 3 months ago

Thank you for your reply. Following your suggestion, I printed some of the pred and mask values and found that the tensor values in the mask were all empty. Could you help me solve this problem? Thank you very much!

JinrongLv commented 3 months ago

I ran into the same problem. Could the author provide a screenshot of a normal training run? My loss still looks abnormal, and by the end of training it is always negative. My training parameters are as follows: --modelname MemSAM --task CAMUS_Video_Full --keep_log

dengxl0520 commented 3 months ago

This is the end of a normal training session. The training command: python train_video.py --modelname MemSAM --task CAMUS_Video_Full --keep_log

JinrongLv commented 3 months ago

Is the code the author used for training the same as the code in the repository? I also encountered this problem while training on EchoNet:

Zong-Liang commented 3 months ago

python train_video.py --modelname MemSAM --task CAMUS_Video_Full --keep_log

Are there no further updates? I only modified data_path in config.py and then ran your command, but the issue still exists:

Wxsxulin commented 2 months ago

Hello, are there any further code updates available?

wuyu-sile commented 1 month ago

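I have solved this problem. My guess is that the author did not provide the code specific to processing the CAMUS dataset, so the value 255 appears in the CAMUS mask tensors. The elements of a correct mask tensor should be either 0 or 1, so the following code needs to be added at line 501 of the data_us file: mask[mask == 255] = 1.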
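The following is a minimal sketch of that remapping and of one way a 0/255 mask could produce a negative loss. It is not the repository's code: it uses NumPy arrays instead of tensors, the helper names `binarize_mask` and `bce` are hypothetical, and plain binary cross-entropy is only an assumed stand-in, since the thread does not say which loss MemSAM uses.

```python
import numpy as np

def binarize_mask(mask):
    """Return a copy of the mask with 255 mapped to 1, leaving only 0 and 1."""
    mask = np.asarray(mask).copy()
    mask[mask == 255] = 1  # the one-line fix quoted in the thread
    return mask

def bce(pred, target, eps=1e-7):
    """Plain binary cross-entropy (stand-in loss, an assumption)."""
    pred = np.clip(np.asarray(pred, dtype=np.float64), eps, 1 - eps)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(-(target * np.log(pred) + (1 - target) * np.log(1 - pred))))

# Hypothetical mask stored as 0/255 and a prediction that matches it well.
raw = np.array([[0, 255], [255, 0]], dtype=np.uint8)
pred = np.array([[0.1, 0.9], [0.9, 0.1]])

loss_raw = bce(pred, raw)                    # targets outside [0, 1]: negative here
loss_fixed = bce(pred, binarize_mask(raw))   # targets in {0, 1}: positive

print("loss with 0/255 mask:", loss_raw)
print("loss with 0/1 mask:  ", loss_fixed)
```

With a target of 255, the factor (1 - target) becomes -254, so the cross-entropy is no longer bounded below by zero. That would be consistent with the negative losses reported earlier in the thread, assuming a loss of this kind is involved.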

Wxsxulin commented 1 month ago

Thank you very much!

dengxl0520 / MemSAM

Cannot reproduce #18