yun-liu / RCF

Richer Convolutional Features for Edge Detection

Label and cross entropy ERROR #160

Open 2000LuoLuo opened 1 month ago

2000LuoLuo commented 1 month ago

This is my problem, please help me. How can I solve it?

C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:106: block: [107,0,0], thread: [117,0,0] Assertion target_val >= zero && target_val <= one failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:106: block: [107,0,0], thread: [118,0,0] Assertion target_val >= zero && target_val <= one failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:106: block: [107,0,0], thread: [119,0,0] Assertion target_val >= zero && target_val <= one failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:106: block: [6,0,0], thread: [12,0,0] Assertion target_val >= zero && target_val <= one failed.
C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Loss.cu:106: block: [79,0,0], thread: [5,0,0] Assertion target_val >= zero && target_val <= one failed.
Traceback (most recent call last):
  File "C:\Users\LHT\Desktop\RCF_Pytorch_Updated-master\train_RCF.py", line 353, in <module>
    main()
  File "C:\Users\LHT\Desktop\RCF_Pytorch_Updated-master\train_RCF.py", line 221, in main
    tr_avg_loss, tr_detail_loss = train(
  File "C:\Users\LHT\Desktop\RCF_Pytorch_Updated-master\train_RCF.py", line 257, in train
    loss = loss + cross_entropy_loss_RCF(o, label)
  File "C:\Users\LHT\Desktop\RCF_Pytorch_Updated-master\functions.py", line 11, in cross_entropy_loss_RCF
    mask[mask == 1] = 1.0 * num_negative / (num_positive + num_negative)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
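The log itself warns that the reported stack frame may be wrong because CUDA errors are reported asynchronously, and it suggests CUDA_LAUNCH_BLOCKING=1. A minimal sketch of rerunning the training script with that variable set might look like the following. The helper names `blocking_env` and `run_synchronously` are hypothetical and only for illustration; the environment variable name is taken from the log.

```python
import os
import subprocess
import sys
from typing import Dict, List, Optional


def blocking_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and add CUDA_LAUNCH_BLOCKING=1 to the copy."""
    env = dict(os.environ if base is None else base)
    # With this set, CUDA kernel launches run synchronously, so the Python
    # traceback should point at the call that actually failed.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_synchronously(script: str, args: Optional[List[str]] = None) -> int:
    """Run `script` in a child Python process with CUDA_LAUNCH_BLOCKING=1."""
    command = [sys.executable, script] + list(args or [])
    return subprocess.run(command, env=blocking_env()).returncode
```

Calling `run_synchronously("train_RCF.py")` from the project directory would then rerun training with synchronous launches. Training will be slower in this mode, so it is probably only worth using while tracking down the failing call.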

yun-liu commented 1 month ago

@2000LuoLuo The labels should be within the range of [0, 1]. It seems that your labels are out of this range.
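One way to confirm this is to count the label values that fall outside [0, 1] before they reach the loss. The sketch below is only an illustration: `out_of_range_values` is a hypothetical helper, not part of the RCF code, and it works on a flat sequence of numbers so it can run without a GPU.

```python
from collections import Counter
from typing import Iterable


def out_of_range_values(
    labels: Iterable[float], low: float = 0.0, high: float = 1.0
) -> Counter:
    """Count each distinct label value that lies outside [low, high]."""
    bad = Counter()
    for value in labels:
        # "not inside" rather than "outside", so NaN is reported as well.
        if not (low <= value <= high):
            bad[value] += 1
    return bad


# Hypothetical flattened label map; the values are made up for illustration.
example_labels = [0.0, 1.0, 0.0, 2.0, 1.0, 2.0]
print(out_of_range_values(example_labels))
```

For a PyTorch tensor named `label`, something like `out_of_range_values(label.flatten().tolist())` should give the same kind of summary. An empty result would mean the labels are fine and the problem lies elsewhere.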

2000LuoLuo commented 1 month ago

Thank you for your reply. Can you help me find out where the problem lies? I made almost no changes to the program I downloaded from RCF-PyTorch.

I followed this article: https://blog.csdn.net/ruotianxia/article/details/100066181?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522169534780616800225534833%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=169534780616800225534833&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~blog~sobaiduend~default-2-100066181-null-null.nonecase&utm_term=bsds_pascal_train_pair.lst&spm=1018.2226.3001.4450

but the same error is still reported. My environment is CUDA 12.1 and PyTorch 2.4.1. Lowering the environment versions and reducing the learning rate both give the same error, even though I am clearly using the same program as others. I look forward to your reply. Thank you.

yun-liu commented 1 month ago

@2000LuoLuo Please use the provided data for training and don't modify the code.

2000LuoLuo commented 1 month ago

Yes, this is what confuses me. I ran it exactly according to the usage instructions and downloaded the dataset you provided. I only corrected the file paths and did not modify anything else, but I still get the error. I searched all night and still couldn't find the cause.

yun-liu commented 1 month ago

@2000LuoLuo This may be because of the new version of PyTorch. I have made some modifications to the code. Please try it again.
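The thread does not say what was changed. For readers hitting the same assertion, here is a rough sketch of one pattern that keeps targets inside [0, 1]. It assumes the out-of-range values are markers for pixels to ignore; the marker value 2, the negative-class weight, and the function name are all assumptions of this sketch. Only the positive-class weight comes from the line shown in the traceback.

```python
from typing import List, Sequence, Tuple


def balanced_weights_and_targets(
    labels: Sequence[float], ignore_value: float = 2.0
) -> Tuple[List[float], List[float]]:
    """Return per-pixel weights and targets, with targets kept in [0, 1].

    `ignore_value` is a hypothetical marker for pixels to skip.
    """
    num_positive = sum(1 for v in labels if v == 1)
    num_negative = sum(1 for v in labels if v == 0)
    total = num_positive + num_negative

    weights: List[float] = []
    targets: List[float] = []
    for v in labels:
        if v == 1:
            # Same expression as the line shown in the traceback.
            weights.append(1.0 * num_negative / total)
            targets.append(1.0)
        elif v == 0:
            # Assumed mirror image of the positive weight.
            weights.append(1.0 * num_positive / total)
            targets.append(0.0)
        elif v == ignore_value:
            # Ignored pixel: zero weight, and a target inside [0, 1] so a
            # range check on the targets cannot fire.
            weights.append(0.0)
            targets.append(0.0)
        else:
            raise ValueError(f"unexpected label value: {v!r}")
    return weights, targets
```

With tensors, the analogous step would be to overwrite the marker values in the target before passing it, together with the weights, to `torch.nn.functional.binary_cross_entropy` through its `weight` argument. Because the ignored pixels carry zero weight, the value chosen for their target does not affect the loss.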