ggao33 closed this issue 3 years ago
Hi, thanks for your interest.
Our provided environment.yml should work fine with CUDA 10.1. If you are using CUDA 11.1, please install PyTorch 1.7.1. Please also note that only recent PyTorch versions (e.g. >1.7.0) work on CUDA 11 machines; with older versions the program may hang.
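To check which combination you actually have installed, something like the following may help. It is only a sketch: the helper names (version_tuple, is_newer_than, describe_env) are made up for illustration and are not part of this repo, and the 1.7.0 threshold is simply the one mentioned above.

```python
import re


def version_tuple(version):
    """Turn a string such as '1.7.1+cu110' into (1, 7, 1), ignoring build suffixes."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_newer_than(version, minimum="1.7.0"):
    """True if `version` is strictly newer than `minimum` (default taken from the note above)."""
    return version_tuple(version) > version_tuple(minimum)


def describe_env():
    """Report the installed PyTorch build, or None if PyTorch is not importable."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # CUDA version PyTorch was built against
        "cuda_available": torch.cuda.is_available(),
        "newer_than_1_7_0": is_newer_than(torch.__version__),
    }


if __name__ == "__main__":
    print(describe_env())
```

If cuda_build does not match the CUDA version on your machine, reinstalling PyTorch with the matching build is probably the first thing to try.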
Hi, I am currently facing the issue below when running train.py. Could you please give me a hand? My PC environment is as follows:
/home/anaconda3/bin/python /home/Documents/Invertible-ISP-main/train_cuda.py --task=debug --data_path=./data/ --gamma --aug --camera=NIKON_D700 --out_path=./exps/ --debug_mode
Parsed arguments: Namespace(aug=True, batch_size=1, camera='NIKON_D700', data_path='./data/', debug_mode=True, gamma=True, loss='L1', lr=0.0001, out_path='./exps/', resume=False, rgb_weight=1, task='debug')
[INFO] Start data loading and preprocessing
[INFO] Start to train
Traceback (most recent call last):
File "/home/Documents/Invertible-ISP-main/train_cuda.py", line 99, in
main(args)
File "/home/Documents/Invertible-ISP-main/train_cuda.py", line 72, in main
reconstruct_raw = net(reconstruct_rgb, rev=True)
File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Documents/Invertible-ISP-main/model/model.py", line 176, in forward
out = op.forward(out, rev)
File "/home/Documents/Invertible-ISP-main/model/model.py", line 124, in forward
self.s = self.clamp * (torch.sigmoid(self.H(x1)) * 2 - 1)
RuntimeError: CUDA error: an illegal memory access was encountered
Process finished with exit code 1
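One note on the traceback above: CUDA kernels are launched asynchronously, so the Python line it points to is not necessarily the operation that actually failed. A common way to get a more accurate location is to set the CUDA_LAUNCH_BLOCKING environment variable before CUDA is initialized. A minimal sketch, assuming it is placed at the very top of the training script (the helper name enable_sync_cuda_errors is hypothetical):

```python
import os


def enable_sync_cuda_errors(env=os.environ):
    """Request synchronous CUDA kernel launches so errors surface at the failing call.

    Must run before CUDA is initialized (in practice, before the first .cuda() call).
    An existing CUDA_LAUNCH_BLOCKING value is left untouched.
    """
    env.setdefault("CUDA_LAUNCH_BLOCKING", "1")
    return env["CUDA_LAUNCH_BLOCKING"]


# Usage (at the top of the script, before any CUDA work):
#   enable_sync_cuda_errors()
#   import torch
```

The same effect can be had from the shell by prefixing the command with CUDA_LAUNCH_BLOCKING=1. Training will run slower with this set, so it is only meant for debugging.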
If I switch to the invertible-isp environment, as your environment.yml specifies, the code silently hangs at line 22: DiffJPEG = DiffJPEG(differentiable=True, quality=90).cuda() without showing any error or printing "start to train".
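To confirm where a silent hang like this is stuck, Python's standard faulthandler module can print the stack of every thread after a timeout. This is just a sketch under the assumption that it is called near the top of the training script; the function names and the 60-second default are illustrative, not part of the repo.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_s=60.0, stream=sys.stderr):
    """Dump all thread tracebacks to `stream` every `timeout_s` seconds until disarmed."""
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def disarm_hang_watchdog():
    """Cancel the pending traceback dump once the slow section has finished."""
    faulthandler.cancel_dump_traceback_later()


# Usage around the suspected line:
#   arm_hang_watchdog(60)
#   ... the .cuda() call that appears to hang ...
#   disarm_hang_watchdog()
```

If the dump keeps showing the same .cuda() call, the process is most likely stuck in CUDA initialization, which would fit a PyTorch build that does not match the installed CUDA version.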