lin-tan / CURE

For our ICSE21 paper "CURE: Code-Aware Neural Machine Translation for Automatic Program Repair" by Nan Jiang, Thibaud Lutellier, and Lin Tan
https://www.cs.purdue.edu/homes/lintan/publications/cure-icse21.pdf

What CUDA version did you use? #6

Closed nashid closed 2 years ago

nashid commented 2 years ago

The listed dependencies do not specify a CUDA version. Which CUDA version did you use?

nashid commented 2 years ago

@jiang719 We get the following error during training:

/XXXXXXX/python3.8/site-packages/numpy/core/_methods.py:190: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "src/trainer/gpt_conut_trainer.py", line 247, in <module>
    trainer.train(model_id, epochs, hyper_parameter, save_dir=os.path.abspath(os.path.join(GPT_CONUT_TRAINER_DIR, '..', '..', 'data/models/')))
  File "src/trainer/gpt_conut_trainer.py", line 213, in train
    self.validate_and_save(model_id, save_dir)
  File "src/trainer/gpt_conut_trainer.py", line 125, in validate_and_save
    torch.save(checkpoint, save_dir + '/' + 'gpt_conut_' + str(model_id) + '.pt')
  File "/XXXXXXX/lib/python3.8/site-packages/torch/serialization.py", line 379, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/XXXXXXX/lib/python3.8/site-packages/torch/serialization.py", line 601, in _save
    storage = storage.cpu()
  File "/XXXXXXX/lib/python3.8/site-packages/torch/storage.py", line 112, in cpu
    return torch._UntypedStorage(self.size()).copy_(self, False)
RuntimeError: CUDA error: device-side assert triggered

Which CUDA version did you use?
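
A note on the traceback above: CUDA kernels run asynchronously, so a device-side assert is often reported at a later call (here `torch.save`) than the operation that actually triggered it. One common way to narrow this down, whatever the CUDA version, is to rerun with the environment variable `CUDA_LAUNCH_BLOCKING=1` so that errors surface closer to the failing call. The following is only a minimal sketch: the helper names `blocking_env` and `rerun_trainer` are made up for illustration, and the script path is taken from the traceback and assumes the command is run from the repository root.

```python
import os
import subprocess
import sys


def blocking_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so the Python
    # traceback points nearer to the operation that raised the assert.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_trainer(script="src/trainer/gpt_conut_trainer.py"):
    """Rerun the training script (path assumed from the traceback) and return its exit code."""
    return subprocess.run([sys.executable, script], env=blocking_env()).returncode


if __name__ == "__main__":
    sys.exit(rerun_trainer())
```

Running with this setting is slower, so it is best used only while debugging.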

jiang719 commented 2 years ago

We have tried both CUDA 10.0 and CUDA 11.3, and both work.
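
For anyone comparing their own setup against these versions, here is a minimal sketch of how to print the CUDA version a PyTorch build was compiled against. The function name `cuda_environment_report` is hypothetical and not part of the CURE code base. It uses `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`, and falls back gracefully when PyTorch cannot be imported.

```python
def cuda_environment_report():
    """Collect PyTorch and CUDA version information for a bug report."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or its native libraries failed to load.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version the installed PyTorch was built with (None for CPU-only builds).
    report["cuda_build"] = torch.version.cuda
    # Whether a usable GPU and driver are visible at runtime.
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in cuda_environment_report().items():
        print(f"{key}: {value}")
```

Note that `torch.version.cuda` reports the toolkit version PyTorch was built with, which can differ from the system-wide toolkit or the version shown by the driver.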