dair-iitd / openie6

OpenIE6 system
GNU General Public License v3.0

Error on gpus set to 0 #4

Open arsalan993 opened 3 years ago

arsalan993 commented 3 years ago

Although I set gpus = 0, since I don't have a GPU installed, the script did predict the output and save it in the prediction.txt file, but right before run.py exits, it throws this error:

Starting re-scoring ...

{'<arg1>': '[unused1]', '</arg1>': '[unused2]', '<rel>': '[unused3]', '</rel>': '[unused4]', '<arg2>': '[unused5]', '</arg2>': '[unused6]', 'SENT': '[unused7]', 'PRED': '[unused8]', '@COPY@': '[unused9]', 'EOE': '[unused10]'}
Traceback (most recent call last):
  File "run.py", line 469, in <module>
    main(hyperparams)
  File "run.py", line 459, in main
    train_dataloader, val_dataloader, test_dataloader, all_sentences)
  File "run.py", line 255, in splitpredict
    rescored = rescore(inp_fp, model_dir=hparams.rescore_model, batch_size=256)
  File "imojie/imojie/aggregate/score.py", line 90, in rescore
    return generate_probs(model_dir, inp_fp, weights_fp, topk, out_ext, cuda_device, overwrite=overwrite, extraction_ratio=ext_ratio, batch_size=batch_size, out=None)
  File "imojie/imojie/aggregate/score.py", line 39, in generate_probs
    probs = evaluate_from_args(args)
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/allennlp/commands/evaluate.py", line 131, in evaluate_from_args
    archive = load_archive(args.archive_file, args.cuda_device, args.overrides, args.weights_file)
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/allennlp/models/archival.py", line 230, in load_archive
    cuda_device=cuda_device)
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/allennlp/models/model.py", line 329, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/allennlp/models/model.py", line 277, in _load
    model_state = torch.load(weights_file, map_location=util.device_mapping(cuda_device))
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/torch/serialization.py", line 386, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/torch/serialization.py", line 573, in _load
    result = unpickler.load()
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/torch/serialization.py", line 536, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/torch/serialization.py", line 409, in restore_location
    result = map_location(storage, location)
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/allennlp/nn/util.py", line 828, in inner_device_mapping
    return storage.cuda(cuda_device)
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/torch/_utils.py", line 69, in _cuda
    with torch.cuda.device(device):
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/torch/cuda/__init__.py", line 243, in __enter__
    self.prev_idx = torch._C._cuda_getDevice()
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/torch/cuda/__init__.py", line 178, in _lazy_init
    _check_driver()
  File "/home/arsalan/Documents/openie6/topic/lib/python3.6/site-packages/torch/cuda/__init__.py", line 99, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError: 
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

If you suspect this is an IPython 7.16.1 bug, please report it at:
    https://github.com/ipython/ipython/issues
or send an email to the mailing list at ipython-dev@python.org

You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.

Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
    %config Application.verbose_crash=True
Martin36 commented 3 years ago

I get the same error. I have a GPU, but I'm running Ubuntu inside Windows, so I can't use it without installing the Windows Insider beta version of Windows.

If a GPU is required to run, it would be nice to have that stated in the documentation.

But from what I can see, it seems that predictions on the input text were made. So maybe the error does not affect the results.

sadakmed commented 2 years ago

Though it doesn't affect your desired output, because that is generated before the error occurs. However, copying the command from the README file as-is, with the --mode argument set to splitpredict, is what triggers the issue here. Setting it to predict will just run prediction and stop before the error, but the results will not be as good as with splitpredict.

lapp0 commented 1 year ago

The issue lies in openie6/run.py.

rescored = rescore(inp_fp, model_dir=hparams.rescore_model, batch_size=256)

needs to be changed to

rescored = rescore(inp_fp, model_dir=hparams.rescore_model, batch_size=256, cuda_device=(0 if has_cuda else -1))

https://github.com/dair-iitd/openie6/blob/master/imojie/imojie/aggregate/score.py#L88 is called with cuda_device=0

This results in a call to

https://github.com/dair-iitd/openie6/blob/master/imojie/allennlp/allennlp/commands/evaluate.py#L92

with a cuda_device of 0, but, as the documentation specifies, -1 is required when no GPU is available.
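The device-id convention behind the fix can be sketched as a tiny helper (illustrative only; pick_cuda_device is not part of openie6 — in run.py the has_cuda flag would come from something like torch.cuda.is_available()):

```python
def pick_cuda_device(has_cuda: bool) -> int:
    """Map GPU availability to AllenNLP's cuda_device convention:
    0 selects the first GPU, -1 forces CPU-side loading."""
    return 0 if has_cuda else -1

# On a CPU-only machine this yields -1, so torch.load maps every tensor
# to the CPU instead of calling storage.cuda(), which is the call that
# raises the AssertionError in the traceback above.
print(pick_cuda_device(False))   # -1
```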

RUN sed -i 's|rescored = rescore(inp_fp, model_dir=hparams.rescore_model, batch_size=256)|rescored = rescore(inp_fp, model_dir=hparams.rescore_model, batch_size=256, cuda_device=(0 if has_cuda else -1))|' openie6/run.py worked in my Dockerfile. A bit busy to make a PR, but I may tackle it later.
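For anyone scripting the same workaround, the sed edit can be tried on a scratch copy of the offending line first to confirm the substitution takes (the /tmp path here is illustrative, not part of the repo):

```shell
# Write the original call from run.py to a scratch file.
printf '%s\n' 'rescored = rescore(inp_fp, model_dir=hparams.rescore_model, batch_size=256)' > /tmp/run_snippet.py

# Apply the same substitution as the Dockerfile RUN step.
sed -i 's|batch_size=256)|batch_size=256, cuda_device=(0 if has_cuda else -1))|' /tmp/run_snippet.py

# Count matching lines; prints 1 when the patch applied.
grep -c 'cuda_device=(0 if has_cuda else -1)' /tmp/run_snippet.py
```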