I tried running the code in three different setups, but I always get the same error:
Traceback (most recent call last):
  File "train_task.py", line 34, in <module>
    from volta.task_utils import LoadDataset, LoadLoss, ForwardModelsTrain, ForwardModelsVal
  File "/data/volta/task_utils.py", line 19, in <module>
    from volta.datasets import DatasetMapTrain, DatasetMapEval
  File "/data/volta/datasets/__init__.py", line 23, in <module>
    from .SVO_Probes_dataset import SVO_ProbesClassificationDataset
  File "/data/volta/datasets/SVO_Probes_dataset.py", line 20, in <module>
    p = Pipeline(lang='english', gpu = True, cache_dir = './cache')
  File "/root/anaconda3/envs/volta/lib/python3.6/site-packages/trankit/pipeline.py", line 85, in __init__
    self._embedding_layers.half()
  File "/root/anaconda3/envs/volta/lib/python3.6/site-packages/torch/nn/modules/module.py", line 757, in half
    return self._apply(lambda t: t.half() if t.is_floating_point() else t)
  File "/root/anaconda3/envs/volta/lib/python3.6/site-packages/torch/nn/modules/module.py", line 570, in _apply
    module._apply(fn)
  File "/root/anaconda3/envs/volta/lib/python3.6/site-packages/torch/nn/modules/module.py", line 570, in _apply
    module._apply(fn)
  File "/root/anaconda3/envs/volta/lib/python3.6/site-packages/torch/nn/modules/module.py", line 570, in _apply
    module._apply(fn)
  File "/root/anaconda3/envs/volta/lib/python3.6/site-packages/torch/nn/modules/module.py", line 593, in _apply
    param_applied = fn(param)
  File "/root/anaconda3/envs/volta/lib/python3.6/site-packages/torch/nn/modules/module.py", line 757, in <lambda>
    return self._apply(lambda t: t.half() if t.is_floating_point() else t)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
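For context, "no kernel image is available" usually means the installed PyTorch binary does not contain kernels for the GPU's architecture. The RTX 3090 has compute capability 8.6, which CUDA toolkits older than 11.1 do not target, so a CUDA 10.1 build may be the cause here. Below is a minimal diagnostic sketch, not a confirmed fix. The helper names `is_arch_supported` and `report` are hypothetical, the matching rule is a simplification of CUDA's compatibility rules, and `torch.cuda.get_arch_list` is looked up defensively because older PyTorch releases do not provide it.

```python
def is_arch_supported(capability, arch_list):
    """Return True if a build with `arch_list` can plausibly run on `capability`.

    Simplified rule (an assumption of this sketch):
      * "sm_XY" binaries run on devices with the same major version and a
        minor version >= Y;
      * "compute_XY" PTX can be JIT-compiled for any device at or above X.Y.
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if not digits.isdigit():
            continue  # skip entries this sketch does not understand
        arch_major, arch_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm" and arch_major == major and arch_minor <= minor:
            return True
        if kind == "compute" and (arch_major, arch_minor) <= (major, minor):
            return True
    return False


def report():
    """Summarize the installed PyTorch build and the first visible GPU."""
    import torch  # imported lazily so the helper above works without torch

    lines = [
        "torch version: {}".format(torch.__version__),
        "built for CUDA: {}".format(torch.version.cuda),
    ]
    if not torch.cuda.is_available():
        lines.append("CUDA is not available to this PyTorch build")
        return "\n".join(lines)

    capability = torch.cuda.get_device_capability(0)
    lines.append("device capability: {}.{}".format(*capability))

    # get_arch_list is missing in older releases, hence the getattr guard.
    get_arch_list = getattr(torch.cuda, "get_arch_list", None)
    if get_arch_list is None:
        lines.append("this torch version cannot list its compiled architectures")
    else:
        arch_list = get_arch_list()
        lines.append("compiled architectures: {}".format(arch_list))
        lines.append("device supported: {}".format(
            is_arch_supported(capability, arch_list)))
    return "\n".join(lines)
```

Running `print(report())` inside the `volta` environment would show which PyTorch version and CUDA build are actually installed, which is worth checking because later `pip` installs can replace the version that conda installed.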
I followed the repository setup steps in the README file (after the setup I also had to install nltk through conda and trankit through pip). The three setups were:
1. On a VM with Ubuntu 22.04, an NVIDIA RTX 3090, CUDA 12.1, and NVIDIA driver version 530.30.02
2. On the same VM, inside a Docker container based on nvidia/cuda:10.1-devel-ubuntu18.04
3. On the same VM, inside a Docker container based on pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel
These are the exact commands I executed, both on the VM and inside each Docker container:
conda create -n volta python=3.6
conda activate volta
pip install -r requirements.txt
conda install pytorch=1.4.0 torchvision cudatoolkit=10.1 -c pytorch  # torchvision version pin removed because 0.5 is not available
apt install git
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir ./
cd ..
cd tools/refer; make
cd ../..
python setup.py develop
conda install nltk
pip install trankit
Then I ran the following command, which produced the error above:
Any suggestions? Thanks!