INFO marie@37 Executing pipeline for document : PID_1956_9362_0_203925852.tif, lbxid > /tmp/generators/a9de56b33b040d12568f379e0078684a
INFO marie@37 Executing pipeline runtime_conf : {'name': 'default-corr', 'page_splitter': {'enabled': False}, 'type': 'pipeline', 'page_cleaner': {'enabled': False}, 'page_classifier': {'enabled': True}}
INFO marie@37 Feature : page classifier enabled : True
INFO marie@37 Feature : page indexer enabled : True
INFO marie@37 Loaded classifiers : corr-classifier, 3
INFO marie@37 Loaded classifiers : corr-payer-classifier, 3
INFO marie@37 Restoring assets from s3://marie/lbxid/pid_1956_9362_0_203925852 to /tmp/generators/a9de56b33b040d12568f379e0078684a [05/15/24 14:38:45]
INFO marie@37 Bursting frames for PID_1956_9362_0_203925852.tif
INFO marie@37 Processing classifier pipeline/group : default-corr, corr-classifier
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [166,0,0], thread: [127,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
ERROR marie@37 Error classifying document : CUDA error: device-side assert triggered [05/15/24 14:38:45]
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/marie/components/document_classifier/transformers.py", line 244, in predict
    for results in pipe_batched_results:
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
    item = next(self.iterator)
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/pt_utils.py", line 125, in __next__
    processed = self.infer(item, **self.params)
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1067, in forward
    model_inputs = self._ensure_tensor_on_device(model_inputs, device=self.device)
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 972, in _ensure_tensor_on_device
    return UserDict({name: self._ensure_tensor_on_device(tensor, device) for name, tensor in inputs.items()})
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 972, in <dictcomp>
    return UserDict({name: self._ensure_tensor_on_device(tensor, device) for name, tensor in inputs.items()})
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 980, in _ensure_tensor_on_device
    return inputs.to(device)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
ERROR marie@37 Error while classifying documents: CUDA error: device-side assert triggered [05/15/24 14:38:45]
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
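Note on the trace above: it ends at `inputs.to(device)`, but as the CUDA message itself says, kernel errors are reported asynchronously, so that frame may not be where the failure actually occurred. A minimal sketch of forcing synchronous kernel launches for a debugging run, assuming the variable can be set inside the process before any CUDA work starts (exporting it in the shell or container environment works as well):

```python
import os

# Must be set before the first CUDA call; the simplest way is to do it
# before torch is imported anywhere in the process.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

With synchronous launches the Python traceback should point at the operation that launched the failing kernel. Expect slower inference, so this is only meant for reproducing the failure.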
Describe the bug
The application crashes on the GPU with a CUDA device-side assert while classifying certain documents. Sample document IDs follow.

Crashes with "CUDA error: device-side assert triggered":
204092337
203925852 (the document in the log above)
204092927
204092966
204166227
204041606
204040160
205967262 - EOB (medical_page_classifier)
208788841 - CORR
209425570 - CORR
209805214 - CORR
209976466 - CORR
211567153 - CORR
212670800 - CORR
213805705 - CORR / ROTATED
213942700 - CORR / ROTATED
214292051 - CORR
214288815 - CORR
214291267 - CORR / ROTATED
214292900 - CORR / LARGE
214894529 - CORR / ENVELOPE
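
Possible diagnostic: the `srcIndex < srcSelectDimSize` assertion in `indexSelectLargeIndex` fires when an index passed to an index-select operation (embedding lookups use one) is outside the table being indexed. Whether that comes from token ids, positions, or layout coordinates is not established by the log. A minimal sketch of a CPU-side range check that could be run on the encoded inputs before they reach the GPU follows. The helper name `find_out_of_range`, the field names, and the limits are hypothetical; the real limits would come from the classifier's model config.

```python
from typing import Any, Dict, Iterable, List, Tuple


def _flatten(values: Any) -> Iterable[int]:
    """Yield every scalar from a nested list (or anything exposing .tolist())."""
    if hasattr(values, "tolist"):  # torch tensors and numpy arrays provide this
        values = values.tolist()
    if isinstance(values, (list, tuple)):
        for item in values:
            yield from _flatten(item)
    else:
        yield values


def find_out_of_range(
    inputs: Dict[str, Any], limits: Dict[str, int]
) -> List[Tuple[str, int, int]]:
    """Return (field, min, max) for every field holding a value outside [0, limit)."""
    problems = []
    for name, limit in limits.items():
        if name not in inputs:
            continue
        flat = list(_flatten(inputs[name]))
        if not flat:
            continue
        lo, hi = min(flat), max(flat)
        if lo < 0 or hi >= limit:
            problems.append((name, lo, hi))
    return problems


# Hypothetical limits and a hand-made batch, for illustration only.
limits = {"input_ids": 50265, "bbox": 1024}
batch = {
    "input_ids": [[0, 713, 2]],
    "bbox": [[[0, 0, 0, 0], [120, 40, 1180, 90], [0, 0, 0, 0]]],
}
print(find_out_of_range(batch, limits))  # → [('bbox', 0, 1180)]
```

Running a check like this on one of the failing documents above would show whether any input field exceeds its table size, and which one, without needing the GPU at all.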