RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmStridedBatchedExFix

thistleknot commented 1 year ago

after pip install nougat-ocr

tried to run

nougat file.pdf -o .

downloading nougat checkpoint version 0.1.0-small to path /root/.cache/torch/hub/nougat
config.json: 100%|█████████████████████████████████████████████████████████████████████| 557/557 [00:00<00:00, 2.26Mb/s]
pytorch_model.bin: 100%|█████████████████████████████████████████████████████████████| 956M/956M [01:40<00:00, 9.95Mb/s]
special_tokens_map.json: 100%|████████████████████████████████████████████████████████| 96.0/96.0 [00:00<00:00, 464kb/s]
tokenizer.json: 100%|██████████████████████████████████████████████████████████████| 2.04M/2.04M [00:00<00:00, 14.0Mb/s]
tokenizer_config.json: 100%|████████████████████████████████████████████████████████████| 106/106 [00:00<00:00, 609kb/s]
/home/user/env-10/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
  0%|                                                                                           | 0/631 [00:00<?, ?it/s][W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
  0%|                                                                                           | 0/631 [05:59<?, ?it/s]
Traceback (most recent call last):
  File "/home/user/env-10/bin/nougat", line 8, in <module>
    sys.exit(main())
  File "/home/user/env-10/lib/python3.10/site-packages/predict.py", line 130, in main
    model_output = model.inference(image_tensors=sample)
  File "/home/user/env-10/lib/python3.10/site-packages/nougat/model.py", line 577, in inference
    last_hidden_state = self.encoder(image_tensors)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/nougat/model.py", line 123, in forward
    x = self.model.layers(x)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 413, in forward
    x = blk(x)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 295, in forward
    attn_windows = self.attn(x_windows, mask=self.attn_mask)  # nW*B, window_size*window_size, C
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 183, in forward
    attn = (q @ k.transpose(-2, -1))
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmStridedBatchedExFix(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`
(env-10) root@m4700:/mnt/h/nougat# nvidia wat^C
(env-10) root@m4700:/mnt/h/nougat# watch -c nvidia-smi
(env-10) root@m4700:/mnt/h/nougat#

I don't see a requirements... I'm on python 3.10, using ubuntu 22 in wsl. I am able to run nvidia-smi

thistleknot commented 1 year ago

tried installing using .git just now, and same error

/home/user/env-10/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
  0%|                                                                                           | 0/631 [00:00<?, ?it/s][W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
  0%|                                                                                           | 0/631 [05:40<?, ?it/s]
Traceback (most recent call last):
  File "/home/user/env-10/bin/nougat", line 8, in <module>
    sys.exit(main())
  File "/home/user/env-10/lib/python3.10/site-packages/predict.py", line 130, in main
    model_output = model.inference(image_tensors=sample)
  File "/home/user/env-10/lib/python3.10/site-packages/nougat/model.py", line 577, in inference
    last_hidden_state = self.encoder(image_tensors)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/nougat/model.py", line 123, in forward
    x = self.model.layers(x)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 413, in forward
    x = blk(x)
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 295, in forward
    attn_windows = self.attn(x_windows, mask=self.attn_mask)  # nW*B, window_size*window_size, C
  File "/home/user/env-10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/env-10/lib/python3.10/site-packages/timm/models/swin_transformer.py", line 183, in forward
    attn = (q @ k.transpose(-2, -1))
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmStridedBatchedExFix(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`

lukas-blecher commented 1 year ago

What torch version is that? And what CUDA version?

thistleknot commented 1 year ago

ubuntu 22 (wsl) nvidia-smi shows cuda 11.6 torch 2.0.1+cu117 installed cuda is 11.7

lukas-blecher commented 1 year ago

I was not able to reproduce this. I have nvidia-cublas-cu11==11.10.3.66. See my full env below

Conda list

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
aiohttp                   3.8.5                    pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
albumentations            1.3.1                    pypi_0    pypi
annotated-types           0.5.0                    pypi_0    pypi
anyio                     3.7.1                    pypi_0    pypi
arrow                     1.2.3                    pypi_0    pypi
async-timeout             4.0.3                    pypi_0    pypi
attrs                     23.1.0                   pypi_0    pypi
backoff                   2.2.1                    pypi_0    pypi
beautifulsoup4            4.12.2                   pypi_0    pypi
blessed                   1.20.0                   pypi_0    pypi
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2023.08.22           h06a4308_0  
certifi                   2023.7.22                pypi_0    pypi
charset-normalizer        3.2.0                    pypi_0    pypi
click                     8.1.7                    pypi_0    pypi
cmake                     3.27.4.1                 pypi_0    pypi
croniter                  1.4.1                    pypi_0    pypi
datasets                  2.14.5                   pypi_0    pypi
dateutils                 0.6.12                   pypi_0    pypi
deepdiff                  6.5.0                    pypi_0    pypi
dill                      0.3.7                    pypi_0    pypi
exceptiongroup            1.1.3                    pypi_0    pypi
fastapi                   0.103.1                  pypi_0    pypi
filelock                  3.12.4                   pypi_0    pypi
frozenlist                1.4.0                    pypi_0    pypi
fsspec                    2023.6.0                 pypi_0    pypi
h11                       0.14.0                   pypi_0    pypi
huggingface-hub           0.17.1                   pypi_0    pypi
idna                      3.4                      pypi_0    pypi
imageio                   2.31.3                   pypi_0    pypi
inquirer                  3.1.3                    pypi_0    pypi
itsdangerous              2.1.2                    pypi_0    pypi
jinja2                    3.1.2                    pypi_0    pypi
joblib                    1.3.2                    pypi_0    pypi
lazy-loader               0.3                      pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1  
levenshtein               0.21.1                   pypi_0    pypi
libffi                    3.4.4                h6a678d5_0  
libgcc-ng                 11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libstdcxx-ng              11.2.0               h1234567_1  
libuuid                   1.41.5               h5eee18b_0  
lightning                 2.0.9                    pypi_0    pypi
lightning-cloud           0.5.38                   pypi_0    pypi
lightning-utilities       0.9.0                    pypi_0    pypi
lit                       16.0.6                   pypi_0    pypi
markdown-it-py            3.0.0                    pypi_0    pypi
markupsafe                2.1.3                    pypi_0    pypi
mdurl                     0.1.2                    pypi_0    pypi
mpmath                    1.3.0                    pypi_0    pypi
multidict                 6.0.4                    pypi_0    pypi
multiprocess              0.70.15                  pypi_0    pypi
munch                     4.0.0                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
networkx                  3.1                      pypi_0    pypi
nltk                      3.8.1                    pypi_0    pypi
nougat-ocr                0.1.7                    pypi_0    pypi
numpy                     1.25.2                   pypi_0    pypi
nvidia-cublas-cu11        11.10.3.66               pypi_0    pypi
nvidia-cuda-cupti-cu11    11.7.101                 pypi_0    pypi
nvidia-cuda-nvrtc-cu11    11.7.99                  pypi_0    pypi
nvidia-cuda-runtime-cu11  11.7.99                  pypi_0    pypi
nvidia-cudnn-cu11         8.5.0.96                 pypi_0    pypi
nvidia-cufft-cu11         10.9.0.58                pypi_0    pypi
nvidia-curand-cu11        10.2.10.91               pypi_0    pypi
nvidia-cusolver-cu11      11.4.0.1                 pypi_0    pypi
nvidia-cusparse-cu11      11.7.4.91                pypi_0    pypi
nvidia-nccl-cu11          2.14.3                   pypi_0    pypi
nvidia-nvtx-cu11          11.7.91                  pypi_0    pypi
opencv-python-headless    4.8.0.76                 pypi_0    pypi
openssl                   3.0.10               h7f8727e_2  
ordered-set               4.1.0                    pypi_0    pypi
orjson                    3.9.7                    pypi_0    pypi
packaging                 23.1                     pypi_0    pypi
pandas                    2.1.0                    pypi_0    pypi
pillow                    10.0.0                   pypi_0    pypi
pip                       23.2.1          py310h06a4308_0  
psutil                    5.9.5                    pypi_0    pypi
pyarrow                   13.0.0                   pypi_0    pypi
pydantic                  2.1.1                    pypi_0    pypi
pydantic-core             2.4.0                    pypi_0    pypi
pygments                  2.16.1                   pypi_0    pypi
pyjwt                     2.8.0                    pypi_0    pypi
pymupdf                   1.23.3                   pypi_0    pypi
pymupdfb                  1.23.3                   pypi_0    pypi
python                    3.10.13              h955ad1f_0  
python-dateutil           2.8.2                    pypi_0    pypi
python-editor             1.0.4                    pypi_0    pypi
python-levenshtein        0.21.1                   pypi_0    pypi
python-multipart          0.0.6                    pypi_0    pypi
pytorch-lightning         2.0.9                    pypi_0    pypi
pytz                      2023.3.post1             pypi_0    pypi
pywavelets                1.4.1                    pypi_0    pypi
pyyaml                    6.0.1                    pypi_0    pypi
qudida                    0.0.4                    pypi_0    pypi
rapidfuzz                 3.3.0                    pypi_0    pypi
readchar                  4.0.5                    pypi_0    pypi
readline                  8.2                  h5eee18b_0  
regex                     2023.8.8                 pypi_0    pypi
requests                  2.31.0                   pypi_0    pypi
rich                      13.5.2                   pypi_0    pypi
ruamel-yaml               0.17.32                  pypi_0    pypi
ruamel-yaml-clib          0.2.7                    pypi_0    pypi
safetensors               0.3.3                    pypi_0    pypi
scikit-image              0.21.0                   pypi_0    pypi
scikit-learn              1.3.0                    pypi_0    pypi
scipy                     1.11.2                   pypi_0    pypi
sconf                     0.2.5                    pypi_0    pypi
sentencepiece             0.1.99                   pypi_0    pypi
setuptools                68.0.0          py310h06a4308_0  
six                       1.16.0                   pypi_0    pypi
sniffio                   1.3.0                    pypi_0    pypi
soupsieve                 2.5                      pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0  
starlette                 0.27.0                   pypi_0    pypi
starsessions              1.3.0                    pypi_0    pypi
sympy                     1.12                     pypi_0    pypi
threadpoolctl             3.2.0                    pypi_0    pypi
tifffile                  2023.8.30                pypi_0    pypi
timm                      0.5.4                    pypi_0    pypi
tk                        8.6.12               h1ccaba5_0  
tokenizers                0.13.3                   pypi_0    pypi
torch                     2.0.1                    pypi_0    pypi
torchmetrics              1.1.2                    pypi_0    pypi
torchvision               0.15.2                   pypi_0    pypi
tqdm                      4.66.1                   pypi_0    pypi
traitlets                 5.10.0                   pypi_0    pypi
transformers              4.33.1                   pypi_0    pypi
triton                    2.0.0                    pypi_0    pypi
typing-extensions         4.7.1                    pypi_0    pypi
tzdata                    2023.3                   pypi_0    pypi
urllib3                   2.0.4                    pypi_0    pypi
uvicorn                   0.23.2                   pypi_0    pypi
wcwidth                   0.2.6                    pypi_0    pypi
websocket-client          1.6.3                    pypi_0    pypi
websockets                11.0.3                   pypi_0    pypi
wheel                     0.38.4          py310h06a4308_0  
xxhash                    3.3.0                    pypi_0    pypi
xz                        5.4.2                h5eee18b_0  
yarl                      1.9.2                    pypi_0    pypi
zlib                      1.2.13               h5eee18b_0

thistleknot commented 1 year ago

https://huggingface.co/databricks/dolly-v2-12b/discussions/21 ' I think this can also arise as an "out of memory" error. Please, it's more helpful if people say how they are running this, and whether you've ruled out what is in previous comments!' I have 4GB of Vram. Maybe I should try a smaller document =D

On Fri, Sep 15, 2023 at 7:07 AM Lukas Blecher @.***> wrote:

I was not able to reproduce this. I have nvidia-cublas-cu11==11.10.3.66 . See my full env below Conda list

Name Version Build Channel

_libgcc_mutex 0.1 main _openmp_mutex 5.1 1_gnu aiohttp 3.8.5 pypi_0 pypi aiosignal 1.3.1 pypi_0 pypi albumentations 1.3.1 pypi_0 pypi annotated-types 0.5.0 pypi_0 pypi anyio 3.7.1 pypi_0 pypi arrow 1.2.3 pypi_0 pypi async-timeout 4.0.3 pypi_0 pypi attrs 23.1.0 pypi_0 pypi backoff 2.2.1 pypi_0 pypi beautifulsoup4 4.12.2 pypi_0 pypi blessed 1.20.0 pypi_0 pypi bzip2 1.0.8 h7b6447c_0 ca-certificates 2023.08.22 h06a4308_0 certifi 2023.7.22 pypi_0 pypi charset-normalizer 3.2.0 pypi_0 pypi click 8.1.7 pypi_0 pypi cmake 3.27.4.1 pypi_0 pypi croniter 1.4.1 pypi_0 pypi datasets 2.14.5 pypi_0 pypi dateutils 0.6.12 pypi_0 pypi deepdiff 6.5.0 pypi_0 pypi dill 0.3.7 pypi_0 pypi exceptiongroup 1.1.3 pypi_0 pypi fastapi 0.103.1 pypi_0 pypi filelock 3.12.4 pypi_0 pypi frozenlist 1.4.0 pypi_0 pypi fsspec 2023.6.0 pypi_0 pypi h11 0.14.0 pypi_0 pypi huggingface-hub 0.17.1 pypi_0 pypi idna 3.4 pypi_0 pypi imageio 2.31.3 pypi_0 pypi inquirer 3.1.3 pypi_0 pypi itsdangerous 2.1.2 pypi_0 pypi jinja2 3.1.2 pypi_0 pypi joblib 1.3.2 pypi_0 pypi lazy-loader 0.3 pypi_0 pypi ld_impl_linux-64 2.38 h1181459_1 levenshtein 0.21.1 pypi_0 pypi libffi 3.4.4 h6a678d5_0 libgcc-ng 11.2.0 h1234567_1 libgomp 11.2.0 h1234567_1 libstdcxx-ng 11.2.0 h1234567_1 libuuid 1.41.5 h5eee18b_0 lightning 2.0.9 pypi_0 pypi lightning-cloud 0.5.38 pypi_0 pypi lightning-utilities 0.9.0 pypi_0 pypi lit 16.0.6 pypi_0 pypi markdown-it-py 3.0.0 pypi_0 pypi markupsafe 2.1.3 pypi_0 pypi mdurl 0.1.2 pypi_0 pypi mpmath 1.3.0 pypi_0 pypi multidict 6.0.4 pypi_0 pypi multiprocess 0.70.15 pypi_0 pypi munch 4.0.0 pypi_0 pypi ncurses 6.4 h6a678d5_0 networkx 3.1 pypi_0 pypi nltk 3.8.1 pypi_0 pypi nougat-ocr 0.1.7 pypi_0 pypi numpy 1.25.2 pypi_0 pypi nvidia-cublas-cu11 11.10.3.66 pypi_0 pypi nvidia-cuda-cupti-cu11 11.7.101 pypi_0 pypi nvidia-cuda-nvrtc-cu11 11.7.99 pypi_0 pypi nvidia-cuda-runtime-cu11 11.7.99 pypi_0 pypi nvidia-cudnn-cu11 8.5.0.96 pypi_0 pypi nvidia-cufft-cu11 10.9.0.58 pypi_0 pypi nvidia-curand-cu11 10.2.10.91 pypi_0 pypi nvidia-cusolver-cu11 11.4.0.1 pypi_0 pypi nvidia-cusparse-cu11 11.7.4.91 pypi_0 pypi nvidia-nccl-cu11 2.14.3 pypi_0 pypi nvidia-nvtx-cu11 11.7.91 pypi_0 pypi opencv-python-headless 4.8.0.76 pypi_0 pypi openssl 3.0.10 h7f8727e_2 ordered-set 4.1.0 pypi_0 pypi orjson 3.9.7 pypi_0 pypi packaging 23.1 pypi_0 pypi pandas 2.1.0 pypi_0 pypi pillow 10.0.0 pypi_0 pypi pip 23.2.1 py310h06a4308_0 psutil 5.9.5 pypi_0 pypi pyarrow 13.0.0 pypi_0 pypi pydantic 2.1.1 pypi_0 pypi pydantic-core 2.4.0 pypi_0 pypi pygments 2.16.1 pypi_0 pypi pyjwt 2.8.0 pypi_0 pypi pymupdf 1.23.3 pypi_0 pypi pymupdfb 1.23.3 pypi_0 pypi python 3.10.13 h955ad1f_0 python-dateutil 2.8.2 pypi_0 pypi python-editor 1.0.4 pypi_0 pypi python-levenshtein 0.21.1 pypi_0 pypi python-multipart 0.0.6 pypi_0 pypi pytorch-lightning 2.0.9 pypi_0 pypi pytz 2023.3.post1 pypi_0 pypi pywavelets 1.4.1 pypi_0 pypi pyyaml 6.0.1 pypi_0 pypi qudida 0.0.4 pypi_0 pypi rapidfuzz 3.3.0 pypi_0 pypi readchar 4.0.5 pypi_0 pypi readline 8.2 h5eee18b_0 regex 2023.8.8 pypi_0 pypi requests 2.31.0 pypi_0 pypi rich 13.5.2 pypi_0 pypi ruamel-yaml 0.17.32 pypi_0 pypi ruamel-yaml-clib 0.2.7 pypi_0 pypi safetensors 0.3.3 pypi_0 pypi scikit-image 0.21.0 pypi_0 pypi scikit-learn 1.3.0 pypi_0 pypi scipy 1.11.2 pypi_0 pypi sconf 0.2.5 pypi_0 pypi sentencepiece 0.1.99 pypi_0 pypi setuptools 68.0.0 py310h06a4308_0 six 1.16.0 pypi_0 pypi sniffio 1.3.0 pypi_0 pypi soupsieve 2.5 pypi_0 pypi sqlite 3.41.2 h5eee18b_0 starlette 0.27.0 pypi_0 pypi starsessions 1.3.0 pypi_0 pypi sympy 1.12 pypi_0 pypi threadpoolctl 3.2.0 pypi_0 pypi tifffile 2023.8.30 pypi_0 pypi timm 0.5.4 pypi_0 pypi tk 8.6.12 h1ccaba5_0 tokenizers 0.13.3 pypi_0 pypi torch 2.0.1 pypi_0 pypi torchmetrics 1.1.2 pypi_0 pypi torchvision 0.15.2 pypi_0 pypi tqdm 4.66.1 pypi_0 pypi traitlets 5.10.0 pypi_0 pypi transformers 4.33.1 pypi_0 pypi triton 2.0.0 pypi_0 pypi typing-extensions 4.7.1 pypi_0 pypi tzdata 2023.3 pypi_0 pypi urllib3 2.0.4 pypi_0 pypi uvicorn 0.23.2 pypi_0 pypi wcwidth 0.2.6 pypi_0 pypi websocket-client 1.6.3 pypi_0 pypi websockets 11.0.3 pypi_0 pypi wheel 0.38.4 py310h06a4308_0 xxhash 3.3.0 pypi_0 pypi xz 5.4.2 h5eee18b_0 yarl 1.9.2 pypi_0 pypi zlib 1.2.13 h5eee18b_0

— Reply to this email directly, view it on GitHub https://github.com/facebookresearch/nougat/issues/70#issuecomment-1721343454, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABHKKOTJH5DDPEPWRSXCW6DX2ROL5ANCNFSM6AAAAAA4ZEN324 . You are receiving this because you authored the thread.Message ID: @.***>

thistleknot commented 1 year ago

that wasn't it... it throws that error immediately with a smaller document I'm going to try docker

lukas-blecher commented 1 year ago

maybe try batch size 1 nougat -b 1 file.pdf

thistleknot commented 1 year ago

same. I'll try in docker, and let you know. I have a rocky linux 9 machine, but it's training a model atm. So I only have these 4GB to play with. However, I have lxc and docker, both with gpu pass through, so I should be able to test this out in a container, and 95% of the time (so far 100% with cuda) I've been able to test in a container (for example, I've had woes with gpt4all in arch)

On Fri, Sep 15, 2023 at 9:13 AM Lukas Blecher @.***> wrote:

maybe try batch size 1 nougat -b 1 file.pdf

— Reply to this email directly, view it on GitHub https://github.com/facebookresearch/nougat/issues/70#issuecomment-1721530048, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABHKKOUQPWRRNXK3LUGIMW3X2R5DFANCNFSM6AAAAAA4ZEN324 . You are receiving this because you authored the thread.Message ID: @.***>

willocho commented 5 months ago

Were you able to solve this issue? I'm running into the same issue and I think it might be a mismatch between the CUDA version installed on my machine and the one used to build pytorch.

facebookresearch / nougat

RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmStridedBatchedExFix #70

Name Version Build Channel