Closed: LWprogramming closed this issue 1 year ago
Here's a list of installed packages, in case it ends up being relevant:
pip list
Package Version
------------------------ ----------
accelerate 0.20.3
antlr4-python3-runtime 4.8
audiolm-pytorch 1.2.1
beartype 0.14.1
bitarray 2.7.6
blessed 1.20.0
certifi 2023.5.7
cffi 1.15.1
charset-normalizer 3.1.0
cmake 3.26.4
colorama 0.4.6
Cython 0.29.35
einops 0.6.1
ema-pytorch 0.2.3
encodec 0.1.1
fairseq 0.12.2
filelock 3.12.2
fsspec 2023.6.0
gpustat 1.1
huggingface-hub 0.15.1
hydra-core 1.0.7
idna 3.4
Jinja2 3.1.2
joblib 1.2.0
lion-pytorch 0.1.2
lit 16.0.6
local-attention 1.8.6
lxml 4.9.2
MarkupSafe 2.1.3
mpmath 1.3.0
networkx 3.1
numpy 1.25.0
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-ml-py 11.525.112
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
omegaconf 2.0.6
packaging 23.1
pip 23.0.1
portalocker 2.7.0
protobuf 4.23.3
psutil 5.9.5
pycparser 2.21
PyYAML 6.0
regex 2023.6.3
requests 2.31.0
sacrebleu 2.3.1
safetensors 0.3.1
scikit-learn 0.24.0
scipy 1.11.0
sentencepiece 0.1.99
setuptools 65.5.0
six 1.16.0
sympy 1.12
tabulate 0.9.0
tensorboardX 2.6.1
threadpoolctl 3.1.0
tokenizers 0.13.3
torch 2.0.1
torchaudio 2.0.2
tqdm 4.65.0
transformers 4.30.2
triton 2.0.0
typing_extensions 4.6.3
urllib3 2.0.3
vector-quantize-pytorch 1.6.24
wcwidth 0.2.6
wheel 0.40.0
I'm not quite sure when exactly this issue first appeared, but my best guess is that it was introduced by a dependency a few versions back. I'll keep this updated.
@LWprogramming I think it has to do with scikit-learn.
I may try to redo the k-means logic in PyTorch (or find a suitable library as a substitute) in the future; scikit-learn is too hefty a dependency for such simple logic.
Actually, since we aren't training the HuBERT k-means, it should be straightforward to remove scikit-learn. We just need to extract the cluster centers from wherever they are stored.
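For context, a minimal sketch of what that could look like: pull the centers out of the serialized scikit-learn KMeans object and do nearest-centroid assignment in plain PyTorch. Only cluster_centers_ is a real scikit-learn attribute; the checkpoint path and function name below are illustrative assumptions, not this library's actual API.

```python
# Minimal sketch, assuming the k-means checkpoint is a joblib-serialized
# sklearn KMeans object. Everything except cluster_centers_ is hypothetical.
import joblib
import torch

kmeans = joblib.load("hubert_kmeans.bin")            # hypothetical path
centers = torch.from_numpy(kmeans.cluster_centers_)  # (num_clusters, dim)

def quantize(features: torch.Tensor) -> torch.Tensor:
    # features: (batch, seq_len, dim) -> cluster ids: (batch, seq_len)
    c = centers.to(features)  # match device and dtype of the features
    # squared euclidean distance to every center; nearest center wins
    dists = ((features.unsqueeze(-2) - c) ** 2).sum(dim=-1)
    return dists.argmin(dim=-1)
```

With the centers extracted once, sklearn's predict() is never needed at inference time, which is why the dependency can be dropped.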
@LWprogramming error is gone! :laughing:
Hm, that's odd; I'm still seeing the warning messages. Are you getting just the loss reported, like this?
0: loss: 6.498922824859619
0: valid loss 0.8049952983856201
0: saving model to folder
1: loss: ... etc
Hmm, I'm no longer seeing "Computing label assignment and total inertia".
Are we referring to the same warning message?
@LWprogramming can you run pip list | grep audiolm-pytorch
and make sure it is at 1.2.11? I just double-checked by reverting, and the warning message reappeared.
@LWprogramming oh, I see, the error messages are unrelated.
Well, you can train without worrying; we aren't using scikit-learn at all other than for extracting the cluster centers.
When training, I get this error in the logs, so I'm opening an issue in case anyone else has seen it come up:
OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.
I'm not sure what this warning means, but training does seem to keep going. I also see output like the following, although I'm not sure if it's related:
Computing label assignment and total inertia
with many repetitions of that phrase.
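A note on the OpenBLAS warning itself (my own suggestion, not something confirmed in this thread): it typically appears when OpenBLAS's built-in pthreads pool runs inside an OpenMP parallel region, such as the one PyTorch spawns. Pinning OpenBLAS to a single thread is a common mitigation; a sketch, assuming it is done before the heavy libraries load:

```python
# Hedged workaround sketch: avoid the OpenBLAS/OpenMP clash by pinning
# OpenBLAS to one thread. The variable must be set before numpy/torch
# first load OpenBLAS, so it goes at the very top of the training script
# (exporting OPENBLAS_NUM_THREADS=1 in sbatch.sh would work as well).
import os
os.environ.setdefault("OPENBLAS_NUM_THREADS", "1")

import torch  # heavy imports only after the environment variable is set
```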
Replication:
audiolm_pytorch_demo_laion.py
sbatch sbatch.sh