microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
MIT License

Unable to setup Kosmos-2 with Docker #1241

Closed: OrianeN closed this issue 1 year ago

OrianeN commented 1 year ago

I've been trying to set up Kosmos-2 as described in - but it seems that dependency conflicts are preventing a successful installation.

I've created a small Dockerfile (though the first time I also tried the given docker run command):

# This Dockerfile is meant to reproduce the recommended installation for kosmos-2


RUN apt-get update && apt-get install -q -y ${PACKAGES}
RUN python -m pip install --upgrade pip setuptools

RUN git clone
WORKDIR /workspace/unilm/kosmos-2
RUN bash

My build command was nohup docker build -t kosmos2_img . &> docker_build_kosmos2.log &

Yet in both cases I see the following dependency conflicts at the end of the bash script (whether it is run inside the container or during the build):

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
numba 0.56.2 requires llvmlite<0.40,>=0.39.0dev0, but you have llvmlite 0.36.0 which is incompatible.
numba 0.56.2 requires setuptools<60, but you have setuptools 68.0.0 which is incompatible.
onnx 1.12.0 requires protobuf<=3.20.1,>=3.12.2, but you have protobuf 3.20.3 which is incompatible.
scipy 1.6.3 requires numpy<1.23.0,>=1.16.5, but you have numpy 1.23.0 which is incompatible.
tensorboard 2.10.1 requires protobuf<3.20,>=3.9.2, but you have protobuf 3.20.3 which is incompatible.
Successfully installed aiofiles-23.1.0 aiohttp-3.8.5 aiosignal-1.3.1 altair-5.0.1 async-timeout-4.0.2 confection-0.1.1 fastapi-0.101.0 ffmpy-0.3.1 frozenlist-1.4.0 gradio-3.37.0 gradio-client-0.3.0 h11-0.14.0 httpcore-0.17.3 httpx-0.24.1 huggingface-hub-0.16.4 linkify-it-py-1.0.3 multidict-6.0.4 numpy-1.23.0 orjson-3.9.3 pathy-0.10.2 pydantic-1.10.11 pydub-0.25.1 python-multipart-0.0.6 semantic-version-2.10.0 sentencepiece-0.1.99 spacy-3.6.0 spacy-legacy-3.0.12 srsly-2.4.7 starlette-0.27.0 thinc-8.1.10 tiktoken-0.4.0 typing-extensions-4.7.1 uc-micro-py-1.0.2 uvicorn-0.23.2 websockets-11.0.3 yarl-1.9.2
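For anyone triaging a similar log, here is a minimal sketch that pulls the conflict lines out of pip's output so they are easier to compare. It assumes the "X requires Y, but you have Z which is incompatible." wording shown above; `parse_conflicts` and `CONFLICT_RE` are hypothetical names made up for this illustration, not part of pip.

```python
import re

# Matches lines in the wording pip printed above, for example:
#   numba 0.56.2 requires setuptools<60, but you have setuptools 68.0.0 which is incompatible.
# Assumption: the wording is stable; other pip versions may phrase it differently.
CONFLICT_RE = re.compile(
    r"^(?P<pkg>\S+) (?P<pkg_version>\S+) requires (?P<requirement>.+), "
    r"but you have (?P<dep>\S+) (?P<dep_version>\S+) which is incompatible\.$"
)


def parse_conflicts(log_text):
    """Return one dict per conflict line found in the given pip output."""
    conflicts = []
    for line in log_text.splitlines():
        match = CONFLICT_RE.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts


# The five conflict lines reported in this issue.
SAMPLE_LOG = """\
numba 0.56.2 requires llvmlite<0.40,>=0.39.0dev0, but you have llvmlite 0.36.0 which is incompatible.
numba 0.56.2 requires setuptools<60, but you have setuptools 68.0.0 which is incompatible.
onnx 1.12.0 requires protobuf<=3.20.1,>=3.12.2, but you have protobuf 3.20.3 which is incompatible.
scipy 1.6.3 requires numpy<1.23.0,>=1.16.5, but you have numpy 1.23.0 which is incompatible.
tensorboard 2.10.1 requires protobuf<3.20,>=3.9.2, but you have protobuf 3.20.3 which is incompatible.
"""

if __name__ == "__main__":
    for c in parse_conflicts(SAMPLE_LOG):
        print(f"{c['dep']} {c['dep_version']} violates {c['requirement']} (needed by {c['pkg']})")
```

Grouping the result by `dep` makes it visible that protobuf 3.20.3 is rejected by both onnx and tensorboard, each with a different upper bound.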

I still tried to launch the Gradio demo with bash inside the created container, but I got the following error:

$ bash
/opt/conda/lib/python3.8/site-packages/torch/distributed/ FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See for
further instructions

WARNING:root:Pytorch pre-release version 1.13.0a0+d0d6b1f - assuming intent to test it
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/xformers/ops/fmha/", line 17, in <module>
    from flash_attn.flash_attn_triton import (
ModuleNotFoundError: No module named 'flash_attn'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "demo/", line 12, in <module>
    import unilm
  File "/workspace/unilm/kosmos-2/./unilm/", line 1, in <module>
    import unilm.models
  File "/workspace/unilm/kosmos-2/./unilm/models/", line 6, in <module>
    import_models(models_dir, "unilm.models")
  File "/opt/conda/lib/python3.8/site-packages/fairseq/models/", line 217, in import_models
    importlib.import_module(namespace + "." + model_name)
  File "/opt/conda/lib/python3.8/importlib/", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/workspace/unilm/kosmos-2/./unilm/models/", line 37, in <module>
    from unilm.models.gpt import GPTmodel, GPTModelConfig
  File "/workspace/unilm/kosmos-2/./unilm/models/", line 39, in <module>
    from torchscale.architecture.decoder import Decoder
  File "/opt/conda/lib/python3.8/site-packages/torchscale/architecture/", line 12, in <module>
    from torchscale.architecture.utils import init_bert_params
  File "/opt/conda/lib/python3.8/site-packages/torchscale/architecture/", line 6, in <module>
    from torchscale.component.multihead_attention import MultiheadAttention
  File "/opt/conda/lib/python3.8/site-packages/torchscale/component/", line 12, in <module>
    from xformers.ops import memory_efficient_attention, LowerTriangularMask, MemoryEfficientAttentionCutlassOp
  File "/opt/conda/lib/python3.8/site-packages/xformers/ops/", line 8, in <module>
    from .fmha import (
  File "/opt/conda/lib/python3.8/site-packages/xformers/ops/fmha/", line 10, in <module>
    from . import cutlass, decoder, flash, small_k, triton
  File "/opt/conda/lib/python3.8/site-packages/xformers/ops/fmha/", line 39, in <module>
    flash_attn = import_module_from_path(
  File "/opt/conda/lib/python3.8/site-packages/xformers/ops/fmha/", line 36, in import_module_from_path
  File "<frozen importlib._bootstrap_external>", line 839, in exec_module
  File "<frozen importlib._bootstrap_external>", line 975, in get_code
  File "<frozen importlib._bootstrap_external>", line 1032, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/unilm/kosmos-2/third_party/flash-attention/flash_attn/'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 692) of binary: /opt/conda/bin/python
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.8/", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/", line 195, in <module>
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/", line 191, in main
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/", line 176, in launch
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/", line 753, in run
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/", line 246, in launch_agent
    raise ChildFailedError(
demo/ FAILED
Root Cause (first observed failure):
  time      : 2023-08-08_08:31:21
  host      : e01a31a3a92c
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 692)
  error_file: <N/A>
  traceback : To enable traceback see:
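As an aside on the FutureWarning at the top of this log: it asks scripts to read the local rank from the `LOCAL_RANK` environment variable rather than from a `--local_rank` argument. A minimal sketch of that pattern follows. `resolve_local_rank` is a hypothetical helper written for this illustration; I have not checked how the Kosmos-2 demo script actually handles it.

```python
import argparse
import os


def resolve_local_rank(argv=None, environ=None):
    """Prefer LOCAL_RANK from the environment, then --local_rank, then 0."""
    environ = os.environ if environ is None else environ

    parser = argparse.ArgumentParser()
    # Accept both spellings so the script works with old and new launchers.
    parser.add_argument("--local_rank", "--local-rank", type=int, default=None)
    # parse_known_args ignores any other arguments the real script may take.
    args, _unknown = parser.parse_known_args(argv)

    if "LOCAL_RANK" in environ:
        return int(environ["LOCAL_RANK"])
    if args.local_rank is not None:
        return args.local_rank
    return 0  # single-process fallback


if __name__ == "__main__":
    print("local rank:", resolve_local_rank())
```

This only concerns the warning. The actual failure in the log is the missing `flash_attn` module.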

I have tried with and without the --privileged argument in the docker run command. By the way, I don't understand why such an insecure argument would be necessary for Kosmos-2.

I'm running Docker on Ubuntu 18.04 (Docker version 20.10.24, build 297e128).

OrianeN commented 1 year ago

(Edited) Running pip install flash_attn inside the created container solved the issue, so I'm going to close this issue.
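For reference, a minimal sketch of a pre-flight check that could be run inside the container before launching the demo. The module list is an assumption taken from the traceback above (`torch`, `xformers`, `flash_attn`), and `missing_modules` and `pip_install_command` are hypothetical helper names. Note that `importlib.util.find_spec` only reports whether a module can be located, not whether importing it will succeed.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pip_install_command(names):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", *names]


if __name__ == "__main__":
    # Assumption: these are the modules the demo needs, based on the traceback.
    required = ["torch", "xformers", "flash_attn"]
    missing = missing_modules(required)
    if missing:
        print("Missing:", ", ".join(missing))
        print("Try:", " ".join(pip_install_command(missing)))
    else:
        print("All required modules can be located.")
```

Using `sys.executable -m pip` rather than a bare `pip` helps ensure the package lands in the same Python environment (here `/opt/conda`) that runs the demo.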

BrainWWW commented 1 year ago

> Running apt install flash_attn inside the created container solved the issue, so I'm going to close this issue.

Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package flash_attn

Why can't I find the package 'flash_attn', even after updating the sources with apt-get update?

pengzhiliang commented 1 year ago

Hi, @BrainWWW. Thanks for the attention.

I used to run the following on the HF Space:


ENV MPLCONFIGDIR=/tmp/matplotlib-config

COPY . .

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

where requirements.txt needs to be updated to match the current dependencies.

Judging from your feedback, it seems some errors are raised during the installation of xformers. You can try the solution in

Alternatively, a Hugging Face version is also available. You can find it in our README.

Hope this helps.

OrianeN commented 1 year ago

> > Running apt install flash_attn inside the created container solved the issue, so I'm going to close this issue.
>
> Reading package lists... Done
> Building dependency tree
> Reading state information... Done
> E: Unable to locate package flash_attn
>
> Why can't I find the package 'flash_attn', even after updating the sources with apt-get update?

I'm really sorry to have misled you. I'm almost sure I actually ran pip install flash_attn and not apt install...

I will correct my previous post to avoid confusing other people reading this issue.