Docker build error with fairseq - feb5f07

osddeitf commented 3 years ago

I pulled the latest commit of this repo, tried to run docker build, and got this error:

WARNING: You are using pip version 19.3; however, version 20.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Traceback (most recent call last):
  File "examples/speech_recognition/infer.py", line 17, in <module>
    import editdistance
ModuleNotFoundError: No module named 'editdistance'
The command '/bin/sh -c pip install --editable ./ && python examples/speech_recognition/infer.py --help && python examples/wav2vec/recognize.py --help' returned a non-zero code: 1
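A missing dependency like this can be caught before the smoke test runs. The following is only a minimal sketch, assuming the module name and the pip package name are the same (true for editdistance, not for every package); the helper name missing_modules is hypothetical and not part of this repo.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "editdistance" comes from the traceback above; extend the list as needed.
    missing = missing_modules(["editdistance"])
    if missing:
        # Assumes the pip package name matches the module name.
        print("pip install " + " ".join(missing))
```

Running this inside the image before the infer.py --help step would print the install command for anything that is absent.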

After fixing that by adding pip install editdistance, I ran into this:

/usr/local/lib/python3.7/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at  /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "examples/wav2vec/recognize.py", line 10, in <module>
    from fairseq.models.wav2vec.wav2vec2_asr import base_architecture, Wav2VecEncoder
ImportError: cannot import name 'base_architecture' from 'fairseq.models.wav2vec.wav2vec2_asr' (/app/fairseq/fairseq/models/wav2vec/wav2vec2_asr.py)
The command '/bin/sh -c pip install --editable ./ && python examples/speech_recognition/infer.py --help && python examples/wav2vec/recognize.py --help' returned a non-zero code: 1

I tried wav2letter.Dockerfile, but still got the above error. Environment:

Docker 20.10.
Amazon EC2 Ubuntu 18.04 instance (Linux 5.4.0-1029-aws).

I think torch still requires a GPU to install, or a newer version of torch requires it. Have you run into this? Or should we update the Dockerfile(s)?

osddeitf commented 3 years ago

Oh, sorry for the noise. I hadn't read the error output thoroughly; forget the CUDA part. The fairseq code itself has changed since your last commit. Could you point out which fairseq commit you used? I also think the Dockerfile should be fixed to check out a specific commit hash. Thanks in advance.

osddeitf commented 3 years ago

I inspected recognize.py and saw this:

from fairseq.models.wav2vec.wav2vec2_asr import base_architecture, Wav2VecEncoder

This import is there even though base_architecture is defined later at recognize.py:25, as commit 3a1df30412869ab3a031629d6c4e7edd88f17e9e added this function. So I'm going to remove it from the import and test. If it works, I'll make a PR.
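An alternative to deleting the import outright is to tolerate both fairseq layouts. This is only a sketch of that idea, not the change that was made in this repo; import_or_default is a hypothetical helper, and local_base_architecture in the usage comment stands for the function defined in recognize.py.

```python
import importlib


def import_or_default(module_name, attr_name, default=None):
    """Return module_name.attr_name, or `default` if the module or the name is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module not installed, or it failed while importing.
        return default
    return getattr(module, attr_name, default)


# Hypothetical usage inside recognize.py:
# base_architecture = import_or_default(
#     "fairseq.models.wav2vec.wav2vec2_asr",
#     "base_architecture",
#     default=local_base_architecture,
# )
```

The trade-off is that a silent fallback can hide a real version mismatch, so pinning the fairseq commit is still the more predictable option.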

osddeitf commented 3 years ago

After fixing the above import issue, I hit another error:

args.no_pretrained_weights = getattr(args, "no_pretrained_weights", False)
AttributeError: 'NoneType' object has no attribute 'no_pretrained_weights'

My fine-tuned model's checkpoint has no args either (the value is None), so the following code in recognize.py does not work.

def load_model(model_path, target_dict):
    w2v = torch.load(model_path)
    model = Wav2VecCtc.build_model(w2v["args"], target_dict)
    model.load_state_dict(w2v["model"], strict=True)

    return [model]

The problem is that I fine-tuned the model with a newer version of fairseq, which uses hydra and might not save the training args. I'm trying to write my own version of recognize.py that uses hydra to load the same configuration used for fine-tuning, so that it might work.
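The failure above comes from passing w2v["args"] straight to build_model when it is None. A loader could first decide which configuration the checkpoint actually carries. The sketch below works on a plain dict so it runs without torch; the "cfg" key is an assumption about where hydra-based fairseq stores its configuration and should be checked against the fairseq version in use. select_model_config is a hypothetical name.

```python
def select_model_config(state):
    """Return (kind, config) from a loaded checkpoint dict.

    kind is "args" for older argparse-style checkpoints and "cfg" for
    hydra-style ones. The "cfg" key is an assumption, not taken from the thread.
    """
    args = state.get("args")
    if args is not None:
        return "args", args
    cfg = state.get("cfg")
    if cfg is not None:
        return "cfg", cfg
    raise KeyError("checkpoint has neither 'args' nor 'cfg'; cannot rebuild the model")
```

With this in place, load_model could branch on the returned kind instead of failing with an AttributeError deep inside the model code.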

raja1196 commented 3 years ago

Can you make a pull request as a separate recognize.py file once you are able to run with hydra?

osddeitf commented 3 years ago

Now that you mention it, I've managed to run it with hydra, so I'll make one.

osddeitf commented 3 years ago

@raja1196 thanks for submitting a pull request removing the base_architecture import for me. I'll close this issue.

osddeitf commented 3 years ago

@raja1196 I've opened a PR, could you check it out and test for me? #8

raja1196 commented 3 years ago

I will test it out, but it looks like your PR is merged.

loretoparisi commented 3 years ago

I just merged it. Please test; if it does not work, we can revert.

osddeitf commented 3 years ago

I'm reproducing all the steps needed to get it to work, and found that the Dockerfile was not actually working. I fixed it as below:

    RUN git clone https://github.com/pytorch/fairseq --depth=1 && cd fairseq && \
+++     git fetch origin ac11107ed41cb06a758af850373c239309d1c961 && \
        git checkout ac11107ed41cb06a758af850373c239309d1c961 && \
        pip install --editable .

With Docker caching on my compute instance, it worked without git fetch, but when I rebuilt with --no-cache, it failed, so I ended up with the change above. @loretoparisi it's dangerous that you merged my PR so soon; I'd have preferred to test it before the merge. Anyway, I'll make sure everything works correctly and then make a new PR.
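The reason git fetch is needed: a clone made with --depth=1 contains only the tip of the default branch, so checking out an older commit fails until that commit has been fetched. The same pinning can be written without cloning the tip at all. The sketch below only builds the git command lines (both function names are hypothetical) and assumes the server allows fetching a commit by its hash, as GitHub does for reachable commits.

```python
import subprocess


def pinned_checkout_commands(repo_url, commit, workdir="fairseq"):
    """Build git commands that fetch exactly one commit and check it out."""
    return [
        ["git", "init", workdir],
        ["git", "-C", workdir, "remote", "add", "origin", repo_url],
        # Fetch only the pinned commit, without any history.
        ["git", "-C", workdir, "fetch", "--depth", "1", "origin", commit],
        ["git", "-C", workdir, "checkout", "FETCH_HEAD"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Repository URL and commit hash are the ones quoted in the Dockerfile snippet above.
    for cmd in pinned_checkout_commands(
        "https://github.com/pytorch/fairseq",
        "ac11107ed41cb06a758af850373c239309d1c961",
    ):
        print(" ".join(cmd))
```

Printing the commands rather than running them keeps the example safe to try; run_all executes them when that is wanted.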

osddeitf commented 3 years ago

I opened #9 to fix #8; it should be working now.

PS: I would be glad if you added me to the Contributors section of your README :).

loretoparisi commented 3 years ago

@osddeitf yes, I definitely will. Thank you.

loretoparisi / wave2vec-recognize-docker, issue #5