hlt-mt / FBK-fairseq

Repository containing the open source code of works published at the FBK MT unit.

Can't get EDATT to work #3

Closed RomanKoshkin closed 1 year ago

RomanKoshkin commented 1 year ago

I cloned the FBK-fairseq repo (https://github.com/hlt-mt/FBK-fairseq.git), installed it following the installation instructions, and tried to run the EDATT model on the SiMT task as described in the EDATT instructions, but I get this error:

Traceback (most recent call last):
  File "/work/miniconda/envs/fbk/bin/simuleval", line 8, in <module>
    sys.exit(main())
  File "/work/miniconda/envs/fbk/lib/python3.10/site-packages/simuleval/cli.py", line 165, in main
    _main(args.client_only)
  File "/work/miniconda/envs/fbk/lib/python3.10/site-packages/simuleval/cli.py", line 180, in _main
    _, agent_cls = find_agent_cls(args)
  File "/work/miniconda/envs/fbk/lib/python3.10/site-packages/simuleval/utils/agent_finder.py", line 64, in find_agent_cls
    spec.loader.exec_module(agent_modules)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/work/FBK-fairseq/examples/speech_to_text/simultaneous_translation/agents/simul_offline_edatt.py", line 18, in <module>
    from examples.speech_to_text.simultaneous_translation.agents.base_simulst_agent import FairseqSimulSTAgent
ModuleNotFoundError: No module named 'examples'

What am I doing wrong?

sarapapi commented 1 year ago

Hi, it looks like an installation error. Can you please tell me which version of SimulEval you are using and the exact code you are executing? Thanks. I suggest installing SimulEval at the commit before v1.1.0 (https://github.com/facebookresearch/SimulEval/commit/03bc52105b47b42708bed0184ebc545651e808cf), since the tool's versioning is confusing right now: if you install it with pip, it appears to be v1.0.2 but is actually v1.1.0.
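
One way to answer the version question is to ask Python's packaging metadata which version it has recorded. The following is a minimal sketch, assuming the packages are registered under the distribution names `simuleval` and `fairseq` (both names are assumptions, and `installed_version` is a hypothetical helper, not part of either tool).

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the version string recorded for a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The distribution names below are assumptions, not confirmed by the thread.
    for name in ("simuleval", "fairseq"):
        print(f"{name}: {installed_version(name)}")
```

Given the versioning problem described above, the reported number alone may not tell v1.0.2 and v1.1.0 apart, so the checked-out commit hash is likely the more reliable indicator.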

RomanKoshkin commented 1 year ago

Let me try, but the error looks like it has to do with fairseq, not SimulEval. And thanks for the quick reply! 🙏

RomanKoshkin commented 1 year ago

Okay, I've installed SimulEval from that commit, but I get the same error. Perhaps more context would help.

1) I install FBK-fairseq (clone, cd FBK-fairseq, pip install -e .)
2) then clone SimulEval, cd SimulEval, pip install -e .
3) then set the environment variables:

export FBK_FAIRSEQ_ROOT=/FBK-fairseq
export SRC_LIST_OF_AUDIO=/S2ST/evaluation/SOURCES/src_ted_new_tst_100.de
export TGT_FILE=/S2ST/evaluation/OFFLINE_TARGETS/tgt_ted_new_tst_100.de
export DATA_ROOT=/FBK-fairseq/data-bin
export ALPHA=0.2
export OUT_DIR=/FBK-fairseq/data-bin
export PORT=2001

4) then run this command:

simuleval \
    --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/simul_offline_edatt.py \
    --source ${SRC_LIST_OF_AUDIO} \
    --target ${TGT_FILE} \
    --data-bin ${DATA_ROOT} \
    --config ${DATA_ROOT}/config_simul.yaml \
    --model-path ${DATA_ROOT}/checkpoint_avg7.pt \
    --extract-attn-from-layer 3 \
    --frame-num 2 --attn-threshold $ALPHA \
    --speech-segment-factor 8 \
    --output ${OUT_DIR} \
    --port ${PORT} \
    --gpu \
    --scores

And it gives me the error above. What do you think the problem might be? Also, I cannot download source_vocab.txt (it says "access denied", although the other common files download without a problem). I suspect my problem is not caused by that missing file, but could you please share it too?

For simplicity, I put all the files (common files: gcmvn.npz, source_vocab.model, the en-de checkpoint, config_simul.yaml, target_vocab.model, target_vocab.txt) in the same location.
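
Before launching the command, a quick check that every path it references actually exists can rule out simple setup mistakes. This is only a sketch: `collect_paths` and `missing_paths` are hypothetical helpers, and the file names are copied from the command and environment variables above.

```python
import os
from pathlib import Path


def collect_paths(env):
    """Build the paths the simuleval command relies on from the exported variables."""
    root = env.get("FBK_FAIRSEQ_ROOT", "")
    data = env.get("DATA_ROOT", "")
    agent = "examples/speech_to_text/simultaneous_translation/agents/simul_offline_edatt.py"
    return {
        "agent": os.path.join(root, agent),
        "source": env.get("SRC_LIST_OF_AUDIO", ""),
        "target": env.get("TGT_FILE", ""),
        "config": os.path.join(data, "config_simul.yaml"),
        "checkpoint": os.path.join(data, "checkpoint_avg7.pt"),
    }


def missing_paths(paths):
    """Return the entries whose path is empty or does not exist on disk."""
    return {name: p for name, p in paths.items() if not p or not Path(p).exists()}


if __name__ == "__main__":
    for name, path in missing_paths(collect_paths(os.environ)).items():
        print(f"missing {name}: {path!r}")
```

Run in the same shell as the exports, it prints one line per path that cannot be found and nothing if everything is in place.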

Maybe the latest commit of FBK-fairseq is broken and I need to use a different one to get things working?

RomanKoshkin commented 1 year ago

I have fixed the problem. I still don't know what's wrong, but it appears to be with the imports in fairseq/__init__.py:


import fairseq.criterions  # noqa
import fairseq.models  # noqa
import fairseq.modules  # noqa
# Commenting out these two lines fixes the issue:
# import fairseq.optim  # noqa
# import fairseq.optim.lr_scheduler  # noqa
import fairseq.pdb  # noqa
import fairseq.scoring  # noqa
import fairseq.tasks  # noqa
import fairseq.token_generation_constraints  # noqa
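
Rather than commenting lines out, it may help to import the two modules one at a time and look at the full traceback each one raises, since that usually shows which nested import actually fails. A minimal sketch, with `try_imports` as a hypothetical helper and the module names taken from the snippet above:

```python
import importlib
import traceback


def try_imports(module_names):
    """Import each module in turn; return {name: formatted traceback} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:
            # Keep the whole traceback so the innermost failing import is visible.
            failures[name] = traceback.format_exc()
    return failures


if __name__ == "__main__":
    names = ["fairseq.optim", "fairseq.optim.lr_scheduler"]
    for name, tb in try_imports(names).items():
        print(f"--- {name} failed to import ---")
        print(tb)
```

If nothing is printed, both modules import cleanly in the current environment.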

BUT

I still can't run the model, because when I try to download spm_unigram.en.txt, I get

[image: screenshot]

Without it, nothing works.

Can you please share this file?

sarapapi commented 1 year ago

Hi, I tried downloading the repository from scratch, but I don't get any import errors. In theory, you shouldn't need to comment out those two lines, especially since they are unrelated to the examples module (and to the SimulST code that I implemented). Can you please try using only export PYTHONPATH=${FBK_FAIRSEQ_ROOT} and Python 3.8? I think the problem is related to how the imports are resolved, and the Python version might be the cause.
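
To see whether the PYTHONPATH suggestion changes anything, one can check where Python would load the examples package from, with and without the repository root on the search path. This is a sketch under assumptions: `locate` is a hypothetical helper, and inserting into `sys.path` only approximates what exporting PYTHONPATH does.

```python
import importlib
import importlib.util
import os
import sys


def locate(module_name, extra_path=None):
    """Return where a top-level module would be imported from, or None if not found.

    If extra_path is given, it is prepended to sys.path first (a side effect),
    which roughly mimics adding that directory to PYTHONPATH.
    """
    if extra_path and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    importlib.invalidate_caches()
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    if spec.origin:
        return spec.origin  # regular module or package
    # Namespace package (a directory without __init__.py): report its directories.
    return str(list(spec.submodule_search_locations or []))


if __name__ == "__main__":
    root = os.environ.get("FBK_FAIRSEQ_ROOT")
    print("without repo root:", locate("examples"))
    print("with repo root:   ", locate("examples", extra_path=root))
```

A result of None in the first line and a path inside the repository in the second would suggest the error comes from the search path rather than from fairseq itself.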

I also changed the link, thanks for reporting. Moreover, I updated the code to work with SimulEval 1.1.0; you can find it here.

sarapapi commented 1 year ago

I am closing this as it has been stale for a while. Feel free to reopen if anything else is needed. Thanks.