facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

No text output when running fairseq-interactive with 125M dense MoE model #4937

Open FayZ676 opened 1 year ago

FayZ676 commented 1 year ago

What is your question?

When I run the following fairseq-interactive command and input some text, I don't get any text returned.

fairseq-interactive data-bin/captions \
    --batch-size 1 --buffer-size 2 \
    --path checkpoints/checkpoint_best.pt \
    --task language_modeling \
    --max-sentences 1
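
One thing that may be worth checking here is the `--buffer-size 2` setting. The sketch below assumes that fairseq-interactive collects stdin lines into chunks of `--buffer-size` before generating anything. That is an assumption about the tool's behavior, not something confirmed in this thread. The names `buffered_read` and `released_before_eof` are hypothetical stand-ins written for illustration and are not imported from fairseq. If the assumption holds, a single typed line would sit in the buffer until a second line arrives or the input is closed.

```python
from typing import Iterable, Iterator, List


def buffered_read(lines: Iterable[str], buffer_size: int) -> Iterator[List[str]]:
    """Group incoming lines into chunks of `buffer_size` (illustrative only)."""
    buffer: List[str] = []
    for line in lines:
        buffer.append(line.strip())
        if len(buffer) >= buffer_size:
            yield buffer
            buffer = []
    # A partially filled buffer is flushed only once the input ends.
    if buffer:
        yield buffer


def released_before_eof(typed: List[str], buffer_size: int) -> List[List[str]]:
    """Return the chunks that are released while the input is still open."""
    eof = {"reached": False}

    def stream() -> Iterator[str]:
        yield from typed
        eof["reached"] = True  # set only after every typed line was consumed

    released: List[List[str]] = []
    for chunk in buffered_read(stream(), buffer_size):
        if not eof["reached"]:
            released.append(chunk)
    return released


if __name__ == "__main__":
    # One line typed, buffer of 2: nothing is released until the input closes.
    print(released_before_eof(["a caption prompt"], buffer_size=2))
    # Same line, buffer of 1: the line is released immediately.
    print(released_before_eof(["a caption prompt"], buffer_size=1))
```

Under that assumption, entering a second line, ending the input, or rerunning with `--buffer-size 1` would be a quick way to see whether buffering explains the missing output.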

What have you tried?

For context, I ran the following commands for preprocessing and training:

# PREPROCESSING
fairseq-preprocess --destdir data-bin/captions \
    --only-source \
    --task language_modeling \
    --srcdict en_dense_lm_125m/dict.txt \
    --trainpref caption_datasets/howard/fairseq/train.txt \
    --validpref caption_datasets/howard/fairseq/validation.txt \
    --testpref caption_datasets/howard/fairseq/test.txt

# TRAINING
CUDA_VISIBLE_DEVICES=0 \
fairseq-train data-bin/captions \
    --finetune-from-model en_dense_lm_125m/model.pt \
    --task language_modeling --tokens-per-sample 512 \
    --arch transformer_lm_gpt \
    --batch-size 32 \
    --max-split-size \
    --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --max-epoch 5 \
    --lr 0.01 \
    --fp16

Sample Image of Output

The command line just stays like this

[Screenshot of the terminal, "Screen Shot 2023-01-07 at 12 28 22 PM"; image not included]

Environment

- fairseq Version (e.g., 1.0 or main): 0.12.2
- PyTorch Version (e.g., 1.0): 1.13.1
- OS (e.g., Linux): Ubuntu 22.04
- How you installed fairseq (pip, source): source
- Build command you used (if compiling from source): pip install --editable ./
- Python version: 3.8.13
- CUDA/cuDNN version: 11.7
- GPU models and configuration: 4x NVIDIA 3090

AlexNLP commented 1 year ago

How did you resolve this issue in the end?