microsoft / MASS

MASS: Masked Sequence to Sequence Pre-training for Language Generation
https://arxiv.org/pdf/1905.02450.pdf

StopIteration Error with fairseq-interactive #43

Open magician-david opened 5 years ago

magician-david commented 5 years ago

I can generate outputs with fairseq-generate but fail with fairseq-interactive. I would appreciate it if you have any ideas. Thanks!

Traceback (most recent call last):
  File "/usr/local/python3/bin/fairseq-interactive", line 11, in <module>
    load_entry_point('fairseq==0.7.1', 'console_scripts', 'fairseq-interactive')()
  File "/usr/local/python3/lib/python3.6/site-packages/fairseq_cli/interactive.py", line 185, in cli_main
    main(args)
  File "/usr/local/python3/lib/python3.6/site-packages/fairseq_cli/interactive.py", line 121, in main
    task.max_positions(),
  File "/home/user/MASS/MASS-fairseq/mass/xmasked_seq2seq.py", line 487, in max_positions
    for key in next(iter(self.datasets.values())).datasets.keys()
StopIteration

huangxianliang commented 4 years ago

I have the same problem. Have you solved it?

StillKeepTry commented 4 years ago

Can you provide your script for fairseq-interactive?

JasonVann commented 4 years ago

@StillKeepTry I also ran into this StopIteration issue. My script is as follows:

model=zh-en/zhen_mass_pre-training.pt
data_dir=zh-en/processed
user_dir=mass
input_file=zh-en/neu_1000.zh

fairseq-interactive $data_dir \
    --user-dir $user_dir \
    --input $input_file \
    -s zh -t en \
    --langs en,zh \
    --source-langs zh --target-langs en \
    --mt_steps zh-en \
    --task xmasked_seq2seq \
    --path $model \
    --cpu

I found that this happens because, at line 490 in max_positions in xmasked_seq2seq.py, self.datasets is empty, so next(iter(self.datasets.values())) has nothing to return and raises StopIteration. I guess this is because load_dataset() is never called.
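The failure can be reproduced without fairseq. The sketch below is only an illustration: `first_dataset` and the two helper variables are hypothetical names, and the empty dict stands in for the task's `datasets` attribute when nothing has been loaded. It shows that an empty dict produces StopIteration, while a None value would produce AttributeError instead.

```python
def first_dataset(datasets):
    """Mimics the failing expression next(iter(self.datasets.values()))."""
    return next(iter(datasets.values()))


def error_name(datasets):
    """Return the name of the exception raised for the given stand-in value."""
    try:
        first_dataset(datasets)
    except (StopIteration, AttributeError) as exc:
        return type(exc).__name__
    return "no error"


# Hypothetical stand-ins for self.datasets in the two possible states.
empty_case = error_name({})    # nothing loaded: an empty dict
none_case = error_name(None)   # attribute set to None

print(empty_case)
print(none_case)
```

Since the traceback ends in StopIteration, the empty-dict case is the one that matches what fairseq-interactive reports.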

JasonVann commented 4 years ago

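I found how to fix this. In xmasked_seq2seq.py, update max_positions and add build_dataset_for_inference as follows:

# Both methods go inside the task class (XMassTranslationTask).
# Imports needed if the file does not already have them:
from collections import OrderedDict
from fairseq.data import LanguagePairDataset

def max_positions(self):
    # No dataset is loaded in interactive mode, so fall back to the argument limits.
    if not self.datasets:
        return (self.args.max_source_positions, self.args.max_target_positions)

    return OrderedDict([
        (key, (self.args.max_source_positions, self.args.max_target_positions))
        for key in next(iter(self.datasets.values())).datasets.keys()
    ])

def build_dataset_for_inference(self, src_tokens, src_lengths):
    # Wrap the raw input tokens in a dataset that fairseq-interactive can batch.
    return LanguagePairDataset(src_tokens, src_lengths, self.source_dictionary)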

Then it works with fairseq-interactive
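The effect of the guard can be checked in isolation. This is a minimal sketch, not the real MASS task: `StubTask` is a hypothetical stand-in class, the limit of 512 is an assumed value, and `SimpleNamespace` objects replace the parsed arguments and the loaded dataset.

```python
from collections import OrderedDict
from types import SimpleNamespace


class StubTask:
    """Hypothetical stand-in for the task class; not part of fairseq or MASS."""

    def __init__(self, args, datasets=None):
        self.args = args
        self.datasets = datasets if datasets is not None else {}

    def max_positions(self):
        limits = (self.args.max_source_positions, self.args.max_target_positions)
        # Guard: with no dataset loaded, return the plain tuple of limits.
        if not self.datasets:
            return limits
        first = next(iter(self.datasets.values()))
        return OrderedDict((key, limits) for key in first.datasets.keys())


# Assumed limits, chosen only for the example.
args = SimpleNamespace(max_source_positions=512, max_target_positions=512)

# Interactive case: nothing loaded, so the guard branch is taken.
no_data = StubTask(args).max_positions()

# Training-like case: one split holding one language pair.
split = SimpleNamespace(datasets=OrderedDict([("zh-en", object())]))
with_data = StubTask(args, {"train": split}).max_positions()

print(no_data)
print(with_data)
```

The guard changes only the case where no dataset is loaded; when datasets are present, the per-key OrderedDict is returned as before.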

huangxianliang commented 4 years ago

@JasonVann Hi, after applying this a new error occurs: AttributeError: 'XMassTranslationTask' object has no attribute 'build_dataset_for_inference'