`'NoneType' object is not subscriptable` when running fairseq-train

What is your question?

I'm new to fairseq and am trying to train a simple LSTM-based model for a grapheme-to-phoneme conversion task, using a command similar to the one here. I have five different datasets that I've generated with the fairseq-preprocess command, and I'm able to complete model training for four of them. However, on the fifth (and smallest) one, I get the following error (full output here):

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/homes/gws/echau18/miniconda3/envs/loanwords/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/homes/gws/echau18/lib/fairseq/fairseq_cli/train.py", line 296, in distributed_main
    main(args, init_distributed=True)
  File "/homes/gws/echau18/lib/fairseq/fairseq_cli/train.py", line 86, in main
    train(args, trainer, task, epoch_itr)
  File "/homes/gws/echau18/lib/fairseq/fairseq_cli/train.py", line 127, in train
    log_output = trainer.train_step(samples)
  File "/homes/gws/echau18/lib/fairseq/fairseq/trainer.py", line 330, in train_step
    sample, self.model, self.criterion, self.optimizer, ignore_grad
  File "/homes/gws/echau18/lib/fairseq/fairseq/tasks/fairseq_task.py", line 251, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/homes/gws/echau18/miniconda3/envs/loanwords/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/homes/gws/echau18/lib/fairseq/fairseq/criterions/cross_entropy.py", line 28, in forward
    net_output = model(**sample['net_input'])
TypeError: 'NoneType' object is not subscriptable

The other issues I've seen involving this error have involved OOMs or bugs that have been fixed, but since my dataset is very small (1173 examples over two Titan X GPUs) I don't think OOM is the problem. Any pointers on how to approach this? Thanks in advance!

What have you tried?

I've tried printing out sample and sample['net_input'] and confirmed that both are not None. Not really sure where else to look.

What's your environment?

fairseq Version (e.g., 1.0 or master): 0.9.0
PyTorch Version (e.g., 1.0): 1.4.0
OS (e.g., Linux): Linux
How you installed fairseq (pip, source): source
Build command you used (if compiling from source): pip install --editable ./
Python version: 3.7.9
CUDA/cuDNN version: 9.2
GPU models and configuration:
Any other relevant information:

facebookresearch / fairseq