Closed by shanestorks 2 years ago
No idea, maybe something changed in the data source?
There's a resource missing from the nltk installation, which causes the processing of all records in the SNLI files to fail. The issue can be resolved by running nltk.download('punkt') in Python.
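For anyone who wants to script this, here is a minimal sketch that only downloads the resource when it is missing. The helper name ensure_punkt is made up for illustration and is not part of the repo. It assumes the missing resource is the one registered under tokenizers/punkt. If your NLTK version asks for a differently named resource, the LookupError message should say which one.

```python
def ensure_punkt():
    """Download NLTK's 'punkt' data only if it is not already installed.

    Hypothetical helper (not from the robust-nli repo).
    Returns True if a download was triggered, False if the data was found.
    """
    import nltk  # imported here so merely defining the helper has no side effects

    try:
        # nltk.data.find raises LookupError when the resource is absent
        nltk.data.find("tokenizers/punkt")
        return False
    except LookupError:
        nltk.download("punkt")
        return True


if __name__ == "__main__":
    ensure_punkt()
```

Run it once before preprocessing the data, then re-run the preprocessing so the label and source files are regenerated.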
I'm getting the following error when trying to re-train the model on SNLI (configured for testing on SPRL):
Traceback (most recent call last):
  File "train.py", line 423, in <module>
    main(args)
  File "train.py", line 328, in main
    args.max_train_sents, args.max_val_sents, args.max_test_sents, args.remove_dup)
  File "/home/sstorks/robust-nli/src/data.py", line 82, in get_nli_text
    train = extract_from_file(train_lbls_file, train_src_file, max_train_sents, "train", remove_dup)
  File "/home/sstorks/robust-nli/src/data.py", line 42, in extract_from_file
    assert len(lbls) == len(srcs), "%s: %s labels and source files are not same length" % (lbls_file, data_split)
AssertionError: ../data/snli_1.0/cl_snli_train_lbl_file: train labels and source files are not same length
I used the provided script to download the data. Any ideas why this would be happening? Thank you!
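For anyone hitting the same assertion, a quick way to confirm the mismatch is to count the lines in the two files yourself. This is a rough sketch, not code from the repo: count_lines and check_split are made-up names, and it assumes one record per line, which may not match exactly how extract_from_file reads the files. The paths are passed on the command line because the source file name does not appear in the traceback.

```python
import sys


def count_lines(path):
    """Count lines in a text file without loading it all into memory."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_split(lbls_file, src_file):
    """Return (label_count, source_count, counts_match) for one data split.

    Hypothetical helper for diagnosis only; assumes one record per line.
    """
    n_lbls = count_lines(lbls_file)
    n_srcs = count_lines(src_file)
    return n_lbls, n_srcs, n_lbls == n_srcs


if __name__ == "__main__":
    # Usage: python check_split.py <labels_file> <source_file>
    n_lbls, n_srcs, ok = check_split(sys.argv[1], sys.argv[2])
    print("labels: %d, sources: %d, match: %s" % (n_lbls, n_srcs, ok))
```

If one of the counts is zero or far smaller than the other, the preprocessing step probably failed on most records rather than the download being incomplete.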