ThilinaRajapakse / pytorch-transformers-classification

Based on the Pytorch-Transformers library by HuggingFace. To be used as a starting point for employing Transformer models in text classification tasks. Contains code to easily train BERT, XLNet, RoBERTa, and XLM models for text classification.
Apache License 2.0
306 stars 97 forks

File path problem #26

Closed nalcvp closed 5 years ago

nalcvp commented 5 years ago

FileNotFoundError                         Traceback (most recent call last)

in
      1 if args['do_train']:
----> 2     train_dataset = load_and_cache_examples(task, tokenizer)
      3     global_step, tr_loss = train(train_dataset, model, tokenizer)
      4     logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)

in load_and_cache_examples(task, tokenizer, evaluate)
     13     logger.info("Creating features from dataset file at %s", args['data_dir'])
     14     label_list = processor.get_labels()
---> 15     examples = processor.get_dev_examples(args['data_dir']) if evaluate else processor.get_train_examples(args['data_dir'])
     16
     17 if __name__ == "__main__":

~\data\utils.py in get_train_examples(self, data_dir)
     98         """See base class."""
     99         return self._create_examples(
--> 100             self._read_tsv(os.path.join(data_dir, "train.tsv")), "train")
    101
    102     def get_dev_examples(self, data_dir):

~\data\utils.py in _read_tsv(cls, input_file, quotechar)
     82     def _read_tsv(cls, input_file, quotechar=None):
     83         """Reads a tab separated value file."""
---> 84         with open(input_file, "r", encoding="utf-8-sig") as f:
     85             reader = csv.reader(f, delimiter="\t", quotechar=quotechar)
     86             lines = []

FileNotFoundError: [Errno 2] No such file or directory: 'data/train.tsv'

I have the train.tsv file under this file path, but this step of the code

if args['do_train']:
    train_dataset = load_and_cache_examples(task, tokenizer)
    global_step, tr_loss = train(train_dataset, model, tokenizer)
    logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)

gives me the error above. How can I edit the code so that this error doesn't appear?

ThilinaRajapakse commented 5 years ago

I can't reproduce this error. It works for me as long as the train.tsv file is present. Try printing out the contents of the data directory from inside the notebook to check if filepaths are correct.

import os

print(os.listdir('data/'))

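If the listing itself fails, it may help to see where the relative path actually points. The following is only a rough sketch: `describe_path` is a hypothetical helper, not part of this repo, and it assumes the notebook resolves `data/train.tsv` against the current working directory, as the traceback suggests.

```python
import os


def describe_path(relative_path):
    """Report how a relative path resolves from the current working directory.

    Hypothetical helper for debugging only; it is not part of the repo.
    """
    absolute = os.path.abspath(relative_path)
    return {
        "cwd": os.getcwd(),                  # where relative paths start from
        "absolute": absolute,                # the full path open() would try
        "exists": os.path.exists(absolute),  # whether anything is there
    }


# 'data/train.tsv' is the path reported in the traceback.
info = describe_path(os.path.join("data", "train.tsv"))
print("Working directory:", info["cwd"])
print("Resolved path:    ", info["absolute"])
print("File exists:      ", info["exists"])
```

If the working directory printed here is not the folder that holds the notebook, the relative path `data/` will not resolve to the folder you expect.
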
nalcvp commented 5 years ago

Is data the name of the folder that all of these files go into?

ThilinaRajapakse commented 5 years ago

The data folder contains the data files only. It should be at the same level as the notebook.

Also, that link is to a file on your local machine.
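
To check the folder layout described above (a `data` folder next to the notebook, containing `train.tsv`), something like this could be run before training. It is a sketch under assumptions: `find_data_dir` is a hypothetical helper, and the only required file it checks for by default is the `train.tsv` named in the traceback.

```python
import os


def find_data_dir(notebook_dir, folder_name="data", required=("train.tsv",)):
    """Return the absolute path of the data folder, or raise a descriptive error.

    Hypothetical helper; the folder name and required files are assumptions
    based on the traceback in this thread.
    """
    data_dir = os.path.join(os.path.abspath(notebook_dir), folder_name)
    if not os.path.isdir(data_dir):
        raise FileNotFoundError(
            f"No '{folder_name}' folder found in {notebook_dir!r}"
        )
    # Collect every required file that is not present in the folder.
    missing = [
        name for name in required
        if not os.path.isfile(os.path.join(data_dir, name))
    ]
    if missing:
        raise FileNotFoundError(f"{data_dir} is missing: {', '.join(missing)}")
    return data_dir


# Possible usage inside the notebook (args is the dict from the notebook):
# args['data_dir'] = find_data_dir(os.getcwd())
```

Passing an absolute path as `args['data_dir']` avoids depending on where the notebook server was started.
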