Closed harrywang closed 1 year ago
You need to pass the folder path of your dataset, not the path of a single file.
I tried
chinese_sets = './datasets/666.hotel'
and the error is:
Try to load ['./datasets/666.hotel'] dataset from local
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
/Users/harrywang/sandbox/hotel-service-bot/apc-train-chinese/train.ipynb Cell 2 in <cell line: 10>()
6 # config.spacy_model = 'zh_core_web_sm'
7 # chinese_sets = ABSADatasetList.Chinese
8 # chinese_sets = ABSADatasetList.Chinese
9 chinese_sets = './datasets/666.hotel'
---> 10 sent_classifier = Trainer(config=config, # set config=None to use default model
11 dataset=chinese_sets, # train set and test set will be automatically detected
12 checkpoint_save_mode=1,
13 auto_device=True # automatic choose CUDA or CPU
14 ).load_trained_model()
File ~/sandbox/hotel-service-bot/venv/lib/python3.8/site-packages/pyabsa/functional/trainer/trainer.py:118, in Trainer.__init__(self, config, dataset, from_checkpoint, checkpoint_save_mode, auto_device, path_to_save, load_aug)
116 dataset = DatasetItem('custom_dataset', dataset)
117 self.config.dataset_name = dataset.dataset_name
--> 118 self.dataset_file = detect_dataset(dataset, task=self.task, load_aug=load_aug)
119 self.config.dataset_file = self.dataset_file
121 self.config = init_config(self.config, auto_device)
File ~/sandbox/hotel-service-bot/venv/lib/python3.8/site-packages/pyabsa/functional/dataset/dataset_manager.py:203, in detect_dataset(dataset_path, task, load_aug)
201 if len(dataset_file['train']) == 0:
202 if os.path.isdir(d) or os.path.isdir(search_path):
--> 203 print('No train set found from: {}, detected files: {}'.format(dataset_path, ', '.join(os.listdir(d) + os.listdir(search_path))))
204 raise RuntimeError(
...
207 'https://github.com/yangheng95/ABSADatasets#important-rename-your-dataset-filename-before-use-it-in-pyabsa')
208 )
209 if len(dataset_file['test']) == 0:
FileNotFoundError: [Errno 2] No such file or directory: ''
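One possible reading of the traceback above, assuming `search_path` was an empty string because no matching dataset folder was found: line 203 calls `os.listdir(search_path)` while building the "No train set found" message, and listing an empty path raises this `FileNotFoundError` before the intended `RuntimeError` is reached. The sketch below only reproduces that standard-library behaviour; `list_or_explain` is a hypothetical helper, not part of PyABSA.

```python
import os


def list_or_explain(path):
    """Return the directory listing, or the error text if the path cannot be listed."""
    try:
        return os.listdir(path)
    except FileNotFoundError as exc:
        # os.listdir('') fails because an empty string is not a valid path.
        return "FileNotFoundError: {}".format(exc)


# An empty path stands in for the assumed empty search_path in the traceback.
message = list_or_explain("")
print(message)
```

If that reading is right, the real problem is the one the hidden `RuntimeError` would have reported: no train file was detected for the given dataset path.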
Your dataset should be located under the apc_dataset or atepc_dataset folder, depending on which task you are working on.
This works. I did not know this: I cloned ABSADatasets, put my data there, and it loads OK. Thanks!
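For anyone hitting the same error, a quick pre-flight check along these lines may help before calling `Trainer`. It is a rough sketch under two assumptions taken from this thread and the error message: the `dataset` argument should be a folder, and the file names inside it should contain `train` and `test`. The helpers `find_split_files` and `check_dataset_dir` are hypothetical and do not reproduce PyABSA's own detection logic.

```python
import os


def find_split_files(dataset_dir):
    """Group files under dataset_dir by whether their name contains 'train' or 'test'."""
    found = {"train": [], "test": []}
    for root, _dirs, files in os.walk(dataset_dir):
        for name in sorted(files):
            lowered = name.lower()
            for split in found:
                if split in lowered:
                    found[split].append(os.path.join(root, name))
    return found


def check_dataset_dir(dataset_dir):
    """Fail early with a readable message if the folder layout looks wrong."""
    if not os.path.isdir(dataset_dir):
        # A file path (or a missing folder) is the mistake discussed in this thread.
        raise NotADirectoryError(
            "Expected a dataset folder, got: {!r}".format(dataset_dir)
        )
    found = find_split_files(dataset_dir)
    missing = [split for split, paths in found.items() if not paths]
    if missing:
        raise FileNotFoundError(
            "No {} file found under {!r}".format(" or ".join(missing), dataset_dir)
        )
    return found
```

Calling `check_dataset_dir` on the dataset folder (for example a folder placed under the apc dataset directory of a cloned ABSADatasets checkout) returns the detected train and test files, or raises with a message naming what is missing.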
Hi,
I have prepared our data as two txt files, train and test, using the annotation tools:
Then I tried to use https://github.com/yangheng95/PyABSA/blob/release/demos/aspect_polarity_classification/train_apc_chinese.py to train on our own data, but I could not figure out how to structure my files.
I have created the structure below (the py file is the training script)
I changed the dataset path as follows:
but the error says:
Could you please help give some guidance?
Thanks!