Open Shawnzheng011019 opened 1 year ago
environment was set automatically by the file requiremets.txt
同样遇到这个问题,看起来应该是adaseq加载数据集的时候,可能处理逻辑有问题,加载数据集的格式
···text data_type: json_spans ···
可能有点问题
是因为数据集找不到或者数据集不是标准的解析格式,可以按照toy msra的加载代码重写一下数据加载
@PPPP-kaqiu 你重新写了吗?可以分享一下吗
@Shawnzheng011019 请问解决了吗,大哥
完全按照hf dataset的格式写数据加载脚本,yaml的数据加载就只写数据那个文件夹就好了
@PPPP-kaqiu 加个微信吧大哥,求教啊WX:Xugeyuan923
完全按照hf dataset的格式写数据加载脚本,yaml的数据加载就只写数据那个文件夹就好了
大神您好可以分享一下怎么解决的吗
@PPPP-kaqiu 佬儿,可以加一下联系方式指导一下吗?vx:j626850567
What is your question?
Traceback (most recent call last):
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1618, in _prepare_split_single writer = writer_class( File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\arrow_writer.py", line 334, in init self.stream = self._fs.open(fs_token_paths[2][0], "wb") File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\fsspec\spec.py", line 1309, in open f = self._open( File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\fsspec\implementations\local.py", line 180, in _open return LocalFileOpener(path, mode, fs=self, **kwargs) File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\fsspec\implementations\local.py", line 298, in init self._open() File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\fsspec\implementations\local.py", line 303, in _open self.f = open(self.path, mode=self.mode) FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/shawn/.cache/huggingface/datasets/named_entity_recognition_dataset_builder/default-c270794ce0d 23d06/0.0.0/db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51.incomplete/named_entity_recognition_dataset_builder-train-00000-00000-of-NNNNN.arro w'
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "C:\Users\shawn\anaconda3\envs\pytorch\lib\runpy.py", line 197, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\shawn\anaconda3\envs\pytorch\lib\runpy.py", line 87, in _run_code exec(code, run_globals) File "C:\Users\shawn\anaconda3\envs\pytorch\Scripts\adaseq.exe__main.py", line 7, in
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\main.py", line 13, in run
main(prog='adaseq')
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\commands\ init__.py", line 29, in main
args.func(args)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\commands\train.py", line 84, in train_model_from_args
train_model(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\commands\train.py", line 156, in train_model
trainer = build_trainer_from_partial_objects(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\commands\train.py", line 185, in build_trainer_from_partial_objects
dm = DatasetManager.from_config(task=config.task, config.dataset)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\data\dataset_manager.py", line 182, in from_config
hfdataset = hf_load_dataset(path, name=name, kwargs)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\load.py", line 1797, in load_dataset
builder_instance.download_and_prepare(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 909, in download_and_prepare
self._download_and_prepare(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1670, in _download_and_prepare
super()._download_and_prepare(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1004, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1508, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1665, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.builder.DatasetGenerationError: An error occurred while generating the dataset
What have you tried?
set http proxy and successfully conneted to Youtube.
Code (if necessary)
No response
What's your environment?
Code of Conduct