fastnlp / TENER

Codes for "TENER: Adapting Transformer Encoder for Named Entity Recognition"

Infinite recursion when fetching data through the batch module #26

Closed boxiaowave closed 3 years ago

boxiaowave commented 3 years ago

Hello, author!

While trying to reproduce the results on the Weibo dataset, I ran into an infinite-recursion error in DataSetGetter's __getattr__ when reading data, and rewriting it as a dict lookup did not help either. Has the program been tested on Windows? I am on fastNLP version 5.5 and have tried torch 1.1 and 1.5.1; neither solved the problem.

```
D:\Anaconda\envs\torch_env\python.exe E:/BoXiao/TENER-master/train_tener_cn.py --dataset weibo
Read cache from caches/weibo_transformer_bmeso_True.pkl.
In total 3 datasets:
    train has 1350 instances.
    dev has 270 instances.
    test has 270 instances.
In total 3 vocabs:
    chars has 3356 entries.
    bigrams has 42184 entries.
    target has 29 entries.
```

```
input fields after batch(if batch size is 2):
    target: (1)type:torch.Tensor (2)dtype:torch.int64, (3)shape:torch.Size([2, 26])
    chars: (1)type:torch.Tensor (2)dtype:torch.int64, (3)shape:torch.Size([2, 26])
    bigrams: (1)type:torch.Tensor (2)dtype:torch.int64, (3)shape:torch.Size([2, 26])
    seq_len: (1)type:torch.Tensor (2)dtype:torch.int64, (3)shape:torch.Size([2])
target fields after batch(if batch size is 2):
    target: (1)type:torch.Tensor (2)dtype:torch.int64, (3)shape:torch.Size([2, 26])
    seq_len: (1)type:torch.Tensor (2)dtype:torch.int64, (3)shape:torch.Size([2])
```

```
training epochs started 2020-12-07-18-47-38-732204
Epoch 1/100:   0%| | 0/8500 [00:00<?, ?it/s, loss:{0:<6.5f}]
Read cache from caches/weibo_transformer_bmeso_True.pkl.
In total 3 datasets: train has 1350 instances. dev has 270 instances. test has 270 instances.
In total 3 vocabs: chars has 3356 entries. bigrams has 42184 entries. target has 29 entries.
```

```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda\envs\torch_env\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda\envs\torch_env\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 100, in __getattr__
    if hasattr(self.dataset, item):
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 100, in __getattr__
    if hasattr(self.dataset, item):
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 100, in __getattr__
    if hasattr(self.dataset, item):
  [Previous line repeated 662 more times]
RecursionError: maximum recursion depth exceeded
Read cache from caches/weibo_transformer_bmeso_True.pkl.
In total 3 datasets: train has 1350 instances. dev has 270 instances. test has 270 instances.
In total 3 vocabs: chars has 3356 entries. bigrams has 42184 entries. target has 29 entries.
```
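What the worker traceback shows: on Windows, multiprocessing starts DataLoader workers with spawn, which re-imports the training script in each worker (hence the repeated "Read cache from caches/..." lines) and pickles the dataset wrapper into the child process. While pickle.load rebuilds the DataSetGetter, __getattr__ fires before self.dataset is back in the instance __dict__, and because __getattr__ itself reads self.dataset, the lookup re-enters __getattr__ forever. A minimal, self-contained sketch of that mechanism (Wrapper is a hypothetical stand-in, not fastNLP's actual class):

```python
import pickle

class Wrapper:
    """Hypothetical stand-in for a dataset wrapper like DataSetGetter."""
    def __init__(self, dataset):
        self.dataset = dataset

    def __getattr__(self, item):
        # __getattr__ only runs when normal lookup fails. While pickle is
        # rebuilding the object, self.dataset is not yet in __dict__, so
        # reading self.dataset here re-enters __getattr__ endlessly.
        if hasattr(self.dataset, item):
            return getattr(self.dataset, item)
        raise AttributeError(item)

# Spawned workers receive objects via pickle; unpickling probes
# __setstate__ on a bare instance, which triggers the recursion:
try:
    pickle.loads(pickle.dumps(Wrapper([1, 2, 3])))
except RecursionError as err:
    print("RecursionError:", err)  # maximum recursion depth exceeded
```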

The output here is two tracebacks interleaved. Untangled, the spawned worker process died with the same recursion:

```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda\envs\torch_env\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda\envs\torch_env\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 100, in __getattr__
    if hasattr(self.dataset, item):
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 100, in __getattr__
    if hasattr(self.dataset, item):
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 100, in __getattr__
    if hasattr(self.dataset, item):
  [Previous line repeated 662 more times]
RecursionError: maximum recursion depth exceeded
```

and the main process then failed in the sampler:

```
Traceback (most recent call last):
  File "E:/BoXiao/TENER-master/train_tener_cn.py", line 130, in <module>
    trainer.train(load_best_model=False)
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\trainer.py", line 618, in train
    raise e
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\trainer.py", line 611, in train
    self._train()
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\trainer.py", line 658, in _train
    for batch_x, batch_y in self.data_iterator:
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 267, in __iter__
    for indices, batch_x, batch_y in self.dataiter:
  File "D:\Anaconda\envs\torch_env\lib\site-packages\torch\utils\data\dataloader.py", line 193, in __iter__
    return _DataLoaderIter(self)
  File "D:\Anaconda\envs\torch_env\lib\site-packages\torch\utils\data\dataloader.py", line 493, in __init__
    self._put_indices()
  File "D:\Anaconda\envs\torch_env\lib\site-packages\torch\utils\data\dataloader.py", line 591, in _put_indices
    indices = next(self.sample_iter, None)
  File "D:\Anaconda\envs\torch_env\lib\site-packages\torch\utils\data\sampler.py", line 172, in __iter__
    for idx in self.sampler:
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 121, in __iter__
    return iter(self.sampler(self.dataset))
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\sampler.py", line 79, in __call__
    seq_lens = data_set.get_all_fields()[self.seq_len_field_name].content
  File "D:\Anaconda\envs\torch_env\lib\site-packages\fastNLP\core\batch.py", line 101, in __getattr__
    return object.__getattr__(self.dataset, item)
AttributeError: type object 'object' has no attribute '__getattr__'
```
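The main-process failure points at a second, independent bug in the same method: the fallback on batch.py line 101 calls object.__getattr__, but the object base type only defines __getattribute__, so the fallback raises its own AttributeError instead of delegating to the dataset. Easy to confirm in any Python session:

```python
# `object` defines __getattribute__ but has no __getattr__, which is why
# the fallback object.__getattr__(self.dataset, item) can never succeed.
print(hasattr(object, "__getattribute__"))  # True
print(hasattr(object, "__getattr__"))       # False
```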

Looking forward to your reply; many thanks.

boxiaowave commented 3 years ago

Found this issue, which solved it; it was indeed the num_workers problem.
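For anyone who finds this later: the workaround is to keep batch loading in the main process so the dataset wrapper is never pickled into spawned workers. In fastNLP that means passing num_workers=0 where the Trainer is constructed in train_tener_cn.py, assuming your fastNLP version exposes that argument. The self-contained PyTorch sketch below (FragileDataset is hypothetical, built only to make any pickling visible) shows why num_workers=0 sidesteps the crash:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class FragileDataset(Dataset):
    """Stand-in for a wrapper that cannot survive pickling, like the
    DataSetGetter in the tracebacks above."""
    def __len__(self):
        return 4

    def __getitem__(self, idx):
        return torch.tensor(idx)

    def __reduce__(self):
        # Fires only if some process tries to pickle the dataset.
        raise RuntimeError("pickled into a worker process!")

# num_workers=0 loads batches in the main process: the dataset is never
# pickled, so the fragile path is never exercised.
for batch in DataLoader(FragileDataset(), batch_size=2, num_workers=0):
    print(batch)  # tensor([0, 1]) then tensor([2, 3])

# With num_workers > 0 on Windows (spawn start method), DataLoader would
# pickle the dataset into each worker and the RuntimeError would fire.
```

The cost is single-process data loading, which should be negligible for a training set of 1350 instances.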