wzzzd / text_classifier_pytorch

A PyTorch-based text classification framework supporting TextCNN, Bert, Electra, and others.

pip install scikit_learn==1.0.2: version not found #4

Open 10o0o01 opened 2 years ago

10o0o01 commented 2 years ago

Installing scikit_learn 1.0.2 requires Python>=3.7; it fails to install on Python 3.6.
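A minimal sketch of choosing a pin that the running interpreter can install. The helper name `pick_scikit_learn_pin` is hypothetical, and falling back to 0.24.2 on Python 3.6 is an assumption: the thread does not say whether this project works with that older release.

```python
import sys


def pick_scikit_learn_pin(version_info=sys.version_info):
    """Return a pip requirement string that the given interpreter can install."""
    if tuple(version_info[:2]) >= (3, 7):
        # scikit_learn 1.0.2 needs Python >= 3.7, as reported in this issue
        return "scikit_learn==1.0.2"
    # Assumption: fall back to the 0.24 series, which still supports Python 3.6
    return "scikit_learn==0.24.2"


if __name__ == "__main__":
    print("pip install " + pick_scikit_learn_pin())
```

Upgrading the environment to Python 3.7 or later, as the reporter later did, avoids the fallback entirely.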

10o0o01 commented 2 years ago

When using a non-pretrained model class, I get: IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

wzzzd commented 2 years ago

Which file and which line of code raises this?

10o0o01 commented 2 years ago

Thanks for the reply. I reran the code from scratch; the only difference in my setup is that I now use Python 3.7. With the basic (non-pretrained) models, I only changed the model name in model_name='TextRCNN', and it fails with:

Traceback (most recent call last):
  File "main.py", line 41, in <module>
    trainer.train()
  File "/data/yll/Text/TextEnv1/module/Trainer.py", line 205, in train
    loss = self.step(batch)
  File "/data/yll/Text/TextEnv1/module/Trainer.py", line 230, in step
    output, hidden_emb = outputs
ValueError: too many values to unpack (expected 2)
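The ValueError above is what Python raises when a two-name unpacking receives more than two values. The sketch below reproduces it with placeholder strings and shows one defensive alternative. The helper `split_model_outputs` is hypothetical, and treating the first two items as `output` and `hidden_emb` is an assumption; the thread does not show what TextRCNN returned or how the repository fixed it.

```python
def split_model_outputs(outputs):
    """Hypothetical helper: return (output, hidden_emb) from a model's return value.

    Assumption: the first item is the prediction output and the second,
    if present, is the hidden embedding. Extra items are ignored.
    """
    if not isinstance(outputs, (list, tuple)):
        return outputs, None
    if len(outputs) < 2:
        return outputs[0], None
    return outputs[0], outputs[1]


# Placeholder values standing in for tensors
outputs = ("logits", "hidden", "extra")

try:
    output, hidden_emb = outputs  # same pattern as Trainer.py line 230
except ValueError as err:
    message = str(err)  # starts with "too many values to unpack"

output, hidden_emb = split_model_outputs(outputs)
```

Silently dropping extra values can hide a real mismatch between model and trainer, so making every model return exactly two values is usually the cleaner fix.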

Separately, with the pretrained models, I first tried Bert and Distilbert, and both worked very well. With Albert, using this configuration:

model_name='Albert'
initial_pretrain_model = 'voidful/albert_chinese_tiny'
initial_pretrain_tokenizer = 'voidful/albert_chinese_tiny'

I get:

Traceback (most recent call last):
  File "main.py", line 41, in <module>
    trainer.train()
  File "/data/yll/Text/TextEnv1/module/Trainer.py", line 205, in train
    loss = self.step(batch)
  File "/data/yll/Text/TextEnv1/module/Trainer.py", line 229, in step
    outputs = self.model(*batch)
  File "/data/yll/miniconda3/envs/TextEnv1/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(input, **kwargs)
  File "/data/yll/Text/TextEnv1/module/models/Albert.py", line 20, in forward
    return [output, output.pooler_output]
AttributeError: 'Tensor' object has no attribute 'pooler_output'
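The AttributeError says that `output` was already a plain tensor when `forward` tried to read `.pooler_output` from it. A minimal sketch of reading that attribute defensively follows. The helper `split_encoder_output` and the stand-in objects are hypothetical, and this is not necessarily how Albert.py was fixed.

```python
from types import SimpleNamespace


def split_encoder_output(output):
    """Hypothetical helper: return [output, pooled] without assuming .pooler_output exists."""
    # getattr with a default avoids the AttributeError when the attribute is missing
    pooled = getattr(output, "pooler_output", None)
    return [output, pooled]


# Stand-ins for illustration only: one object with the attribute, one without
structured = SimpleNamespace(pooler_output="pooled-vector")
bare = object()

with_pooled = split_encoder_output(structured)
without_pooled = split_encoder_output(bare)
```

A caller would still need to handle the `None` case, for example by skipping any step that depends on the pooled embedding.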

Thank you for replying. Also, could you recommend a few compatible pretrained models for English text classification? Many thanks!

wzzzd commented 2 years ago

Thanks for the feedback. The errors above have all been fixed; please pull the latest code.

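For compatible English pretrained models for text classification, see the Hugging Face model hub (https://huggingface.co/models?sort=downloads&search=albert). Open-source weights are available for essentially every model type.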
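As a rough illustration, an English setup might reuse the three config variables quoted earlier in this thread with English checkpoint names from the hub. The mapping `ENGLISH_CHECKPOINTS` is hypothetical, the exact `model_name` strings the framework accepts are an assumption, and whether each checkpoint loads cleanly in this framework has not been verified here.

```python
# Hypothetical mapping from model family to a public English checkpoint id
ENGLISH_CHECKPOINTS = {
    "Bert": "bert-base-uncased",
    "Distilbert": "distilbert-base-uncased",
    "Albert": "albert-base-v2",
}

# Assumption: these variable names match the project's config, as quoted in the thread
model_name = "Bert"
initial_pretrain_model = ENGLISH_CHECKPOINTS[model_name]
initial_pretrain_tokenizer = ENGLISH_CHECKPOINTS[model_name]
```

Keeping the model and tokenizer ids identical, as in the Albert configuration above, keeps the vocabulary consistent with the weights.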

: )