zjunlp / DeepKE

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
http://deepke.zjukg.cn/
MIT License
3.17k stars, 656 forks

NER few-shot run raises an error #545

Closed panhustar closed 1 day ago

panhustar commented 1 week ago

Describe the bug

Traceback (most recent call last):
  File "run.py", line 122, in main
    trainer.train()
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\module\train.py", line 71, in train
    loss = self._step(batch, mode="train")
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\module\train.py", line 175, in _step
    pred = self.model(src_tokens, tgt_tokens, src_seq_len, first)
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 410, in forward
    return self.prompt_model(src_tokens, tgt_tokens, src_seq_len, first)
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 253, in forward
    prompt_state = self.generator(src_tokens, src_seq_len, first)
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 261, in generator
    past_key_values = self.get_prompt(batch_size) if self.use_prompt else None
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 270, in get_prompt
    input_tokens = self.prompt_inputs.unsqueeze(0).expand(batch_size, -1).to(self.device)
  File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\torch\cuda\__init__.py", line 210, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Environment (please complete the following information):

Screenshots

The few_shot.yaml configuration is as follows:

    seed: 1

    bart_name: "C:\Support\work\model\bart-large"
    dataset_name: mit-movie
    device: cpu

    num_epochs: 10
    batch_size: 3
    learning_rate: 5e-5
    warmup_ratio: 0.01
    eval_begin_epoch: 4
    src_seq_ratio: 0.8
    tgt_max_len: 10
    num_beams: 1
    length_penalty: 1

    use_prompt: True
    prompt_len: 10
    prompt_dim: 800

    freeze_plm: True
    learn_weights: True
    save_path: "C:\Support\code\DeepKE\example\ner\few-shot\model"
    load_path: null
    notes: ''


zxlzr commented 1 week ago

Hello, your error shows "AssertionError: Torch not compiled with CUDA enabled". We suggest reconfiguring your CUDA environment and trying again.

panhustar commented 1 week ago

How do I configure it? My machine is CPU-only; it has no GPU.

panhustar commented 1 week ago

Additional output printed before the error:

    bin C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so
    C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\bitsandbytes\cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
      warn("The installed version of bitsandbytes was compiled without GPU support. "
    'NoneType' object has no attribute 'cadam32bit_grad_fp32'
    CUDA SETUP: Loading binary C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
    argument of type 'WindowsPath' is not iterable

panhustar commented 1 week ago

      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\torch\cuda\__init__.py", line 210, in _lazy_init
        raise AssertionError("Torch not compiled with CUDA enabled")

Additional information: my torch version is

    Name: torch
    Version: 1.11.0
    Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
    Home-page: https://pytorch.org/
    Author: PyTorch Team
    Author-email: packages@pytorch.org
    License: BSD-3
    Location: c:\programdata\anaconda3\envs\deepke\lib\site-packages
    Requires: typing-extensions
    Required-by: accelerate, deepke, peft

BeasterYong commented 6 days ago

Hello, you can change device to cpu in both DeepKE/example/ner/few-shot/conf/train/few_shot.yaml and DeepKE/example/ner/few-shot/conf/predict.yaml, then retry. Also, your model path needs to be written with double backslashes.
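A note on the path advice: a backslash inside a quoted string can be read as the start of an escape sequence, which is why doubled backslashes or forward slashes are commonly suggested for Windows paths. The sketch below uses only the Python standard library to show that these spellings name the same Windows path. How DeepKE or its YAML loader handles the string is not shown in this thread, so treat this as an illustration of the general point; `same_path` is a hypothetical helper.

```python
from pathlib import PureWindowsPath

# Four spellings of the model directory from this thread.
raw_backslashes = PureWindowsPath(r"C:\Support\work\model\bart-large")
doubled_backslashes = PureWindowsPath("C:\\Support\\work\\model\\bart-large")
forward_slashes = PureWindowsPath("C:/Support/work/model/bart-large")
doubled_forward = PureWindowsPath("C://Support//work//model//bart-large")


def same_path(*paths):
    """Hypothetical helper: True if every path equals the first one."""
    return all(p == paths[0] for p in paths)


all_equal = same_path(
    raw_backslashes, doubled_backslashes, forward_slashes, doubled_forward
)
print(all_equal)

# as_posix() gives a forward-slash form that needs no escaping in most configs.
print(forward_slashes.as_posix())
```

Forward slashes are usually the least error-prone choice, since they need no escaping in Python strings or in quoted YAML values.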

panhustar commented 6 days ago

Hello, here is the configuration in DeepKE\example\ner\few-shot\conf\train\few_shot.yaml:

    seed: 1

    bart_name: "C://Support//work//model//bart-large"
    dataset_name: mit-movie
    device: cpu

    num_epochs: 20
    batch_size: 3
    learning_rate: 5e-5
    warmup_ratio: 0.01
    eval_begin_epoch: 4
    src_seq_ratio: 0.8
    tgt_max_len: 10
    num_beams: 1
    length_penalty: 1

    use_prompt: True
    prompt_len: 10
    prompt_dim: 800

    freeze_plm: True
    learn_weights: True
    save_path: "C://Support//code//DeepKE//example//ner//few-shot//model"
    load_path: null
    notes: ''

Here is the configuration in DeepKE/example/ner/few-shot/conf/predict.yaml:

    cwd: ???

    seed: 1

    bart_name: "C://Support//work//model//bart-large"
    dataset_name: conll2003
    device: cpu

    num_epochs: 30
    batch_size: 16
    learning_rate: 2e-5
    warmup_ratio: 0.01
    eval_begin_epoch: 16
    src_seq_ratio: 0.6
    tgt_max_len: 10
    num_beams: 1
    length_penalty: 1

    use_prompt: True
    prompt_len: 10
    prompt_dim: 800

    freeze_plm: True
    learn_weights: True
    notes: ''
    save_path: null # model save path
    load_path: load_path # model load path, must not be empty
    write_path: "data/conll2003/predict.txt"

It still fails:

    Traceback (most recent call last):
      File "run.py", line 122, in main
        trainer.train()
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\module\train.py", line 71, in train
        loss = self._step(batch, mode="train")
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\module\train.py", line 175, in _step
        pred = self.model(src_tokens, tgt_tokens, src_seq_len, first)
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 410, in forward
        return self.prompt_model(src_tokens, tgt_tokens, src_seq_len, first)
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 253, in forward
        prompt_state = self.generator(src_tokens, src_seq_len, first)
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 261, in generator
        past_key_values = self.get_prompt(batch_size) if self.use_prompt else None
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 270, in get_prompt
        input_tokens = self.prompt_inputs.unsqueeze(0).expand(batch_size, -1).to(self.device)
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\torch\cuda\__init__.py", line 210, in _lazy_init
        raise AssertionError("Torch not compiled with CUDA enabled")
    AssertionError: Torch not compiled with CUDA enabled

Could you please take a look at why CUDA is still being used?

BeasterYong commented 4 days ago

Hello, did you install the GPU build of PyTorch? You need to choose the CPU build when installing.

panhustar commented 4 days ago

I checked the torch version; it is the CPU build, 1.11.0+cpu:

    (deepke) C:\WINDOWS\system32>python -c "import torch; print(torch.__version__)"
    1.11.0+cpu
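The "+cpu" suffix marks a CPU-only PyTorch build, so moving a tensor to a CUDA device raises the assertion seen in the tracebacks. Below is a minimal sketch of a defensive device choice that falls back to the CPU when CUDA is unavailable. `resolve_device` is a hypothetical helper for illustration, not part of DeepKE; the only PyTorch call used is `torch.cuda.is_available()`, and the import is optional so the sketch also runs without torch.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Hypothetical helper: fall back to "cpu" when CUDA is requested but absent."""
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested


try:
    import torch  # optional: only used to query CUDA support

    cuda_ok = torch.cuda.is_available()
except ImportError:
    cuda_ok = False  # assumption: without torch, treat CUDA as unavailable

# On a "+cpu" build this resolves to "cpu" even if a config asks for CUDA.
print(resolve_device("cuda:0", cuda_ok))
```

A check like this only helps if the code that calls `.to(...)` actually reads the resolved value; it does not explain where the CUDA device in this thread came from.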

BeasterYong commented 4 days ago

Hello, in that case it still looks like a configuration problem. Try changing device in example/ner/few-shot/conf/train/conll.yaml and then retry.
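Because several YAML files each carry their own device key, it can help to list every setting in one pass rather than opening the files one by one. The sketch below is a plain line scan using only the standard library. `parse_device` and `find_device_settings` are hypothetical helpers, the `conf` directory name is an assumption based on the paths quoted in this thread, and the scan does not model how Hydra composes or overrides configs.

```python
import re
from pathlib import Path

# Matches a "device: value" line; an inline "# comment" is excluded from the value.
DEVICE_LINE = re.compile(r"^[ \t]*device[ \t]*:[ \t]*(?P<value>[^#]*)")


def parse_device(text):
    """Return the first device value found in YAML text, or None if absent."""
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            return match.group("value").strip().strip("'\"")
    return None


def find_device_settings(conf_dir):
    """Map each *.yaml file under conf_dir to its device value, if it has one."""
    root = Path(conf_dir)
    if not root.is_dir():
        return {}
    settings = {}
    for path in sorted(root.rglob("*.yaml")):
        device = parse_device(path.read_text(encoding="utf-8"))
        if device is not None:
            settings[str(path)] = device
    return settings


# "conf" is an assumed location; run this from example/ner/few-shot.
for file_name, device in find_device_settings("conf").items():
    print(f"{file_name}: device = {device}")
```

If every file reports cpu and the error persists, the device is probably being set somewhere other than these files.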

panhustar commented 4 days ago

It still fails with the following error:

      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\deepke\name_entity_re\few_shot\models\model.py", line 270, in get_prompt
        input_tokens = self.prompt_inputs.unsqueeze(0).expand(batch_size, -1).to(self.device)
      File "C:\ProgramData\anaconda3\envs\deepke\lib\site-packages\torch\cuda\__init__.py", line 210, in _lazy_init
        raise AssertionError("Torch not compiled with CUDA enabled")
    AssertionError: Torch not compiled with CUDA enabled

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

I have already changed every YAML file involved in the configuration to cpu. Here is example/ner/few-shot/conf/train/conll.yaml:

    seed: 1

    bart_name: "C://Support//work//model//bart-large"
    dataset_name: conll2003
    device: cpu

    num_epochs: 30
    batch_size: 16
    learning_rate: 2e-5
    warmup_ratio: 0.01
    eval_begin_epoch: 16
    src_seq_ratio: 0.6
    tgt_max_len: 10
    num_beams: 1
    length_penalty: 1

    use_prompt: True
    prompt_len: 10
    prompt_dim: 800

    freeze_plm: True
    learn_weights: True
    save_path: save path # model save path
    load_path: null
    notes: ''

The configuration file used by the command is \example\ner\few-shot\conf\train\few_shot.yaml:

    seed: 1

    bart_name: "C://Support//work//model//bart-large"
    dataset_name: mit-movie
    device: cpu

    num_epochs: 20
    batch_size: 3
    learning_rate: 5e-5
    warmup_ratio: 0.01
    eval_begin_epoch: 4
    src_seq_ratio: 0.8
    tgt_max_len: 10
    num_beams: 1
    length_penalty: 1

    use_prompt: True
    prompt_len: 10
    prompt_dim: 800

    freeze_plm: True
    learn_weights: True
    save_path: "C://Support//code//DeepKE//example//ner//few-shot//model"
    load_path: null
    notes: ''

One more detail: the \anaconda3\envs\deepke\Lib\site-packages\torch folder contains both CUDA and CPU subfolders, but the code called \torch\cuda\__init__.py instead of \torch\cpu\__init__.py.

The torch version is 1.11.0+cpu, the Python version is 3.8, and the environment is Windows.

panhustar commented 1 day ago

Thank you very much, the problem is solved. Reinstalling deepke into site-packages fixed it.
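The thread does not say why reinstalling helped. One plausible reading, stated here as an assumption, is that the copy of deepke in site-packages (the one the tracebacks show running) was stale. A minimal sketch for checking which copy of a package Python would import, using only the standard library; `locate_package` is a hypothetical helper.

```python
import importlib.util


def locate_package(name):
    """Return the file Python would load for a top-level package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # the package is not importable in this environment
    return spec.origin


# Prints a path under site-packages when deepke is installed, otherwise None.
print(locate_package("deepke"))
```

If the printed path points at site-packages rather than a local clone, edits to the clone's source have no effect until the package is reinstalled.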

zxlzr commented 1 day ago

OK, feel free to ask if you have any other questions.