NER task in the examples - GitHub Issues

dcosmice commented 5 months ago

On Windows, running run_bert.py always ends with "Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace."

(venv) PS C:\Users\Ylab\DeepKE\example\ner\standard> python run_bert.py
C:\Users\Ylab\DeepKE\venv\lib\site-packages\hydra\plugins\config_source.py:190: UserWarning: Missing @package directive hydra/output/custom.yaml in file://C:\Users\Ylab\DeepKE\example\ner\standard\conf.
See https://hydra.cc/docs/next/upgrades/0.11_to_1.0/adding_a_package_directive
  warnings.warn(message=msg, category=UserWarning)
C:\Users\Ylab\DeepKE\venv\lib\site-packages\hydra\plugins\config_source.py:190: UserWarning: Missing @package directive train.yaml in file://C:\Users\Ylab\DeepKE\example\ner\standard\conf.
See https://hydra.cc/docs/next/upgrades/0.11_to_1.0/adding_a_package_directive
  warnings.warn(message=msg, category=UserWarning)
C:\Users\Ylab\DeepKE\venv\lib\site-packages\hydra\plugins\config_source.py:190: UserWarning: Missing @package directive predict.yaml in file://C:\Users\Ylab\DeepKE\example\ner\standard\conf.
See https://hydra.cc/docs/next/upgrades/0.11_to_1.0/adding_a_package_directive
  warnings.warn(message=msg, category=UserWarning)
C:\Users\Ylab\DeepKE\venv\lib\site-packages\hydra\plugins\config_source.py:190: UserWarning: Missing @package directive hydra/model/bert.yaml in file://C:\Users\Ylab\DeepKE\example\ner\standard\conf.
See https://hydra.cc/docs/next/upgrades/0.11_to_1.0/adding_a_package_directive
  warnings.warn(message=msg, category=UserWarning)
Traceback (most recent call last):
  File "run_bert.py", line 109, in main
    tokenizer = BertTokenizer.from_pretrained(cfg.bert_model, do_lower_case=cfg.do_lower_case)
  File "C:\Users\Ylab\DeepKE\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1763, in from_pretrained
  File "C:\Users\Ylab\DeepKE\venv\lib\site-packages\transformers\utils\hub.py", line 409, in cached_file
    resolved_file = hf_hub_download(
  File "C:\Users\Ylab\DeepKE\venv\lib\site-packages\huggingface_hub\utils_validators.py", line 124, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\Users\Ylab\DeepKE\venv\lib\site-packages\huggingface_hub\file_download.py", line 1148, in hf_hub_download
    with open(ref_path) as f:
PermissionError: [Errno 13] Permission denied: 'C:\Users\Ylab/.cache\huggingface\hub\models--bert-base-chinese\refs\main'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Why does the path in the error contain a single forward slash (/)?

How can I fix this?
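On the forward slash: Windows path handling generally treats `/` and `\` as interchangeable separators, so the mixed path in the PermissionError most likely points at the same file as the all-backslash form. The sketch below only illustrates that equivalence with the standard library's `ntpath` module (Windows path rules, importable on any OS), using the path copied from the traceback. It does not explain the permission error itself.

```python
import ntpath  # Windows path semantics, importable on any operating system

# Path exactly as printed in the PermissionError (note the single "/").
mixed = r"C:\Users\Ylab/.cache\huggingface\hub\models--bert-base-chinese\refs\main"

# normpath rewrites forward slashes as backslashes and collapses redundant parts.
normalized = ntpath.normpath(mixed)
print(normalized)
```

If the normalized path is the expected cache location, the separator is probably not the culprit, and the file's permissions or the cache state are the more likely place to look.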

xxupiano commented 5 months ago

Are you letting BERT download automatically, or did you download BERT locally and fill in the model path by hand?

dcosmice commented 5 months ago

About filling in the path by hand, as you mentioned: in which folder is the model path changed? I already have RE running, but the NER task fails.

xxupiano commented 5 months ago

https://github.com/zjunlp/DeepKE/blob/main/example/ner/standard/conf/hydra/model/bert.yaml#L2 On Windows, please use \ in the path.
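Before putting a local directory into that config (the traceback shows the value being read as `cfg.bert_model`), it may help to confirm that the directory really looks like a downloaded BERT checkpoint. The following is a minimal sketch, not part of DeepKE. The helper name `check_local_bert_dir` is hypothetical, and the assumption is that a usable checkpoint directory contains at least `config.json` and `vocab.txt`.

```python
from pathlib import Path

# Assumption: a local BERT checkpoint directory holds at least these two files.
REQUIRED_FILES = ("config.json", "vocab.txt")


def check_local_bert_dir(model_dir: str) -> Path:
    """Return model_dir as a Path, or raise if it does not look like a local BERT checkpoint."""
    path = Path(model_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"not a directory: {path}")
    # Collect every required file that is absent so the error names all of them.
    missing = [name for name in REQUIRED_FILES if not (path / name).is_file()]
    if missing:
        raise FileNotFoundError(f"{path} is missing: {', '.join(missing)}")
    return path
```

With a hypothetical download location, `check_local_bert_dir(r"C:\models\bert-base-chinese")` would either return the path to write into the config or report what is missing.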

dcosmice commented 5 months ago

OK, thanks, I have corrected it.

zxlzr commented 5 months ago

Has your problem been solved?

dcosmice commented 5 months ago

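Not yet. Hello, after running predict.py I only get the following result: [screenshot 2024-05-10 091355]

So I have two questions:

1. What does each of these labels mean, and in which file can I find their descriptions? (For example, I know PER stands for person, but what do B-PER, I-PER and so on refer to?)
2. The output of this task seems to differ from the output of the RE task. How do I take the NER task further, so that an input sentence yields its persons, locations and organizations? For example, input: 本报北京9月4日讯记者杨涌报道：部分省区人民日报宣传发行工作座谈会9月3日在4日在京举行; output: Person-杨涌, similar to the example you show.

Looking forward to your reply!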

xxupiano commented 5 months ago

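These are NER BIO tags: B- marks the first token of an entity mention, I- marks each following token of the same mention, and O marks tokens outside any entity.
To specify the text you want to run NER on, edit https://github.com/zjunlp/DeepKE/blob/main/example/ner/standard/conf/predict.yaml#L1.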
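To make the BIO scheme concrete, here is a small illustrative sketch that turns entity spans into one tag per character. The helper `spans_to_bio` and the label name `PER` are assumptions for illustration only. The actual label set comes from the training data, and this is not DeepKE code.

```python
def spans_to_bio(text, spans):
    """Tag each character of text; spans are (start, end, label) with end exclusive."""
    tags = ["O"] * len(text)  # every character starts outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"  # first character of the mention
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # remaining characters of the same mention
    return tags


text = "记者杨涌报道"
tags = spans_to_bio(text, [(2, 4, "PER")])  # 杨涌 occupies characters 2 and 3
for ch, tag in zip(text, tags):
    print(ch, tag)
```

Here 杨 receives B-PER, 涌 receives I-PER, and the other four characters receive O, which is why the prediction output lists one tag per character.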

dcosmice commented 5 months ago

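OK, thanks. One more question: is there an improved version of the code that outputs whole entities directly, e.g. (person: 杨涌), rather than one character at a time? I tried the commented-out code, but it does not seem to work.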

xxupiano commented 5 months ago

There is reference code here that you can adapt as needed: https://github.com/zjunlp/DeepKE/blob/main/src/deepke/name_entity_re/standard/models/InferBert.py#L167-L189
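As a rough idea of what such a modification could look like, the sketch below merges per-character BIO tags into whole entities. It is a generic BIO decoder, not the code in InferBert.py. The function name `bio_to_entities` is hypothetical, and it assumes the prediction is available as parallel lists of characters and tags.

```python
def bio_to_entities(tokens, tags):
    """Merge per-token BIO tags into a list of (label, text) entities."""
    entities = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new mention starts; flush any mention still open.
            if current_tokens:
                entities.append((current_label, "".join(current_tokens)))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_label:
            # Continuation of the open mention with a matching label.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open mention:
            # close the open mention and skip this token.
            if current_tokens:
                entities.append((current_label, "".join(current_tokens)))
            current_tokens, current_label = [], None
    if current_tokens:  # flush a mention that runs to the end of the input
        entities.append((current_label, "".join(current_tokens)))
    return entities


tokens = list("记者杨涌报道北京")
tags = ["O", "O", "B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC"]
print(bio_to_entities(tokens, tags))
```

Characters are joined without a separator, which suits Chinese text. For space-delimited languages the join would need to change.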

dcosmice commented 5 months ago

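Thanks. I realize I have asked a lot of stupid questions, so here is one more: if I want to change the dataset and retrain, annotating my own text and defining new labels, do you have any advice on the basic steps or overall workflow? (Apart from replacing the three dataset files, are there other text files or dictionaries that need to be replaced? I am leaving bugs aside for now.)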

zjunlp / DeepKE

NER task in the examples #497