Training on a new dataset - Githubissues

xxllp commented 2 years ago

Agreement

[x] Fill the space in brackets with x to check the agreement items.
[ ] Before submitting this issue, I've fully checked the instructions in README.md.
[ ] Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
[ ] This issue is about the toolkit itself, not Python, pip or other programming basics.
[ ] I understand if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

How should I get started with preprocessing the training data for my own new dataset? Is there a step-by-step guide?

Environment

Environment	Values
System	Windows/Linux
GPU Device
CUDA Version
Python Version
PyTorch Version
dee (the Toolkit) Version

Spico197 commented 2 years ago

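You can step through the code in a debugger using one of the existing datasets to see what each module does. The discussion in this issue may also help: https://github.com/Spico197/DocEE/issues/41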

xxllp commented 2 years ago

Another question: does this code support multiple mentions for a single role within one event? For some roles, several entities appear consecutively.

Spico197 commented 2 years ago

See the discussion in this issue: https://github.com/Spico197/DocEE/issues/38#issuecomment-1176207177

xxllp commented 2 years ago

Thanks a lot, the data runs now. Is prediction directly on a single file already supported?

Spico197 commented 2 years ago

inference.py provides an example of predicting a single instance. To predict a whole file, I'd suggest writing batched prediction yourself; it will be a bit faster.
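A minimal sketch of the batching suggested above. It does not use the toolkit's API: `predict_batch` is a hypothetical callable standing in for whatever batched model call you write yourself, and `iter_batches` and `predict_file` are illustrative helper names.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_file(
    docs: Sequence[T],
    predict_batch: Callable[[Sequence[T]], List[R]],  # hypothetical: your own batched predictor
    batch_size: int = 8,
) -> List[R]:
    """Run `predict_batch` over all documents and keep results in input order."""
    results: List[R] = []
    for batch in iter_batches(docs, batch_size):
        results.extend(predict_batch(batch))
    return results


if __name__ == "__main__":
    # Stand-in predictor that returns each document's length.
    fake_predict = lambda batch: [len(doc) for doc in batch]
    print(predict_file(["a", "bb", "ccc"], fake_predict, batch_size=2))
```

The helper only handles the chunking; tokenization, padding, and moving tensors to the device would live inside your own `predict_batch`.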

xxllp commented 2 years ago

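I noticed that model evaluation compares gold_span and predict_span results. The former uses the NER ground truth (GT), right?
On my data, the role F1 under predict is more than 10 percentage points lower than under gold_span.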

xxllp commented 2 years ago

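Also, all of my metrics on the test set are currently 0, while those on dev are normal. What could cause this? The dev and test files currently have exactly the same format.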

Spico197 commented 2 years ago

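> I noticed that model evaluation compares gold_span and predict_span results. The former uses the NER ground truth (GT), right? On my data, the role F1 under predict is more than 10 percentage points lower than under gold_span.

What do you mean by "the NER GT"? I didn't quite follow. And what do you mean by role F1?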

Spico197 commented 2 years ago

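> Also, all of my metrics on the test set are currently 0, while those on dev are normal. What could cause this? The dev and test files currently have exactly the same format.

I'm not sure; this needs further checking.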
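The cause was not identified in this thread. One inexpensive check is to confirm that the two files really share the same structure, including nested fields that may be empty in only one of them. The sketch below is generic JSON tooling, not part of the toolkit; `key_paths` and `schema_diff` are illustrative names, and the `id` and `events` fields in the example are hypothetical.

```python
from typing import Any, Dict, List, Set


def key_paths(obj: Any, prefix: str = "") -> Set[str]:
    """Collect every dictionary key path in a JSON-like object; lists are marked with []."""
    paths: Set[str] = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            paths.add(path)
            paths |= key_paths(value, path)
    elif isinstance(obj, list):
        for item in obj:
            paths |= key_paths(item, prefix + "[]")
    return paths


def schema_diff(dev_data: Any, test_data: Any) -> Dict[str, List[str]]:
    """Report key paths that occur in only one of the two datasets."""
    dev_paths, test_paths = key_paths(dev_data), key_paths(test_data)
    return {
        "only_in_dev": sorted(dev_paths - test_paths),
        "only_in_test": sorted(test_paths - dev_paths),
    }


if __name__ == "__main__":
    # Hypothetical records: the test split has no event annotations at all.
    dev = [{"id": 1, "events": [{"type": "A"}]}]
    test = [{"id": 2, "events": []}]
    print(schema_diff(dev, test))
```

In practice both arguments would come from `json.load` on the dev and test files. A path that appears only on the dev side, such as an event field, would suggest that the test split carries no gold annotations, which is one way scores can end up at 0.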

xxllp commented 2 years ago

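> > I noticed that model evaluation compares gold_span and predict_span results. The former uses the NER ground truth (GT), right? On my data, the role F1 under predict is more than 10 percentage points lower than under gold_span.
>
> What do you mean by "the NER GT"? I didn't quite follow. And what do you mean by role F1?

The output directory contains files named like dee_eval.dev.gold_span.TriggerAwarePrunedCompleteGraph.json. I assume gold_span means gold NER is used, right? By role F1 I mean the MacroF1 under overall-overall in that JSON, i.e. the F1 over all roles.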

xxllp commented 2 years ago

In the JSON returned by predict_one, how are comments and event_list related? Why does event_list contain fewer arguments than comments?

Spico197 commented 2 years ago

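> > > I noticed that model evaluation compares gold_span and predict_span results. The former uses the NER ground truth (GT), right? On my data, the role F1 under predict is more than 10 percentage points lower than under gold_span.
> >
> > What do you mean by "the NER GT"? I didn't quite follow. And what do you mean by role F1?
>
> The output directory contains files named like dee_eval.dev.gold_span.TriggerAwarePrunedCompleteGraph.json. I assume gold_span means gold NER is used, right? By role F1 I mean the MacroF1 under overall-overall in that JSON, i.e. the F1 over all roles.

Yes, that's right: gold_span means gold-standard entities are used when producing predictions. What you call role F1 is what we call the overall F1, because the event type has to match first. NER matters a lot in document-level event extraction, so the overall F1 with gold NER is much higher.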
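To quantify the gap between the gold-span and predicted-span runs, the two evaluation results can be compared on the same metric. This is a sketch only: the nesting `overall` → `overall` → `MacroF1` is an assumption based on the description above, not a verified file layout, and the numbers in the example are made up for illustration.

```python
from typing import Any, Mapping, Sequence


def get_nested(metrics: Mapping[str, Any], path: Sequence[str]) -> Any:
    """Follow `path` key by key into a nested mapping; raises KeyError if a key is missing."""
    node: Any = metrics
    for key in path:
        node = node[key]
    return node


def f1_gap(gold_metrics: Mapping[str, Any],
           pred_metrics: Mapping[str, Any],
           path: Sequence[str]) -> float:
    """Gold-span score minus predicted-span score for the metric at `path`."""
    return get_nested(gold_metrics, path) - get_nested(pred_metrics, path)


if __name__ == "__main__":
    # Assumed key path and illustrative values; adjust both to the real files.
    metric_path = ("overall", "overall", "MacroF1")
    gold = {"overall": {"overall": {"MacroF1": 0.75}}}
    pred = {"overall": {"overall": {"MacroF1": 0.5}}}
    print(f1_gap(gold, pred, metric_path))
```

With real runs, `gold` and `pred` would be the parsed contents of the gold-span and predicted-span evaluation JSON files; the difference is the share of the score lost to entity recognition errors.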

Spico197 commented 2 years ago

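> In the JSON returned by predict_one, how are comments and event_list related? Why does event_list contain fewer arguments than comments?

Because not every entity is an argument that participates in an event.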
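To see how many recognized entities end up outside every event, the entity list can be compared against the arguments that were actually filled. The sketch assumes you have already extracted a flat list of entity strings and a list of role-to-argument dictionaries; the real layout of the predict_one output is not shown in this thread, so `unused_entities`, the role names, and the values below are all hypothetical.

```python
from typing import Dict, List, Optional


def unused_entities(entities: List[str],
                    events: List[Dict[str, Optional[str]]]) -> List[str]:
    """Return entities that fill no role in any event, preserving input order."""
    used = {
        argument
        for event in events
        for argument in event.values()
        if argument is not None  # empty roles carry no entity
    }
    return [entity for entity in entities if entity not in used]


if __name__ == "__main__":
    # Hypothetical data: three recognized entities, one event with one filled role.
    entities = ["A Corp", "B Corp", "2020-01-01"]
    events = [{"role_a": "A Corp", "role_b": None}]
    print(unused_entities(entities, events))
```

A long list of leftovers that should have been arguments according to the gold annotation points at the event-assembly step rather than at entity recognition.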

xxllp commented 2 years ago

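But I checked, and most of them look like missing arguments; only a few are entities that genuinely don't participate. Strange.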

Spico197 commented 2 years ago

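> But I checked, and most of them look like missing arguments; only a few are entities that genuinely don't participate. Strange.

That is indeed strange; it may be a latent bug. Did you train on your own dataset? Do you see this problem with the models released in the repo? I'll see whether I can reproduce it.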

xxllp commented 2 years ago

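It's my own dataset; I haven't looked closely at the released models. The NER span F1 is already 0.90+, but the final event role F1 is only 0.82, which is a clear gap. My guess is that after the connection graph step, some entity-to-entity connections are all 0, so those arguments get dropped.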

xxllp commented 2 years ago

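A new question: does anything need to be changed for multi-GPU training of the PTPCG model? When I launch it directly with scripts/train_multi.sh, only one GPU is actually running.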

Spico197 commented 2 years ago

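> A new question: does anything need to be changed for multi-GPU training of the PTPCG model? When I launch it directly with scripts/train_multi.sh, only one GPU is actually running.

You can follow how the Doc2EDAG script is launched and add the --parallel_decorate flag. Since the current discussion is unrelated to this issue, I'm closing it for now; feel free to open a new issue for other questions.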

Spico197 / DocEE

Training on a new dataset #46
