Cannot reproduce the results reported in the paper; looking for tips from anyone who succeeded

chendierong commented 3 years ago

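I tried to reproduce the results on the Weibo and Resume datasets using the author's code, without any hyperparameter tuning and with the YJ lexicon. On Resume the F-score is 94.45%, and 95.73% with BERT added. On Weibo the corresponding scores are 59.71% and 66.99%. None of these reach the numbers in the paper. If anyone has reproduced the results, could you share which parameters mainly need tuning? Also, do the datasets need any extra preprocessing? The datasets I am currently using come from:
Resume dataset: https://github.com/jiesutd/LatticeLSTM
Weibo dataset (2nd): https://github.com/hltcoe/golden-horse/tree/master/data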

EeyoreLee commented 3 years ago

Replying to the original post: I reproduced the Weibo result with f=0.685573, pre=0.707379, rec=0.665072.
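
As a quick sanity check, the three numbers in the comment above can be tested for internal consistency. This is a minimal sketch that assumes the reported F value is the standard harmonic mean of precision and recall; the function name `f1_score` is illustrative and not taken from the repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the comment above.
reported_f = 0.685573
computed_f = f1_score(0.707379, 0.665072)

# The inputs are rounded to six decimals, so compare with a small tolerance.
is_consistent = abs(computed_f - reported_f) < 1e-5
print(f"computed={computed_f:.6f} reported={reported_f:.6f} consistent={is_consistent}")
```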

EeyoreLee commented 3 years ago

https://github.com/ear-lee/EarleeNLP/commit/5435e120f8ff13a08b6182ee5e7b750ac030e10d

These are the parameters I used; they should simply be the defaults. Running run_chinese_ner as-is reproduces the result.

rongjunlee commented 3 years ago

Replying to the original post: my results are similar to yours. Have you been able to reproduce the paper's numbers since then?

wing7171 commented 2 years ago

EeyoreLee wrote: I reproduced the Weibo result with f=0.685573, pre=0.707379, rec=0.665072.

This result is higher than the figure reported in the paper. What do you think the reason is, and is that normal? (On my side the F1 with the YJ lexicon reaches 0.63, which is also slightly higher than in the paper, and I don't know why.)

fengy-l commented 2 years ago

EeyoreLee wrote: I reproduced the Weibo result with f=0.685573, pre=0.707379, rec=0.665072.

wing7171 wrote: This is higher than the figure given in the paper. What do you think the reason is?

Were those numbers taken from the dev set? Shouldn't the reported result come from taking the parameters with the best dev score and evaluating those on the test set?
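
The selection protocol described in this comment can be sketched as follows. This is only an illustration under stated assumptions: the `history` records and the helper `select_by_dev` are hypothetical, the numbers are made up, and nothing here reflects how the repository actually logs or selects checkpoints.

```python
from typing import List, Tuple

# Each record is (epoch, dev_f1, test_f1).
Record = Tuple[int, float, float]


def select_by_dev(history: List[Record]) -> Record:
    """Return the record with the highest dev F1 (earliest epoch on ties)."""
    if not history:
        raise ValueError("history must not be empty")
    return max(history, key=lambda r: (r[1], -r[0]))


# Made-up numbers purely for illustration; not results from any real run.
history = [(1, 0.60, 0.58), (2, 0.66, 0.63), (3, 0.65, 0.67)]

epoch, dev_f1, test_f1 = select_by_dev(history)
best_test_overall = max(r[2] for r in history)

# Reporting best_test_overall would mean selecting on the test set,
# which overstates the score compared with dev-based selection.
print(f"dev-selected epoch={epoch}, test F1={test_f1}")
print(f"best test F1 over all epochs={best_test_overall}")
```

Comparing the two printed values for a real training log shows how much of a reported gain comes from picking the epoch on the test set rather than on dev.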

fengy-l commented 2 years ago

Has anyone managed to reproduce the results?

C929-x commented 1 year ago

Traceback (most recent call last):
  File "flat_main.py", line 299, in <module>
    datasets,vocabs,embeddings = equip_chinese_ner_with_lexicon(datasets,vocabs,embeddings,
  File "D:\Anaconda3\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
    with open(cache_filepath, 'wb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'cache\weibo_lattice_only_train/False_trainClip/True_norm_num/0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'

Has anyone else run into this problem?
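
The failing call in the traceback is `open(cache_filepath, 'wb')`, and opening a file for writing raises `FileNotFoundError` when a directory in the path does not exist. One possible workaround, sketched below, is to create the parent directories before writing. `save_cache` is a hypothetical helper written for illustration; it is not part of fastNLP or this repository, and `pickle` is used only as a stand-in serializer.

```python
import os
import pickle
import tempfile


def save_cache(obj, cache_filepath: str) -> None:
    """Write obj to cache_filepath, creating missing parent directories first."""
    parent = os.path.dirname(cache_filepath)
    if parent:
        # exist_ok=True makes this a no-op when the directory already exists.
        os.makedirs(parent, exist_ok=True)
    with open(cache_filepath, "wb") as f:
        pickle.dump(obj, f)


# Usage: write into a nested directory that does not exist yet.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache", "nested", "example.pkl")
    save_cache({"ok": True}, path)
    print(os.path.exists(path))
```

Without touching any library code, creating the missing directories by hand before launching flat_main.py may have the same effect, assuming the missing directory is indeed the cause.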

zhangliang-chn commented 11 months ago

Did the original poster solve this? I am now running into the same problem.

EeyoreLee commented 11 months ago

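@fengy-l @wing7171 Hi, I didn't see your replies earlier. My result is not actually higher than the paper's: the paper reports 68.55% and I reproduced 68.5573%, so the difference is only a matter of rounding precision.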
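
The comparison can be made explicit at two decimal places. The sketch below uses only the two figures quoted in the comment and Python's standard `decimal` module; the variable names are illustrative. The two-decimal value depends on whether the trailing digits are rounded or truncated, but in either case the gap to the paper's figure is under 0.01 percentage points.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

reproduced = Decimal("68.5573")  # F1 in percent, from the comment above
paper = Decimal("68.55")         # F1 in percent, as quoted from the paper
two_places = Decimal("0.01")

# Conventional rounding versus simple truncation to two decimals.
rounded = reproduced.quantize(two_places, rounding=ROUND_HALF_UP)
truncated = reproduced.quantize(two_places, rounding=ROUND_DOWN)

# Absolute gap in percentage points.
gap = reproduced - paper

print(f"rounded={rounded} truncated={truncated} gap={gap}")
```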

zhangliang-chn commented 11 months ago

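Hi all, I tried to reproduce the results on the Resume dataset and got an F1 score of 94.9%, which falls short of the 95.45% in the paper. Is this normal, or does it need hyperparameter tuning?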

LeeSureman / Flat-Lattice-Transformer

Cannot reproduce the results reported in the paper; looking for tips from anyone who succeeded #68