Closed zycdev closed 5 years ago
Hi, thank you for pointing it out. I made some minor modifications to the pre-processing scripts after generating the examples, but I do not think that is the main reason (maybe I am wrong). In my experiments, learning_rate, batch_size, early-stopping strategies (if you add them), and some other parameters can affect the results by up to 10%. Maybe you can try deleting the linear_warm_up in task #2 (I realized that after finishing the paper)?
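If you want to experiment with removing the warmup, the two schedules can be sketched as scale factors applied to the base learning rate. This is a minimal illustration; the function names are hypothetical and not CogQA's actual API (the repo presumably uses a BERT-style `warmup_linear` schedule):

```python
def warmup_linear(progress, warmup=0.1):
    """Linear warmup to the peak LR, then linear decay (common BERT schedule).

    `progress` is the fraction of training completed, in [0, 1].
    """
    if progress < warmup:
        return progress / warmup
    return max(0.0, (1.0 - progress) / (1.0 - warmup))

def constant(progress, warmup=0.1):
    """No warmup: the full learning rate for the whole run."""
    return 1.0

# At 5% of training, warmup is still ramping up while the constant
# schedule already applies the full rate.
print(warmup_linear(0.05), constant(0.05))  # 0.5 1.0
```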
Hi @Sleepychord, thank you very much for your reply! I retrained BERT for 1 epoch and then BERT & GNN for 1 epoch with the hyperparameters shown in the paper, but I still can't reproduce the paper's result on the dev set.
My training commands:
export CUDA_VISIBLE_DEVICES=0,1,2,3 # 4 K80(12GB memory) GPUs
python train.py --batch-size=10 --lr1=1e-4
python train.py --load=True --mode='bundle' --batch-size=10 --lr1=4e-5 --lr2=1e-4 # haven't deleted the linear_warm_up yet
and my evaluation result on dev set:
{'em': 0.2598244429439568, 'f1': 0.35564370767865855, 'prec': 0.37582762612134724, 'recall': 0.35888658012669966, 'sp_em': 0.07562457798784605, 'sp_f1': 0.3665706092242228, 'sp_prec': 0.4997955049676863, 'sp_recall': 0.3207705540014783, 'joint_em': 0.03349088453747468, 'joint_f1': 0.19135653981707093, 'joint_prec': 0.2720478977096129, 'joint_recall': 0.17037639264369026}
Could you provide more details about the hyperparameters and training strategy behind your best experimental result? I am looking forward to your advice.
Thanks!
Hi @zycdev ,
I'm not sure what problem you encountered, but I've successfully obtained reasonable results with the scripts you provided. I also made an improved version of CogQA here, which is much faster and far less resource-demanding for task 2, with slightly better results. You can try that out.
Hope this helps!
@zycdev, I think tuning the learning_rate in task #2 is effective. Thanks to @qibinc for the improvement; maybe you can follow it.
@qibinc @Sleepychord Thank you very much for your work, I am glad to try the new version!
Hi @zycdev ,
Here is an example for running the new version:
For task 1, run:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --batch-size 16 --expname test --weight-decay 0.01
For task 2, run:
CUDA_VISIBLE_DEVICES=0 python train.py --load --load-path saved/bert-base-uncased-test.bin --mode '#2' --lr1 2e-5 --gradient-accumulation-steps 8 --expname test --tune
(that's right, now we only need one GPU for task 2)
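The reason one GPU suffices here is the `--gradient-accumulation-steps 8` flag: gradients from several small mini-batches are summed before a single optimizer step, emulating a larger batch. A pure-Python sketch of the idea (hypothetical, not the repo's actual training loop), using a 1-D least-squares gradient for illustration:

```python
def grad(w, batch):
    # Gradient of the squared error 0.5 * (w*x - y)^2 w.r.t. w,
    # averaged over the mini-batch.
    return sum((w * x - y) * x for x, y in batch) / len(batch)

w = 0.0
lr = 0.1
accum_steps = 4
batches = [[(1.0, 2.0)], [(2.0, 4.0)], [(1.0, 2.0)], [(2.0, 4.0)]]

g = 0.0
for i, batch in enumerate(batches):
    g += grad(w, batch) / accum_steps   # scale and accumulate gradients
    if (i + 1) % accum_steps == 0:
        w -= lr * g                     # one optimizer step per accum_steps batches
        g = 0.0
print(w)  # same update a single batch of all 4 examples would produce
```

With batch size 2 and 8 accumulation steps, the effective batch size matches the 16 used for task 1, which is why one 12GB GPU is enough.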
For inference, run:
CUDA_VISIBLE_DEVICES=0 python infer.py --data-file data/hotpot_dev_fullwiki_v1_merge.json --model-file saved/bert-base-uncased-test.bin
Evaluation:
python scripts/hotpot_evaluate_v1.py data/hotpot_dev_fullwiki_v1_merge_pred.json data/hotpot_dev_fullwiki_v1_merge.json
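For reading the numbers this script prints (like the `joint_*` metrics in the dev results above): HotpotQA's joint metrics multiply the answer and supporting-fact precision/recall per example and then recompute F1. A rough sketch; treat the exact formula as an assumption based on the public evaluation script, not a guaranteed reimplementation:

```python
def joint_f1(ans_prec, ans_recall, sp_prec, sp_recall):
    """Combine answer and supporting-fact scores into joint P/R/F1."""
    p = ans_prec * sp_prec      # joint precision
    r = ans_recall * sp_recall  # joint recall
    if p + r == 0:
        return 0.0, p, r
    return 2 * p * r / (p + r), p, r

f1, p, r = joint_f1(0.8, 0.8, 0.5, 0.5)
print(f1, p, r)  # joint scores are much lower than either component
```

This product form is why joint EM/F1 drop sharply when either the answer or the supporting facts are off, as in the dev results quoted earlier in this thread.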
Hi @qibinc, thanks for the guide! I just want to ask about this command :-D
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --batch-size 16 --expname test --weight-decay 0.01
Hi, I see your code uses "hotpot_train_v1.1_refined.json" as the data. Does "refined" mean some change was made to the data?
@ditingdapeng "refined" means preprocessed: each QA pair gains two extra fields holding the nodes of the gold cognitive graph, extracted by fuzzy matching and similar algorithms. The data itself is otherwise unchanged.
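A minimal illustration of the kind of fuzzy matching described above, using only the standard library. The real extraction in process_train.py may differ; the function name and threshold here are illustrative:

```python
from difflib import SequenceMatcher

def fuzzy_find(entity, sentence, threshold=0.8):
    """Return True if `entity` approximately appears in `sentence`.

    Slides a window of the same word-length as the entity over the
    sentence and compares each span with a similarity ratio.
    """
    entity = entity.lower()
    words = sentence.lower().split()
    n = len(entity.split())
    for i in range(len(words) - n + 1):
        span = " ".join(words[i:i + n])
        if SequenceMatcher(None, entity, span).ratio() >= threshold:
            return True
    return False

print(fuzzy_find("Barack Obama", "barack obama was the 44th president"))  # True
```

Fuzzy rather than exact matching is needed because entity surface forms in the paragraphs often differ slightly from the Wikipedia titles.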
May I ask what method was used to produce this refined file? Is it the same processing as in the original paper? Thanks for the reply!
Hi,
I found it may cause a self-cycle in the following snippet. https://github.com/THUDM/CogQA/blob/217f0f12819c86413d315abf9d818da05c41cb9d/process_train.py#L91-L93
For example, after running process_train.py, I got a JSON object like this:

However, I think it should look like what is shown in your examples:
Could you explain what this snippet is for? By the way, I got a reproduction result on the dev set that is about 10% lower than the result in the paper, using 2 K80 GPUs. Do you think this snippet could be a reason for the low result?
Thank you!
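To make the self-cycle concern concrete: if the gold edge list is built by matching entity names against paragraphs, an entity can end up linked to its own paragraph, producing a self-loop in the cognitive graph. The guard below is a hypothetical fix for illustration, not the actual code in process_train.py:

```python
def build_edges(title, mentioned_entities):
    """Link a paragraph's title node to each entity it mentions,
    skipping the degenerate case where the entity IS the title
    (which would create a self-loop in the graph)."""
    return [(title, e) for e in mentioned_entities if e != title]

# "Paris" is mentioned in its own paragraph, so without the guard we
# would emit the self-edge ("Paris", "Paris").
edges = build_edges("Paris", ["France", "Paris", "Seine"])
print(edges)  # [('Paris', 'France'), ('Paris', 'Seine')]
```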