Reproduction Issues of the ATLOP Model on JacRED.

YoumiMa / JacRED

Repository for Japanese Document-level Relation Extraction Dataset (plan to be released in March).


Dear Youmi Ma, An Wang, Naoaki Okazaki,

I am a student from Xinjiang University, China, specializing in document-level relation extraction in natural language processing. Recently, I came across your work "Building a Japanese Document-Level Relation Extraction Dataset Assisted by Cross-Lingual Transfer," and I must say, your paper and the work on the dataset are impressive. Thank you for your contribution!

However, I ran into problems when trying to reproduce the results of your paper with the ATLOP model (Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling) on the JacRED dataset. My reproduction gives the following results: {'dev_F1': 17.578315208570036, 'dev_F1_ign': 16.4534137302226}. I am unsure where the problem lies. Could you tell me which settings you used to train ATLOP on JacRED? If possible, could you also share your environment and code settings so that I can reproduce the results of your paper?

Whether or not you are able to help, thank you and best wishes! I would be grateful for any assistance you can provide.

Best regards,

Xing Liu, Xinjiang University

Hi @liuxinmaping666, thank you for your interest in this project! I am really sorry for the late reply; I just returned from a business trip. The following is the script I used for ATLOP.

# First argument: run label, used for --display_name and the checkpoint name.
TYPE=$1
# Second argument: random seed.
SEED=$2

NAME=${TYPE}_${SEED}

python train.py --data_dir dataset/JacRED \
  --transformer_type bert \
  --display_name "${TYPE}" \
  --model_name_or_path tohoku-nlp/bert-base-japanese-v3 \
  --train_file train.json \
  --dev_file dev.json \
  --test_file test.json \
  --save_path "${NAME}.ckpt" \
  --train_batch_size 4 \
  --test_batch_size 8 \
  --gradient_accumulation_steps 1 \
  --num_labels 4 \
  --learning_rate 5e-5 \
  --max_grad_norm 1.0 \
  --warmup_ratio 0.06 \
  --num_train_epochs 30.0 \
  --seed "${SEED}" \
  --num_class 36

In addition, please make sure that:

the JacRED dataset is placed under {atlop}/dataset/, where {atlop} is the directory containing the ATLOP code;
line 4 of prepro.py, i.e., docred_rel2id, reads from JacRED/meta/rel2id.json;
line 6 of evaluation.py, i.e., rel2id, reads from JacRED/meta/rel2id.json.
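
The points above could be checked before training with a small Python sketch like the one below. It is not part of ATLOP or JacRED. The function name check_jacred_layout is hypothetical, and the sketch assumes that meta/ sits inside dataset/JacRED, that rel2id.json is a JSON object mapping relation names to integer ids, and that, as in DocRED, it includes the no-relation label, so its size should equal the value passed to --num_class.

```python
import json
from pathlib import Path


def check_jacred_layout(atlop_dir, num_class=36):
    """Return a list of problems found; an empty list means the layout looks right.

    Hypothetical helper, not part of ATLOP. `atlop_dir` is the directory that
    contains train.py; `num_class` should match the --num_class argument.
    """
    data_dir = Path(atlop_dir) / "dataset" / "JacRED"
    problems = []

    # The three split files named in the training script.
    for name in ("train.json", "dev.json", "test.json"):
        if not (data_dir / name).is_file():
            problems.append(f"missing file: {data_dir / name}")

    # Assumption: rel2id.json maps relation name -> integer id and
    # includes the no-relation label, as in DocRED.
    rel2id_path = data_dir / "meta" / "rel2id.json"
    if not rel2id_path.is_file():
        problems.append(f"missing file: {rel2id_path}")
    else:
        with open(rel2id_path, encoding="utf-8") as f:
            rel2id = json.load(f)
        if len(rel2id) != num_class:
            problems.append(
                f"rel2id has {len(rel2id)} entries, but --num_class is {num_class}"
            )
    return problems


if __name__ == "__main__":
    # Run from the ATLOP directory; prints one line per problem found.
    for problem in check_jacred_layout("."):
        print(problem)
```

If this prints nothing, the files are where the script expects them. It does not check that prepro.py and evaluation.py actually open the JacRED rel2id.json, so those two lines still need to be inspected by hand.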

I hope this solves your problem. Please let me know if you have further questions.
