@MrZilinXiao I am sorry for the delayed reply! I am working on this in this branch. This branch is work-in-progress, so I will notify you when I complete the work.
I have completed the work and the pretraining instructions are available here.
Hi @ikuyamada. Thanks for your great contribution. I have seen your commits and want to check the following with you:
By comparing your README commit (https://github.com/studio-ousia/luke/commit/c346f2656c2f5084e773603274c4c17b2fcfdb28) with the pretraining procedure for LUKE (https://github.com/studio-ousia/luke/blob/master/pretraining.md), is the only difference between them the hyperparameters (epochs, etc.) and the create_candidate_data.py script, which builds an entity vocabulary containing only the candidate entities rather than the 500k most common entities used for LUKE? If there is something I missed, please let me know :)
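For clarity, here is a minimal sketch of what I understand create_candidate_data.py to do: restrict the entity vocabulary to the candidate entities of the dataset's mentions instead of the 500k most common entities. All names below are hypothetical, not the script's actual API:

```python
# Illustrative sketch only -- not the actual create_candidate_data.py code.
# The idea: collect every candidate entity title observed for the mentions
# in the dataset into a small entity vocabulary.

def build_candidate_entity_vocab(mention_candidates, special_tokens=("[PAD]", "[UNK]", "[MASK]")):
    """Collect candidate entity titles (one candidate list per mention) into a vocabulary."""
    titles = []
    seen = set()
    for candidates in mention_candidates:
        for title in candidates:
            if title not in seen:
                seen.add(title)
                titles.append(title)
    # Special tokens first, then candidate entity titles.
    vocab = {token: idx for idx, token in enumerate(special_tokens)}
    for title in titles:
        vocab[title] = len(vocab)
    return vocab

# Example: two mentions with overlapping candidate sets.
vocab = build_candidate_entity_vocab([
    ["Tokyo", "Tokyo_(album)"],
    ["Tokyo", "Kyoto"],
])
print(len(vocab))  # 6: 3 special tokens + 3 distinct candidate entities
```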
Hi @MrZilinXiao, Thanks for your reply!
In addition to the difference in the entity vocabulary, the main differences are as follows:
We set --masked-lm-prob to 0.0, which disables the word-level masked language modeling objective (see the sketch below).
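For context, --masked-lm-prob controls the fraction of word tokens that are masked for the word-level MLM objective, so 0.0 disables it and leaves only the entity-level objective. A minimal sketch of this behavior (illustrative, not the repository's actual masking code):

```python
import random

def mask_word_tokens(token_ids, mask_id, masked_lm_prob):
    """Replace each word token with [MASK] with probability masked_lm_prob.

    With masked_lm_prob=0.0 no word token is ever masked, so the
    word-level MLM loss contributes nothing and only the entity-level
    objective remains.
    """
    masked, labels = [], []
    for tid in token_ids:
        if random.random() < masked_lm_prob:
            masked.append(mask_id)
            labels.append(tid)   # predict the original token
        else:
            masked.append(tid)
            labels.append(-100)  # ignored by the loss
    return masked, labels

tokens, labels = mask_word_tokens([10, 11, 12], mask_id=0, masked_lm_prob=0.0)
assert tokens == [10, 11, 12] and labels == [-100, -100, -100]  # nothing masked
```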
Great, thanks for pointing that out. Wish you success in your future research career.
Hi @ikuyamada. I created a PR to fix a mistake in the instructions: https://github.com/studio-ousia/luke/pull/164.
Hi @ikuyamada. Sorry to bother you again; we are building on your work, and your experience could save us a lot of effort :)
Hi @MrZilinXiao,
Thank you for your continued interest in LUKE.
The GlobalED paper mentions decomposing the entity embedding into the product of two smaller matrices.
Unlike the LUKE model, we do not decompose entity embeddings in our entity disambiguation model; I do not think our entity disambiguation paper mentions decomposing entity embeddings.
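For reference, the decomposition discussed here factorizes the large entity embedding matrix into a small lookup table plus a dense projection up to the hidden size, which is the kind of factorization LUKE's pretraining uses. The module below is an illustrative sketch, not the repository's actual code:

```python
import torch
import torch.nn as nn

class FactorizedEntityEmbedding(nn.Module):
    """Illustrative decomposition of a large entity embedding matrix.

    Instead of a single |V| x hidden_size matrix, store a |V| x entity_emb_size
    lookup table (entity_emb_size << hidden_size) plus a dense projection up to
    hidden_size. The entity disambiguation model instead keeps full-size
    entity embeddings with no such decomposition.
    """

    def __init__(self, vocab_size, entity_emb_size, hidden_size):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, entity_emb_size)
        self.dense = nn.Linear(entity_emb_size, hidden_size, bias=False)

    def forward(self, entity_ids):
        return self.dense(self.embeddings(entity_ids))

# For a 500k-entity vocab with hidden_size=1024 and entity_emb_size=256, the
# factorized form needs 500k*256 + 256*1024 parameters instead of 500k*1024.
emb = FactorizedEntityEmbedding(vocab_size=1000, entity_emb_size=256, hidden_size=1024)
out = emb(torch.tensor([[1, 2, 3]]))
print(out.shape)  # torch.Size([1, 3, 1024])
```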
Hi @ikuyamada. Thanks for your great work. Would you mind providing the pretraining scripts or procedures used to train the checkpoints you provide here? The relevant issue is: https://github.com/studio-ousia/luke/issues/126