Closed: HarmanDotpy closed this issue 2 years ago
Hi Harman,
Thanks for your interest! Yes, currently no retrieval happens during QA; we haven't tried adding it yet either.
As in QA-GNN and similar methods, a retrieve-and-read approach could probably be applied here as well. My intuition is that it should give good results, especially in the full-KG setting (where you don't need to predict missing edges). The retriever could be a simple neighbourhood fact retriever or something learned. The input to the model could be `<fact 1> ... <fact k> <question>`, as in the sketch below.
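For concreteness, here is a minimal Python sketch of that input format, assuming the KG is available as (head, relation, tail) string triples. The function names (`retrieve_neighbourhood`, `build_qa_input`) and the `<fact>`/`<question>` markers are hypothetical choices for illustration, not part of the KGT5 codebase:

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail) as plain strings

def retrieve_neighbourhood(kg: List[Triple], entity: str, k: int = 5) -> List[Triple]:
    """Toy retriever: return up to k facts whose head or tail is the given entity."""
    return [t for t in kg if entity in (t[0], t[2])][:k]

def build_qa_input(question: str, facts: List[Triple]) -> str:
    """Serialize retrieved facts followed by the question, as suggested above."""
    fact_strs = " ".join(f"<fact> {h} | {r} | {t}" for h, r, t in facts)
    return f"{fact_strs} <question> {question}".strip()

kg = [("barack obama", "born in", "honolulu"),
      ("honolulu", "located in", "hawaii")]
facts = retrieve_neighbourhood(kg, "barack obama", k=2)
print(build_qa_input("where was barack obama born?", facts))
# -> <fact> barack obama | born in | honolulu <question> where was barack obama born?
```

A learned retriever would simply replace `retrieve_neighbourhood` with a scoring model; the serialization step would stay the same.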
In such a setting, one may also want to modify pretraining to align it with the retrieve-and-read approach. Instead of pretraining with just `subject | relation` as input, you might give `<fact 1> ... <fact k> subject | relation`. Something like this would be needed to keep input lengths similar between pretraining and fine-tuning; a sketch follows.
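A rough sketch of what such an aligned pretraining example could look like, with the same hypothetical triple representation as above. One detail worth noting: the triple being predicted should be excluded from the retrieved facts, so the answer does not leak into the input.

```python
import random
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail) as plain strings

def build_pretraining_example(kg: List[Triple], triple: Triple, k: int = 3):
    """Build one (input, target) pair: k neighbour facts + 'subject | relation' -> tail."""
    head, relation, tail = triple
    # Neighbours of the head entity, excluding the triple we want to predict.
    neighbours = [t for t in kg if head in (t[0], t[2]) and t != triple]
    facts = random.sample(neighbours, min(k, len(neighbours)))
    fact_str = " ".join(f"<fact> {h} | {r} | {t}" for h, r, t in facts)
    source = f"{fact_str} {head} | {relation}".strip()
    return source, tail
```

The exact verbalization (separators, special tokens) would of course follow whatever scheme the rest of pretraining uses.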
If you are planning on implementing something or would like to discuss more, please reach out to me over email and I would be happy to discuss in detail.
Hey, sorry for the late reply. Your answer was really helpful. I have been reviewing KG+LM QA methods, and what you suggested makes a lot of sense as the next step for improving accuracy. I would definitely like to implement it, if not immediately then within a few days.
I will get in touch over email to discuss more. (Closing the issue for now.)
Thanks!
Hi Apoorv,
It was great to read the KGT5 paper. If I understand correctly, during the fine-tuning phase on the QA dataset, we do not use any (retrieved) knowledge graph or subgraph, i.e., we just use the T5 model to answer the question. Other works such as QA-GNN and GreaseLM use a retrieved knowledge subgraph along with a language model, reasoning over the two modalities to answer the question. Do you think we can do something similar with KGT5? Or have you tried doing it in any of your experiments?
Thanks