tanyuqian / knowledge-harvest-from-lms

ACL 2023 (Findings) - BertNet: Harvesting Knowledge Graphs from Pretrained Language Models
https://lmnet.io

Using Generative Models #5

Open phiwi opened 10 months ago

phiwi commented 10 months ago

Also from my side big thanks for sharing your work!

In your publication you state that (with minor amendments) one could also use generative models to extract KG tuples. Is this already available somewhere in your implementation? And if not, would you mind providing a rough sketch of how one should do it?

xiyan128 commented 10 months ago

Hi, thanks for the question. To answer your first question -- note that the implementation in this repo is only for BERT-like (masked) language models.

To reiterate the problem BertNet poses for autoregressive LMs: the objective is to extract a set of valid KG-style tuples from an autoregressive language model, subject to certain constraints.

Problem Statement
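
One way to formalize it (this is my own reading, not notation from the paper): given an autoregressive LM $p_\theta$ and, for each relation $r$, a set of prompt templates $\mathcal{P}_r$ that verbalize the relation, harvest every tuple whose LM-assigned consistency clears a threshold $\delta$:

$$
\mathrm{score}(h, r, t) = \frac{1}{|\mathcal{P}_r|} \sum_{p \in \mathcal{P}_r} \log p_\theta\big(t \mid p(h)\big),
\qquad
\mathcal{K} = \big\{(h, r, t) : \mathrm{score}(h, r, t) \ge \delta\big\},
$$

where $p(h)$ is the template filled with the head entity. The constraints mentioned above (entity well-formedness, tuple structure) restrict which $(h, t)$ strings are admissible.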

Sketch

Since I've been away from this problem for a while, I'm happy to share the intuitive, high-level sketch I put together earlier.

1. Approximating the Search Space: Sampled Beam Search. The space of candidate tuples is far too large to enumerate, so diverse candidate completions are drawn with a sampled (stochastic) beam search rather than exhaustive decoding.

2. Structural Compliance: KG-Style Tuples with Constrained Search. The search is constrained (or its output filtered) so that every surviving candidate parses as a well-formed (head, relation, tail) tuple.

3. Probabilistic Integrity: the LM's Faith. Structurally valid candidates are ranked by the LM's own likelihood, keeping only the tuples the model itself assigns high confidence. A code sketch of all three steps follows this list.
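
To make this concrete, here is a minimal end-to-end sketch with HuggingFace transformers and GPT-2. Everything in it is a placeholder rather than what we actually ran: the prompt template, the (head, relation) pair, the regex filter standing in for real constrained decoding, and all hyperparameters.

```python
# Illustrative only: model choice, prompt template, filter, and
# hyperparameters are placeholders, not our actual implementation.
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").to(device).eval()

# (1) Sampled beam search: draw diverse candidate completions instead of
# enumerating the (intractable) space of all tuples.
def sample_candidates(prompt, num_beams=8, max_new_tokens=8):
    inputs = tok(prompt, return_tensors="pt").to(device)
    out = lm.generate(
        **inputs,
        do_sample=True,              # beam *sampling*, not plain beam search
        num_beams=num_beams,
        num_return_sequences=num_beams,
        max_new_tokens=max_new_tokens,
        pad_token_id=tok.eos_token_id,
    )
    new_tokens = out[:, inputs["input_ids"].shape[1]:]
    return [tok.decode(t, skip_special_tokens=True).strip() for t in new_tokens]

# (2) Structural compliance: a crude stand-in for constrained search that
# keeps only completions which look like a single entity mention.
def is_valid_tail(text):
    return bool(re.fullmatch(r"[A-Za-z][A-Za-z '-]{0,40}", text))

# (3) Probabilistic integrity: score a tuple by the LM's average
# log-likelihood of the tail tokens given the filled-in prompt.
@torch.no_grad()
def tail_log_likelihood(prompt, tail):
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full = tok(prompt + " " + tail, return_tensors="pt").to(device)
    labels = full.input_ids.clone()
    labels[:, :prompt_len] = -100    # mask the prompt; score only the tail
    return -lm(**full, labels=labels).loss.item()

head, relation = "Paris", "capital of"        # hypothetical query
prompt = f"{head} is the {relation}"          # hypothetical template
cands = {c.split(".")[0].split(",")[0].strip() for c in sample_candidates(prompt)}
scored = [(t, tail_log_likelihood(prompt, t)) for t in cands if is_valid_tail(t)]
for tail, ll in sorted(scored, key=lambda x: -x[1]):
    print((head, relation, tail), round(ll, 2))
```

In practice, the regex would be replaced by grammar- or trie-constrained decoding, and the score would be averaged over several paraphrased prompts per relation, as the MLM version of BertNet does.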


This is the sketch we experimented with earlier, and the results were reasonably good, although we lack a more rigorous evaluation because of the unique nature of the problem setting. Let me know if you are interested in the implementation details of this sketch!

phiwi commented 10 months ago

Hey Xiyan,

thank you very much for your extensive answer. I didn't expect such an in-depth theoretical explanation. From my understanding, and confirming my high-level assumptions, a combination of prompting and sampling with (constrained) beam search might lead to the desired results (triples similar to those from the MLM approach).

If you don't mind sharing your preliminary implementations, I would be very happy to draw inspiration from your ideas.