dominik-pichler / Balmung

Based on Bacon's "Scientia potentia est", this tool aims to investigate knowledge, information, and Chaos
MIT License

Provide LLM alternative for Relationship/Meaning Extraction #3

Closed: dominik-pichler closed this issue 7 months ago

dominik-pichler commented 8 months ago

Instead of splitting the nodes with an LLM, other possibilities (syntactic decomposition etc.) should be investigated.

dominik-pichler commented 8 months ago

Maybe libraries like SpaCy, NLTK, or Stanford NLP could help.

dominik-pichler commented 8 months ago

Steps:

1) Coreference Resolution:

Pronouns will be replaced by the entities they refer to.

2) Named Entity Recognition

Identify/recognize all mentioned entities in the text (see the sketch after this list).

Solve entity disambiguation & entity linking: "Elon's" vs. "Elon" refer to the same entity but will not be recognized as the same thing.

Co-occurrence graphs: evaluate the co-occurrence of terms in a subtask.

3) Relationship Extraction

This could happen in multiple ways:

  1. Rule-based, with grammatical dependencies/rules (Resource Description Framework)
  2. NLP model (just as we did with the LLM, right?); in addition, TACRED or Wiki80 could be helpful.

    found @ https://neo4j.com/blog/text-to-knowledge-graph-information-extraction-pipeline/
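
As a quick illustration of step 2, a minimal sketch with spaCy (assuming the `en_core_web_sm` model is installed; the possessive normalization below is only a naive stand-in for real entity linking):

```python
# Minimal NER sketch with spaCy. Assumes `pip install spacy` and
# `python -m spacy download en_core_web_sm` have been run beforehand.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Elon founded SpaceX in 2002. Elon's company later launched Starlink.")

# Step 2: list every entity mention with its predicted label.
for ent in doc.ents:
    print(ent.text, ent.label_)

# Very naive stand-in for disambiguation/linking: group mentions by a
# normalized surface form. A real solution needs proper entity linking,
# e.g. against a knowledge base.
mentions = {}
for ent in doc.ents:
    key = ent.text.lower().removesuffix("'s").strip()  # Python 3.9+
    mentions.setdefault(key, []).append(ent.text)
print(mentions)
```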

dominik-pichler commented 7 months ago

For 1): https://towardsdatascience.com/coreference-resolution-in-python-aca946541dec#:~:text=Coreference%20resolution%20is%20the%20NLP,to%20the%20same%20underlying%20entities.

dominik-pichler commented 7 months ago

Had a problem with the coreference library neuralcoref due to version conflicts. Given the time constraints, I'll take care of this last.

dominik-pichler commented 7 months ago

2) Where is the meaning of the sentences hidden? Or is it enough to ask where the ideas are hidden? Actually, I would first need a good working definition of "ideas". In this context, ideas are to be understood as concepts that a subject creates in order to make decisions for action; in other words, a premise on the basis of which the world is interpreted. This interpretation is then the foundation on which action decisions are made.

In this case, it is a matter of extracting acting relationships, in particular relationships of the form (subject-verb-subject)^n. Example: "X greets Z".

These are to be separated from sentiment relations ("X feels Y about Z"), which should come in the next step!

Problem: How do you depict actions spanning multiple sentences?

Resources here: https://towardsdatascience.com/named-entity-recognition-with-nltk-and-spacy-8c4a7d88e7da
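
A rough sketch of extracting such acting (subject-verb-object) triples from a spaCy dependency parse. The dependency labels and the example sentence are assumptions; this only covers simple active-voice clauses and ignores the cross-sentence problem above, which is exactly where coreference resolution (step 1) would have to feed in:

```python
# Rule-based SVO triple extraction over a spaCy dependency parse (sketch).
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_svo(text):
    triples = []
    for sent in nlp(text).sents:
        for token in sent:
            if token.pos_ != "VERB":
                continue
            # direct syntactic subjects and objects of this verb
            subjects = [w for w in token.children if w.dep_ in ("nsubj", "nsubjpass")]
            objects = [w for w in token.children if w.dep_ in ("dobj", "obj", "attr")]
            for s in subjects:
                for o in objects:
                    triples.append((s.text, token.lemma_, o.text))
    return triples

print(extract_svo("X greets Z. The bear eats the human."))
# typically something like [('X', 'greet', 'Z'), ('bear', 'eat', 'human')]
```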

dominik-pichler commented 7 months ago

Question: In the English language, is the grammar structured in order? Meaning, given the following word types: S1 - a1 - S2 - a2 - S3, could a1 also affect S3, or does it always only affect S2?

Meaning, what are our chunk patterns?
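
For experimenting with this, a possible starting point is an explicit chunk grammar with NLTK's RegexpParser; the NP/VP patterns below are assumptions to play with, not an answer to the ordering question:

```python
# Chunking with an explicit regex grammar over POS tags (sketch).
import nltk

# Depending on the NLTK version, the tokenizer/tagger resources may need a
# one-time download, e.g. nltk.download("punkt") and
# nltk.download("averaged_perceptron_tagger").
grammar = r"""
  NP: {<DT>?<JJ>*<NN.*>+}   # optional determiner, adjectives, one or more nouns
  VP: {<RB>?<VB.*>+}        # optional adverb, one or more verbs
"""
chunker = nltk.RegexpParser(grammar)

tokens = nltk.word_tokenize("The angry bear slowly eats the small human")
tree = chunker.parse(nltk.pos_tag(tokens))
print(tree)
```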

dominik-pichler commented 7 months ago

Also, the question is whether chunked subjects (including their attributes in the nodes) should be used. It would make things easier if I could just work with NP & VDBs, but this will likely increase the number of nodes ...

dominik-pichler commented 7 months ago

Quick train of thought: Is it not enough to extract the nouns and link them via sentences?

The question is whether it is smarter to work with NPs or pure NNs, or whether I can maybe get both? 🌊
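
A quick way to see the difference in node counts would be to compare noun chunks (NPs) with bare nouns on the same sentence; sketch only, again assuming `en_core_web_sm`:

```python
# Compare NP-level chunks vs. bare nouns to see how many nodes each
# choice would produce (sketch).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The angry bear in the dark forest eats the small human.")

noun_phrases = [chunk.text for chunk in doc.noun_chunks]            # NP level
nouns = [tok.text for tok in doc if tok.pos_ in ("NOUN", "PROPN")]  # NN level

print("NPs:  ", noun_phrases)
print("Nouns:", nouns)
```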

dominik-pichler commented 7 months ago

Or have multiple graphs:

1) NPs
2) NNs without attributes
3) NNs with attributes

dominik-pichler commented 7 months ago

> Question: In the English language, is the grammar structured in order? Meaning, given the following word types: S1 - a1 - S2 - a2 - S3, could a1 also affect S3, or does it always only affect S2?
>
> Meaning, what are our chunk patterns?

No, at least in German there are many examples that break this assumption, so I guess it is the same for English: "Slowly, eaten by the bear, the human starts to cry"

dominik-pichler commented 7 months ago

(image attachment) Further overview of LLM alternatives

dominik-pichler commented 7 months ago

For now, I'll stop once the rule-based methods work.

dominik-pichler commented 7 months ago

So there will then be two versions:

1) LLM-based meaning/knowledge extraction
2) Rule-based (RB) meaning/knowledge extraction -> nouns will be nodes, corresponding sentences will be edges (see the sketch below)
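
A minimal sketch of that RB variant, using networkx as an illustrative in-memory graph (the storage backend is an assumption here): nouns become nodes, and the sentence in which two nouns co-occur becomes the edge between them.

```python
# Rule-based graph sketch: noun lemmas as nodes, co-occurrence sentences as edges.
import itertools

import networkx as nx
import spacy

nlp = spacy.load("en_core_web_sm")

def build_noun_graph(text: str) -> nx.Graph:
    graph = nx.Graph()
    for sent in nlp(text).sents:
        nouns = {tok.lemma_ for tok in sent if tok.pos_ in ("NOUN", "PROPN")}
        graph.add_nodes_from(nouns)
        # connect every pair of nouns that co-occur in this sentence and
        # keep the sentence text as the edge attribute
        for a, b in itertools.combinations(sorted(nouns), 2):
            graph.add_edge(a, b, sentence=sent.text)
    return graph

g = build_noun_graph("The bear eats the human. The human cries loudly.")
print(g.edges(data=True))
```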

dominik-pichler commented 7 months ago

So the first version is working. It just needs some cleaning.