Component renamings - Githubissues

robknapen commented 5 months ago

Based on progression of ideas and development I would like to propose the renaming of two components:

“Interlinker” to become “Metadata Augmentation” (or “Metadata Enrichment”)
“Large Language Model” to become “Natural Language Querying (NLQ)”

The Metadata Augmentation component can then also include NLP/LLM-based functionality for improving metadata, e.g. extracting additional keywords and ontology terms, summarising text, etc. The NLQ component can then focus on interfacing between the chatbot UI and the knowledge graph.

Implementation-wise, the components might share vector stores with embeddings created from relevant documents or KGs, as well as access to an instance of a large language model. These could still differ per component, depending on purpose and requirements.
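
To make the shared vector store idea a bit more concrete, here is a minimal sketch, not an actual design. Everything in it is an assumption for illustration: the `VectorStore` class, the toy bag-of-words `embed` function (a stand-in for a real embedding model), the record ids, and the two helper functions that mimic how Metadata Augmentation and NLQ might each query the same store.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """Hypothetical in-memory store shared by both components."""

    def __init__(self):
        self._items = []

    def add(self, doc_id, text):
        self._items.append((doc_id, embed(text)))

    def search(self, query, k=3):
        """Return up to k document ids, most similar first."""
        q = embed(query)
        scored = [(cosine(q, vec), doc_id) for doc_id, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc_id for score, doc_id in scored[:k] if score > 0]


# One store, filled once (example records are made up).
store = VectorStore()
store.add("ds-1", "soil erosion map of europe")
store.add("ds-2", "national soil organic carbon dataset")
store.add("ds-3", "river discharge measurements")


def suggest_related(store, record_text):
    """Metadata Augmentation use: find records related to a given record."""
    return store.search(record_text, k=2)


def answer_context(store, question):
    """NLQ use: retrieve the best-matching record as context for a question."""
    return store.search(question, k=1)
```

Both helpers read from the same store but use different retrieval settings, which is one way the components could share infrastructure while still differing per purpose.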

DajanaSnopkova commented 5 months ago

It's sort of a last-minute change, but I can manage to do that in the architecture diagram. Can you take care of the renaming in the Technical Documentation and provide the description of the functionalities that you mentioned?

robknapen commented 5 months ago

Sure, if the change makes sense and is acceptable I can update it in the Technical Documentation.

DajanaSnopkova commented 5 months ago

If, as you say, it is based on the development team discussions, it makes sense to me :) But please make sure to finish the changes by tomorrow at 12:00.

robknapen commented 5 months ago

I don't think I can make that deadline, so it is probably best to do it in the next iteration then.

DajanaSnopkova commented 5 months ago

Ok :) I am leaving it for the next iteration. We can also open this issue on Thursday during the Sprint Refinement meeting.

robknapen commented 5 months ago

I'll assign it to @roblokers so he can bring it up for the backlog refinement.

DajanaSnopkova commented 2 months ago

Please, now it is time to work on this (deadline mid-September). We will deliver an updated version of the Technical Documentation together with the first prototype.

Currently, we have the Interlinker and Metadata Augmentation tech components with the following expected functionality (note that some of the functionality is expected in later iterations). I think we could keep them separate...

Metadata augmentation

Keywords matcher
Translation module
Spatial Locator
Spatial scope analyser – regional / national datasets
EUSO High-Value datasets tagging

Interlinker

Automatic metadata interlinking
Metadata cleaning
Duplicates identification
Link liveliness assessment
Similarity finder
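
As a rough illustration of what "Duplicates identification" and "Similarity finder" could look like, here is a minimal sketch that compares record titles by Jaccard similarity of their word sets. This is an assumption for illustration only, not the planned implementation; the function names, the threshold, and the sample records are all hypothetical.

```python
def tokens(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of the word sets of two strings (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def find_similar(records, threshold=0.5):
    """Return (id, id, score) for every pair scoring at or above threshold.

    `records` maps a record id to its title. A score of 1.0 means the
    titles contain exactly the same words, i.e. a likely duplicate.
    """
    ids = sorted(records)
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            score = jaccard(records[left], records[right])
            if score >= threshold:
                pairs.append((left, right, score))
    return pairs


# Made-up example records.
records = {
    "a": "Soil organic carbon map",
    "b": "soil organic carbon map",
    "c": "Soil organic carbon map 2020",
    "d": "River discharge data",
}

similar = find_similar(records)
duplicates = [(l, r) for l, r, score in similar if score == 1.0]
```

A real component would likely compare more fields than the title (identifiers, abstracts, spatial extent), but the pairwise-score-plus-threshold shape would stay the same.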

roblokers commented 2 months ago

I agree with renaming LLM to Natural Language Querying (NLQ).

roblokers commented 2 months ago

For Interlinker and Metadata Augmentation, I think I would keep them both and maybe reshuffle functions. For me, metadata cleaning would logically fall under augmentation, as it doesn't create links between identical, similar or related items.

I also agree with Rob that NLP/LLM support (the AI/ML part of metadata augmentation) will be part of it, although for now I would not add dedicated functions for it. I would rather expect AI/ML to complement, enforce or replace specific functions along the development path.

roblokers commented 1 month ago

For now, after discussion in the group, we have ended up with the following (partly renamed) components and functions:

Knowledge graph

Augmented metadata to RDF transformation
Knowledge Graph enrichment and linking
Knowledge Graph querying (SPARQL endpoint)

Natural Language Querying

AI / LLM based KG generation from unstructured content
Chatbot - Natural Language Interface
LLM operationalisation

Metadata Augmentation

Translation module
Keywords matcher
Spatial Locator
Metadata cleaning
Spatial scope analyser
EUSO-high-value dataset tagging
Similarity finder
Automatic metadata interlinking
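
For anyone who needs this list in machine-readable form (e.g. to check the tech doc or diagram labels against it), the breakdown above can be written as a plain mapping. This is only a sketch; the `COMPONENTS` name and the `component_of` helper are made up here for illustration.

```python
# Components and functions exactly as listed above.
COMPONENTS = {
    "Knowledge graph": [
        "Augmented metadata to RDF transformation",
        "Knowledge Graph enrichment and linking",
        "Knowledge Graph querying (SPARQL endpoint)",
    ],
    "Natural Language Querying": [
        "AI / LLM based KG generation from unstructured content",
        "Chatbot - Natural Language Interface",
        "LLM operationalisation",
    ],
    "Metadata Augmentation": [
        "Translation module",
        "Keywords matcher",
        "Spatial Locator",
        "Metadata cleaning",
        "Spatial scope analyser",
        "EUSO-high-value dataset tagging",
        "Similarity finder",
        "Automatic metadata interlinking",
    ],
}


def component_of(function_name):
    """Return the component that owns a function, or None if it is unlisted."""
    for component, functions in COMPONENTS.items():
        if function_name in functions:
            return component
    return None
```

For example, `component_of("Metadata cleaning")` returns `"Metadata Augmentation"`, while functions from the earlier Interlinker list that no longer appear here, such as "Link liveliness assessment", return `None`.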

roblokers commented 1 month ago

Needs to be aligned with the updated tech doc and model diagram by the respective authors (@roblokers @pvgenuchten @DajanaSnopkova)

soilwise-he / SoilWise-documentation

Component renamings #25