weaviate / semantic-search-through-wikipedia-with-weaviate

Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine
MIT License

Disambiguation improvements #3

Closed loretoparisi closed 2 years ago

loretoparisi commented 2 years ago

Hi guys, thanks for this outstanding work! I was trying your GraphQL demo endpoint and ran into a disambiguation issue. For the query Marc Marquez (the motorcycle racer) I get the wrong document, while adding more context (like motorcycle racer) returns the correct result. Assuming I cannot add more textual context for this item in the query, but I do have "related Wikipedia items" (let's say Valentino Rossi), is it possible to add a related item node embedding to this query?

{
  Get {
    Paragraph(
      ask: {
        question: "Marc Marquez"
        properties: ["content"]
      }
      limit: 1
    ) {
      content
      order
      title
      inArticle {
        ... on Article {
          title
        }
      }
      _additional {
        answer {
          result
        }
      }
    }
  }
}

result:

{
  "data": {
    "Get": {
      "Paragraph": [
        {
          "_additional": {
            "answer": {
              "result": null
            }
          },
          "content": "On November 10, 2017, Marquez was signed to the Detroit Lions' practice squad. He was promoted to the active roster on November 27, 2017. On September 11, 2018, Marquez was waived by the Lions. ",
          "inArticle": [
            {
              "title": "Bradley Marquez"
            }
          ],
          "order": 5,
          "title": "Detroit Lions"
        }
      ]
    }
  }
}

bobvanluijt commented 2 years ago

Hi @loretoparisi – sorry for the late response. I must have missed this question.

is it possible to add a related item node embedding for this query?

I'm not 100% sure I understand, can you please elaborate a bit more?

bobvanluijt commented 2 years ago

Btw – you might also like to join our Slack channel.

loretoparisi commented 2 years ago

Hi @loretoparisi – sorry for the late response. I must have missed this question.

is it possible to add a related item node embedding for this query?

I'm not 100% sure I understand, can you please elaborate a bit more?

Thanks for your help! Since the bare query Marc Marquez returns the wrong result, while adding more context (like motor racer) works fine, I was wondering whether this context could be supplied not as text but as an embedding representation (say, the Wikidata property for pilot or motor racer), or as a related Wikidata entity, such as MotoGP or Honda Racing Corporation in this specific case, in order to disambiguate the single entity given.
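
To make the idea concrete, here is a minimal sketch of what "adding context as an embedding" could look like: the query vector is nudged toward the vector of a related entity before the nearest-neighbour lookup. Everything in it is an assumption for illustration. The 3-dimensional vectors are invented (not real SentenceBERT output), and `move_toward`, `best_match`, and the `force` parameter are hypothetical names, not part of the demo or of any Weaviate API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def move_toward(query, context, force):
    """Linearly interpolate the query vector toward a context vector.

    force=0 keeps the query unchanged, force=1 replaces it with the context.
    """
    return [(1 - force) * q + force * c for q, c in zip(query, context)]


def best_match(query, docs):
    """Return the name of the document whose vector is closest to the query."""
    return max(docs, key=lambda name: cosine(query, docs[name]))


# Toy vectors, made up for this example only.
docs = {
    "Bradley Marquez (American football)": [0.9, 0.1, 0.3],
    "Marc Marquez (motorcycle racer)": [0.8, 0.6, 0.1],
}
bare_query = [1.0, 0.2, 0.3]      # stands in for "Marc Marquez" alone
related_entity = [0.1, 1.0, 0.0]  # stands in for e.g. "MotoGP"

blended = move_toward(bare_query, related_entity, force=0.5)

print(best_match(bare_query, docs))  # Bradley Marquez (American football)
print(best_match(blended, docs))     # Marc Marquez (motorcycle racer)
```

With these toy numbers the bare query lands on the football player, and the blended query lands on the racer. Whether the same effect holds with real embeddings, and how strongly to weight the related entity, would need to be checked against the actual index.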

bobvanluijt commented 2 years ago

That's an interesting idea @loretoparisi, thanks – I've moved the discussion here.