dbpedia / databus

A digital factory platform for managing files online with stable IDs, high-quality metadata, powerful API and tools for building on data: find, access, make interoperable, re-use
Apache License 2.0

Feature: Search / MOSS #59

Open kurzum opened 1 year ago

kurzum commented 1 year ago

Ideally, we provide a data search layer over the Databuses and Mod data. This was already aptly labelled MOSS. https://github.com/dbpedia/databus/issues/58 and https://github.com/dbpedia/databus/issues/57 tackle how the data is acquired and discovered, and where it is stored.

Q1: What's the plan for the prototype now?

Q2: What are existing generic approaches and standards to combine searches (server or client-side)?

yum-yab commented 1 year ago

Q1: What's the plan for the prototype now?

Currently the plan for the prototype is (as far as I know, @holycrab13 may correct me):

  1. We (or someone else) set up a specifically configured lookup that indexes the metadata of one specific mod. These lookups should be publicly available
    • Each lookup regularly rebuilds its index (e.g. via a cronjob) to reflect the current state of its mod
    • This encourages following standards, e.g. publishing metadata in the frictionless format, because there is already a mod that picks it up and a lookup that automatically indexes it
    • Each lookup returns Databus identifiers based on the metadata of the mod
  2. If someone wants to include certain mod metadata in their Databus search, multiple lookup endpoints can be added, deleted, and managed in the settings
    • Everyone can write metadata about everything, but this way the search/view of each individual user does not get cluttered
    • There can be a list of top-curated Mods, showcased in the Databus itself, that everyone can add
  3. Once someone searches, the client queries the Databus itself as well as every lookup endpoint configured in their settings, merges the results on a best-effort basis, and presents them to the user
    • No additional work for the Databus server: all the hard work is performed by each individual client
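
To make steps 2 and 3 more concrete, here is a minimal sketch of the client-side part, written in Python for brevity even though the real client would run in the browser. Everything in it is an assumption for illustration: the names `Hit`, `SearchSettings`, and `federated_search` are hypothetical, each endpoint is modelled as a plain function from query string to hits, and "best effort" is interpreted as "skip failing endpoints, normalize each endpoint's scores by its top score, and sum the scores of identifiers that several endpoints return". The actual merge strategy is still open.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Hit:
    databus_id: str  # Databus identifier returned by an endpoint
    score: float     # endpoint-local relevance, assumed non-negative


# Hypothetical abstraction: an endpoint is anything that maps a query to hits.
# In a real client this would wrap an HTTP call to the Databus or a lookup.
Fetcher = Callable[[str], List[Hit]]


@dataclass
class SearchSettings:
    """Per-user list of search endpoints (step 2)."""
    endpoints: Dict[str, Fetcher] = field(default_factory=dict)

    def add(self, name: str, fetcher: Fetcher) -> None:
        self.endpoints[name] = fetcher

    def remove(self, name: str) -> None:
        self.endpoints.pop(name, None)


def federated_search(
    query: str, settings: SearchSettings, limit: int = 10
) -> List[Tuple[str, float, List[str]]]:
    """Query every configured endpoint and merge the results (step 3).

    Returns (databus_id, merged_score, endpoint_names) tuples, best first.
    """
    merged: Dict[str, dict] = {}
    for name, fetch in settings.endpoints.items():
        try:
            hits = fetch(query)
        except Exception:
            # Best effort: one unreachable endpoint must not break the search.
            continue
        if not hits:
            continue
        # Scores from different indexes are not comparable, so scale each
        # endpoint's scores relative to its own best hit.
        top = max(h.score for h in hits) or 1.0
        for h in hits:
            entry = merged.setdefault(h.databus_id, {"score": 0.0, "sources": []})
            entry["score"] += h.score / top
            entry["sources"].append(name)
    # Sort by merged score (descending), then by identifier for stable output.
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1]["score"], kv[0]))
    return [(i, e["score"], e["sources"]) for i, e in ranked[:limit]]
```

A user would register the Databus itself plus any mod lookups via `settings.add(...)` and call `federated_search(query, settings)`. Identifiers confirmed by several endpoints rank higher under this scheme, which may or may not be the behaviour we want; a round-robin interleave would be an equally valid best-effort merge.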