kartevonmorgen / FairSync

A general synchronisation tool for kvm and other maps

Define FairSyncDB #23

Open wellemut opened 1 month ago

wellemut commented 1 month ago

Use case: partner platform

As a partner platform [like Gemeinschaftswerk #5 or wechange #88] I need a simple and "clean" way (import file / URL) to easily get all missing and improved entries (back) into my platform without having to moderate and run the duplicate check again (as I have neither enough funding nor time), and without losing quality or relevance for my users, as I don't want to compete with kvm.

Simpler and more concrete: as the moderator of my own low-budget thematic map, which shares its data (via CSV) with kvm, I need a simple way to get updated entries, and entries missing on my side, back from kvm without manually checking hundreds of entries for duplicates and updates every time.

Problem

If FairSync only consumes data from all other platforms and merges all duplicates into a final set of entries,

Solution

No matter in which frontend imports are moderated and duplicates are identified and merged, a database is needed to

  1. Save the likelihood/similarity between all entries (https://github.com/kartevonmorgen/FairSync/issues/20, https://github.com/kartevonmorgen/FairSync/issues/22)
  2. And to save the manual decisions moderators have made on whether entries are the same or unique

This FairSyncDB does not need to save a copy of every entry in the network, but only

  1. the address where an entry is found
  2. the unique ID, version and last-edited date OR a unique hash
  3. Maybe basic information like title, place... that is needed for duplicate checking
  4. And the result: possible duplicates and their similarity to this entry
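
A minimal sketch of what one such record could look like, written in Python purely for illustration. All names here (`EntryRecord`, `DuplicateCandidate`, `content_hash`, the field names) are assumptions and not part of any agreed FairSyncDB schema. The choice of SHA-256 over normalised title and place is likewise only one possible way to get a "unique hash".

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DuplicateCandidate:
    """Hypothetical link from one entry to a possible duplicate."""
    other_address: str             # where the possibly duplicate entry is found
    similarity: float              # 0.0 (unrelated) .. 1.0 (identical)
    # Manual moderator decision: True = same, False = unique, None = undecided
    is_same: Optional[bool] = None


@dataclass
class EntryRecord:
    """Hypothetical FairSyncDB record: a pointer to an entry, not a full copy."""
    address: str                          # 1. where the entry is found
    entry_id: Optional[str] = None        # 2. unique ID ...
    version: Optional[int] = None         #    ... version ...
    last_edited: Optional[str] = None     #    ... last-edited date (ISO 8601 string)
    content_hash: Optional[str] = None    #    OR a unique hash
    title: str = ""                       # 3. basic information for duplicate checking
    place: str = ""
    duplicates: List[DuplicateCandidate] = field(default_factory=list)  # 4. the result


def content_hash(title: str, place: str) -> str:
    """Assumed hashing scheme: SHA-256 over lower-cased, trimmed title and place."""
    normalized = f"{title.strip().lower()}|{place.strip().lower()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Keeping `is_same` next to `similarity` means a moderator's decision is stored once and does not have to be repeated on the next import.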

This FairSyncDB needs to serve other databases through APIs:

  1. Send your entry, and we tell you how similar it is to other entries in the FairSync system
  2. Send your entry ID, and we send you the updates we have for that entry.
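
The two calls could be sketched roughly as follows, as an in-memory Python toy rather than a real web API. The class `FairSyncDB`, its method names, the `threshold` parameter and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions for illustration. The actual similarity calculation is the subject of issues #20 and #22.

```python
from difflib import SequenceMatcher


class FairSyncDB:
    """Hypothetical in-memory stand-in for the two API calls."""

    def __init__(self):
        self._entries = {}  # entry_id -> {"title", "place", "version"}

    def upsert(self, entry_id, title, place, version):
        """Store or overwrite the basic information known for an entry."""
        self._entries[entry_id] = {"title": title, "place": place, "version": version}

    def similarities(self, title, place, threshold=0.6):
        """API 1: send your entry, get back (entry_id, score) pairs, best first."""
        query = f"{title} {place}".lower()
        results = []
        for entry_id, e in self._entries.items():
            candidate = f"{e['title']} {e['place']}".lower()
            # Placeholder similarity: character-based ratio from the standard library
            score = SequenceMatcher(None, query, candidate).ratio()
            if score >= threshold:
                results.append((entry_id, round(score, 2)))
        return sorted(results, key=lambda r: r[1], reverse=True)

    def updates_for(self, entry_id, known_version):
        """API 2: send your entry ID, get the newer data or None if nothing changed."""
        e = self._entries.get(entry_id)
        if e is None or e["version"] <= known_version:
            return None
        return dict(e)
```

A partner platform would call `similarities` before importing a new entry and `updates_for` with the version it last saw, so that neither step requires a fresh manual duplicate check.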

The FairSyncDB does not serve full search results of unique entries. Anyone who wants a single duplicate-free set of entries for a topic or region can use the ofdb API.

wellemut commented 1 month ago

This DB question has been discussed since the beginning and was mainly defined by @flosse in sprint 10.

Revisions to it will be done in sprint 11.