welfare-state-analytics / riksdagen-corpus

Swedish parliamentary proceedings - Riksdagens protokoll 1867-today

Adding sources/quality levels/provenance to the corpus #421

Open MansMeg opened 10 months ago

MansMeg commented 10 months ago

@ljo and Nina mentioned the need for other researchers to be able to assess the quality levels of different parts of the data/corpus.

I have seen PROV (https://www.w3.org/TR/2013/NOTE-prov-primer-20130430/) as one way to do this. We also have TEI, which was built partly for this purpose. At the same time, I don't think all researchers necessarily want this, so we should probably keep it as separate metadata. But it is crucial for interactions with Wikidata and for long-term credibility.
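To make the "separate metadata" idea concrete, here is a minimal sketch of a sidecar provenance record for one protocol. It borrows the core PROV terms (entity, activity, agent, wasDerivedFrom, wasGeneratedBy, wasAssociatedWith) and is only loosely modelled on the PROV-JSON layout. The function name, the `corpus:` prefix, the example.org namespace, the `corpus:qualityLevel` attribute and all identifiers are assumptions for illustration, not anything the corpus currently defines.

```python
import json


def make_provenance_record(protocol_id, source_id, activity, agent, quality_level):
    """Build a PROV-style sidecar record for one corpus file.

    All identifiers and the "corpus:qualityLevel" attribute are hypothetical.
    """
    entity = f"corpus:{protocol_id}"
    source = f"corpus:{source_id}"
    act = f"corpus:{activity}"
    ag = f"corpus:{agent}"
    return {
        # Placeholder namespace, not a real one.
        "prefix": {"corpus": "https://example.org/corpus/"},
        "entity": {
            entity: {"corpus:qualityLevel": quality_level},
            source: {},
        },
        "activity": {act: {}},
        "agent": {ag: {}},
        # The corpus file was derived from the scanned source ...
        "wasDerivedFrom": {
            "_:d1": {"prov:generatedEntity": entity, "prov:usedEntity": source}
        },
        # ... produced by a processing step ...
        "wasGeneratedBy": {
            "_:g1": {"prov:entity": entity, "prov:activity": act}
        },
        # ... which was carried out by some agent (a person or a script).
        "wasAssociatedWith": {
            "_:a1": {"prov:activity": act, "prov:agent": ag}
        },
    }


record = make_provenance_record(
    protocol_id="protocol-0001",      # hypothetical file id
    source_id="scan-0001",            # hypothetical source id
    activity="ocr-correction",
    agent="curation-script",
    quality_level="manually-checked",
)
sidecar = json.dumps(record, indent=2)  # would be written next to the corpus file
```

Keeping the record in its own file means researchers who do not need provenance can ignore it, while anyone who does can look it up by protocol id.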

@ljo do you have any takes on this? What is your use case?

salgo60 commented 10 months ago

My understanding is that Wikidata is based on an EU project, RENDER, and doesn't follow PROV. It is designed to handle different values for facts from different sources.
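As a rough sketch of what "different values for facts from different sources" can look like in practice: in Wikidata, each statement carries its own references and a rank (preferred, normal or deprecated), so conflicting values can sit side by side. The classes, the helper and the example data below are hypothetical illustrations, not the Wikibase API.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    stated_in: str          # the source that makes the claim
    retrieved: str = ""     # optional retrieval date


@dataclass
class Statement:
    prop: str
    value: str
    rank: str = "normal"    # "preferred", "normal" or "deprecated"
    references: list = field(default_factory=list)


def best_statements(statements, prop):
    """Return preferred statements if any exist, otherwise the normal ones.

    Deprecated statements are kept in the data but never returned here.
    """
    candidates = [s for s in statements if s.prop == prop]
    preferred = [s for s in candidates if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s for s in candidates if s.rank == "normal"]


# Made-up example: two sources disagree about one person's date of birth.
statements = [
    Statement("date of birth", "1850-03-01", rank="preferred",
              references=[Reference(stated_in="Source A (hypothetical)")]),
    Statement("date of birth", "1850-03-11", rank="normal",
              references=[Reference(stated_in="Source B (hypothetical)")]),
]

best = best_statements(statements, "date of birth")
```

Both values stay in the data with their sources attached; the rank only decides which one a consumer sees by default.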

You can set up a Wikidata-style database yourself using Wikibase, or use the cloud version, Wikibase.cloud.

salgo60 commented 10 months ago

Quality levels / Trust

In 2019 I wrote "We need a better model communicating quality/relevance of sources in Wikidata / Provenance" and asked Denny, the designer of Wikidata, what he thinks.

Document sources

I guess step one is to find a way to document sources.
