So far, the data model for the sources has developed in three stages:
Minimal ad hoc model in the Django ORM during the "skinny" stage.
Ad hoc one-to-one translation of the existing ORM fields to RDF properties during the initial move to triple stores (#145).
Mirroring of most of these fields to Elasticsearch (under abbreviated names) for full-text search (#333 and #357).
The ad-hocness of the model is illustrated by two open issues (at the time of writing), #266 and #373.
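To make the three stages concrete, here is a minimal sketch of how one record might pass through them. All field names, the namespace and the abbreviations are hypothetical and chosen for illustration only; they are not taken from the actual codebase.

```python
# Hypothetical namespace; the real project namespace is not shown in this issue.
NAMESPACE = "https://example.org/source-ontology#"

# Stage 1: fields as they might appear in an ad hoc ORM model (hypothetical).
ORM_FIELDS = ("title", "author", "publication_date")

# Stage 3: abbreviated names for the search index (hypothetical abbreviations).
ES_NAMES = {"title": "ttl", "author": "auth", "publication_date": "pubdate"}


def to_rdf_property(field_name):
    """Stage 2: translate an ORM field name one-to-one into a property IRI."""
    return NAMESPACE + field_name


def to_triples(subject, record):
    """Turn a flat record into (subject, predicate, object) triples."""
    return [
        (subject, to_rdf_property(name), record[name])
        for name in ORM_FIELDS
        if name in record
    ]


def to_search_document(record):
    """Mirror the same fields into a flat document under abbreviated names."""
    return {ES_NAMES[name]: record[name] for name in ORM_FIELDS if name in record}


record = {"title": "A diary", "author": "Anonymous"}
triples = to_triples("https://example.org/source/1", record)
document = to_search_document(record)
```

In a setup like this, every new kind of metadata needs a new entry in three places, and the RDF properties carry no more meaning than the original column names did.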
The virtual London meeting made clear that this model cannot stay in the long run, even after the above issues are closed. There are several (ultimate) needs that the existing data model cannot address and that call for a more carefully designed "source ontology":
Metadata about the context in which the source was obtained.
Import of community-contributed experiences through the digital postcards and the chatbot.
Import of tweets and social media postings in general.
Import of scraped material from e.g. review sites.
Image-based sources.
To be clear, this is only about which types of sources the data model can describe; it is not about adding supporting features to the interface. Sources and metadata of the above types can be added through the existing API endpoints. Searching, displaying and annotating images is entirely out of scope for the current project. From the point of view of the interface, adapting to the definitive data model will only involve field name changes and (perhaps) making the upload form a bit more flexible.
Arriving at a final, future-proof data model is not urgent, but it is important enough that it should be implemented before the end of the project. Most of the design work can probably be delegated to consortium partners. We should reach out to all stakeholders soon in order to arrive at a plan:
François (general design expertise, CIDOC-CRM, #266)
Guillaume (scraped materials)
Alessio (chatbot, ontology for context of source curation)
Alex Stan (postcards; currently working on submitting them to UK-RED)
Gustavo (tweets and social media postings in general)
Request for discussion @BeritJanssen @JeltevanBoheemen