As a user, I want to be able to scrape a source which gives me both structured and unstructured data. For example, while scraping a procurement portal, I might want to download contract metadata, but also a contract document as a PDF file. While both things are possible in memorious, there is currently no way to make things show up in aleph such that the structured data record (e.g., a mapped Contract) refers to the ingested Document by its ID.
To solve this, we need some mechanism for importing both the structured and unstructured content into the same collection in such a way that structured entities can refer to the documents by their ID.
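One possible shape for such a mechanism, purely as a sketch: if the document ID can be derived deterministically from something the scraper already knows (here, the collection name and the PDF's source URL), the structured record can carry that ID before the document has been ingested. Everything below is hypothetical and is not the existing memorious or aleph API: the function names, the `document` property name, and the record layout are assumptions made for illustration.

```python
import hashlib


def make_document_id(collection: str, source_url: str) -> str:
    """Derive a stable ID from the collection name and a source key (assumed scheme)."""
    key = f"{collection}:{source_url}".encode("utf-8")
    return hashlib.sha1(key).hexdigest()


def build_records(collection: str, metadata: dict, pdf_url: str) -> tuple:
    """Return a (document, contract) pair destined for the same collection."""
    document_id = make_document_id(collection, pdf_url)
    # Unstructured side: the PDF to be ingested, tagged with its precomputed ID.
    document = {
        "id": document_id,
        "collection": collection,
        "source_url": pdf_url,
    }
    # Structured side: the mapped Contract, pointing at the document by ID.
    contract = {
        "id": make_document_id(collection, metadata["contract_number"]),
        "schema": "Contract",
        "collection": collection,
        "properties": {
            "title": [metadata["title"]],
            # Hypothetical property name for the reference to the Document.
            "document": [document_id],
        },
    }
    return document, contract


if __name__ == "__main__":
    doc, contract = build_records(
        "procurement_demo",
        {"contract_number": "2019/0042", "title": "Road maintenance"},
        "https://example.org/contracts/2019-0042.pdf",
    )
    print(contract["properties"]["document"] == [doc["id"]])
```

The point of the sketch is only that both records land in one collection and the reference is resolvable by ID. Whether the ID should come from the URL, a content hash, or be assigned by aleph at ingest time is an open design question.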