CDRH / earlywashingtondc

OSCYS Rails site
http://earlywashingtondc.org

enslaved.org integration #239

Open jduss4 opened 3 years ago

jduss4 commented 3 years ago

We have been asked to contribute data to enslaved.org in the form of a CSV.

Without further information, from their search page I am assuming they are interested in:

We could likely line up the OSCYS data as:

Unfortunately, I do not believe all of the information we would want to send them is in Solr, so we would likely need to write a script that pulls some results from Solr and combines them with the personography files and the TTL file.

Documents should be good to go from either Solr or the TEI files, as they list person ids and case ids. Cases are something we can get entirely from Solr, as they are aggregated from documents. People are tricky because there is likely more information in the personography than in Solr (need to confirm), and there is also a lot of relationship information in our TTL file.
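As a rough illustration of the combining step for people, here is a minimal Python sketch. Everything in it is an assumption for discussion: the field names (`name`, `case_ids`), the CSV columns, and the idea that each source has already been loaded into a dict keyed by person id are all hypothetical, and it does not query Solr or parse the personography or TTL files.

```python
import csv
import io

# Hypothetical CSV columns; the real ones depend on what enslaved.org asks for.
FIELDNAMES = ["person_id", "name", "case_ids", "relationships"]


def merge_people(solr_docs, personography, relationships):
    """Combine per-person records from three sources, keyed by person id.

    solr_docs:      {person_id: {"name": str, "case_ids": [str, ...]}}
    personography:  {person_id: {"name": str}}
    relationships:  {person_id: [(relation, other_person_id), ...]}
    All three shapes are assumptions made for this sketch.
    """
    rows = []
    # Include people who appear in either Solr or the personography.
    for person_id in sorted(set(solr_docs) | set(personography)):
        solr = solr_docs.get(person_id, {})
        extra = personography.get(person_id, {})
        rows.append({
            "person_id": person_id,
            # Prefer the personography value and fall back to Solr.
            "name": extra.get("name") or solr.get("name", ""),
            "case_ids": ";".join(solr.get("case_ids", [])),
            "relationships": ";".join(
                f"{relation}:{other}"
                for relation, other in relationships.get(person_id, [])
            ),
        })
    return rows


def to_csv(rows):
    """Serialize the merged rows to a CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keying everything by person id keeps the merge independent of how each source is read, so the Solr query and the personography and TTL parsing can be written (in Ruby, XSLT, or Python) and swapped in separately.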

What we need to find out:

karindalziel commented 2 years ago

I have more details about this. I think this is the way forward to start with:

Write scripts in whatever form is useful (Ruby, XSLT, Python) and save them in the scripts folder (https://github.com/CDRH/data_oscys/tree/main/scripts) in an "enslaved.org_scripts" (or similarly named) folder. Save the output files in https://github.com/CDRH/data_oscys/tree/main/output/data_export (we will need to update the README for that folder, and some of the filenames).

nichgray commented 2 years ago

Based on the request for source information, which is only included in document files, we are likely to need an additional document dataset.