Signsofliteracy / Signoff

Tools for the study of historical literacy
http://signsofliteracy.org/

WORKSHOP TOPIC: User tag driven IIIF manifests #8

Open Addaci opened 6 years ago

Addaci commented 6 years ago

MarineLives & Signs of Literacy (our new community to study historical literacy) are doing some serious thinking about creating IIIF manifests to display markes, initials and signatures contained in manuscript pages. The functionality we are interested in is the ability to display the text areas (or image areas) within a manuscript image and/or its matching full text transcription that contain relevant markes, initials and signatures, as well as to display the whole image page or whole matching full text transcription page. We want to be able to create IIIF manifests that pull up relevant markes, initials and signatures from multiple institutions with IIIF servers and content, for example the British Library and the Stadsarchief Amsterdam.

We are thinking how to semantically annotate or tag the images or text pages, and the specific image or text regions within the whole pages, so that manifests can be created.

The sort of tags or annotations we are thinking of are simple, e.g.:

Occupation [wine cooper; mariner; shipwright]
Type of signoff [marke; initial(s); signature]
Place of residence [e.g. Wapping; Cadiz]
Date of signoff [e.g. March 13th, 1629]

A specific signoff (marke, initial or signature) could have multiple tags, e.g. text reading "Jo Bloggs, mariner, living in Wapping, aged 23" in a deposition dated March 13th 1629 could be tagged:

mariner; Wapping; age 23; 1629
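As a rough illustration of the tagging described above, a signoff could be held as a small record of class/value tags. This is only a sketch: the `Signoff` class, the `TAG_CLASSES` names and the `age` class are assumptions made for the example, not an agreed schema.

```python
from dataclasses import dataclass, field

# Hypothetical tag classes, loosely based on the examples in this comment.
TAG_CLASSES = ("occupation", "type_of_signoff", "place_of_residence",
               "age", "date_of_signoff")


@dataclass
class Signoff:
    """One marke, initial(s) or signature, with its class -> value tags."""
    text: str
    tags: dict = field(default_factory=dict)

    def add_tag(self, tag_class: str, value: str) -> None:
        # Reject classes outside the (assumed) pre-specified list.
        if tag_class not in TAG_CLASSES:
            raise ValueError(f"unknown tag class: {tag_class}")
        self.tags[tag_class] = value

    def flat_tags(self) -> str:
        # Tags in the order they were added, joined as in the example above.
        return "; ".join(self.tags.values())


signoff = Signoff(text="Jo Bloggs, mariner, living in Wapping, aged 23")
signoff.add_tag("occupation", "mariner")
signoff.add_tag("place_of_residence", "Wapping")
signoff.add_tag("age", "age 23")
signoff.add_tag("date_of_signoff", "1629")
print(signoff.flat_tags())  # mariner; Wapping; age 23; 1629
```

Keeping the class alongside each value means the same record can later be queried by class (all occupations) or by value (all mariners).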

Our application envisages dealing with tens of thousands of legal records from the English High Court of Admiralty (TNA) and the Amsterdam notarial archives (Stadsarchief Amsterdam). Probably a minimum of 50,000 images, but could be many more.

We want to be able to crowdsource the tagging of the signoffs, having first used Transkribus, with its line and text area recognition capability (or some other technology), to create an XML pixel-level map of where the signoff is on the manuscript page.
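To make the pixel-level map concrete, here is a minimal sketch that reads region coordinates from a PAGE XML file (the format Transkribus exports) and turns a region into a IIIF Image API request for just that area. The sample XML, the region id `signoff_1` and the `example.org` image service are invented for illustration, and the namespace should be checked against your own exports, since other PAGE schema versions exist.

```python
import xml.etree.ElementTree as ET

# PAGE XML namespace (2013-07-15 schema version); verify against the xmlns
# declared in your own files.
PAGE_NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}


def region_bounding_boxes(page_xml: str) -> dict:
    """Map each TextRegion id to an (x, y, w, h) pixel bounding box."""
    root = ET.fromstring(page_xml)
    boxes = {}
    for region in root.iterfind(".//pc:TextRegion", PAGE_NS):
        coords = region.find("pc:Coords", PAGE_NS)  # the region's own polygon
        if coords is None or not coords.get("points"):
            continue
        # "points" is a space-separated list of "x,y" pairs.
        points = [tuple(map(int, p.split(","))) for p in coords.get("points").split()]
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        boxes[region.get("id")] = (min(xs), min(ys),
                                   max(xs) - min(xs), max(ys) - min(ys))
    return boxes


def iiif_region_url(image_service: str, box) -> str:
    """Build an Image API 2.x URL: {service}/{region}/{size}/{rotation}/{quality}.{format}."""
    x, y, w, h = box
    # Image API 3.0 servers expect "max" rather than "full" for the size.
    return f"{image_service}/{x},{y},{w},{h}/full/0/default.jpg"


SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="2000" imageHeight="3000">
    <TextRegion id="signoff_1">
      <Coords points="1200,2500 1700,2500 1700,2750 1200,2750"/>
    </TextRegion>
  </Page>
</PcGts>"""

boxes = region_bounding_boxes(SAMPLE)
url = iiif_region_url("https://example.org/iiif/example", boxes["signoff_1"])
print(url)
```

The polygon is reduced to its bounding rectangle because the IIIF Image API region parameter only accepts rectangles.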

The tagging data, in the case of a High Court of Admiralty deposition, will actually be derived from the start of the deposition, with the signoff coming at the end. The start is usually on the same page as the signoff, but could be as many as three to five pages earlier.

Prior to crowdsourcing this tagging, we would want to take all the admiralty court and notarial images and upload them to one or more IIIF servers.

That's how far we have got. I have had an initial discussion with Digirati about this, but have parked it for the moment, lacking funding. I will be giving a paper at the IIIF Washington DC conference, May 21-25th, at which I will lay out a vision for greater integration of the IIIF and Transkribus ecosystems, and make a pitch for the development of the functionality described above.

I would be very interested to hear the Signsofliteracy community's thoughts on this, and to see what sort of design solutions you come up with. We have some interest from the Technical Director of Pelagios/Recogito, @rsimon at the Austrian Institute of Technology, for this sort of functionality.

Addaci commented 6 years ago

A second, related idea we are looking at is user-created, shareable, tag-driven IIIF manifests: user created as opposed to project or archival created, though these are also important. We would probably want an authorship label in the manifest metadata to show the author.

Users LOVE personal and themed boards. Witness Pinterest. But our concept is to make the boards or manifests independent of viewing platform, so you could view them through any IIIF compatible viewer, such as Universal Viewer or Mirador, and not be forced to cut and paste your favourite playbills from @alexandermendes' LibCrowds, developed at the British Library. Your tags and, ideally, your and other people's semantic annotations would be available through your chosen IIIF viewer for interrogation.

How would it work? Here is an example: a user could select the tags:

lighterman; thames; wapping; signoff

This would generate a manifest containing all IIIF served images containing those tags.
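A minimal sketch of that generation step, assuming a IIIF Presentation API 2.x manifest and a simple in-memory index of tagged images. The `TAGGED_IMAGES` index, the `example.org` and `example.net` URLs and the `build_manifest` helper are all hypothetical; a real service would query a tag or annotation store instead.

```python
import json

# Hypothetical index of tagged, IIIF-served images (URLs and tags are made up).
TAGGED_IMAGES = [
    {"service": "https://iiif.example.org/image/hca-deposition-1",
     "width": 2000, "height": 3000,
     "tags": {"lighterman", "thames", "wapping", "signoff"}},
    {"service": "https://iiif.example.net/image/notarial-deed-1",
     "width": 1800, "height": 2600,
     "tags": {"mariner", "wapping", "signoff"}},
]


def build_manifest(manifest_id, selected_tags, author, images=TAGGED_IMAGES):
    """Return a Presentation 2.x manifest of every image carrying ALL selected tags."""
    wanted = set(selected_tags)
    matches = [img for img in images if wanted <= img["tags"]]
    canvases = []
    for n, img in enumerate(matches, start=1):
        canvas_id = f"{manifest_id}/canvas/{n}"
        canvases.append({
            "@id": canvas_id,
            "@type": "sc:Canvas",
            "label": f"Image {n}",
            "width": img["width"],
            "height": img["height"],
            "images": [{
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "on": canvas_id,
                "resource": {
                    "@id": f"{img['service']}/full/full/0/default.jpg",
                    "@type": "dctypes:Image",
                    "service": {
                        "@context": "http://iiif.io/api/image/2/context.json",
                        "@id": img["service"],
                        "profile": "http://iiif.io/api/image/2/level1.json",
                    },
                },
            }],
        })
    tag_label = "; ".join(sorted(wanted))
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": manifest_id,
        "@type": "sc:Manifest",
        "label": "Tagged: " + tag_label,
        # Authorship label, so a user-created manifest shows who made it.
        "metadata": [
            {"label": "Author", "value": author},
            {"label": "Tags", "value": tag_label},
        ],
        "sequences": [{"@type": "sc:Sequence", "canvases": canvases}],
    }


manifest = build_manifest(
    "https://example.org/manifests/demo",
    ["lighterman", "thames", "wapping", "signoff"],
    author="example-user",
)
print(json.dumps(manifest, indent=2))
```

Because each canvas points at the image service of whichever institution hosts the image, one manifest can mix content from several IIIF servers.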

We would have a simple pre-specified ontology of tags. In this case lighterman is in the class of the tag occupation; thames is in the class of the tag place of work; wapping is in the class of the tag place of residence; signoff is a class tag.
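One way to picture such an ontology is a lookup from each class tag to its known value tags. The sketch below is illustrative only: the `ONTOLOGY` contents beyond the four tags named above are assumptions drawn from earlier examples in this thread, and the members of the signoff class are deliberately left empty.

```python
# Hypothetical ontology: class tag -> known value tags (all lower case).
ONTOLOGY = {
    "occupation": {"lighterman", "mariner", "shipwright", "wine cooper"},
    "place of work": {"thames"},
    "place of residence": {"wapping", "cadiz"},
    "signoff": set(),  # class tag; its base tags are not filled in here
}


def class_of(tag):
    """Return the class a tag belongs to; a class tag maps to itself."""
    tag = tag.lower()
    if tag in ONTOLOGY:
        return tag
    for tag_class, values in ONTOLOGY.items():
        if tag in values:
            return tag_class
    raise KeyError(f"tag not in ontology: {tag}")


def group_by_class(selected_tags):
    """Group a user's selected tags under their classes."""
    grouped = {}
    for tag in selected_tags:
        grouped.setdefault(class_of(tag), []).append(tag.lower())
    return grouped


grouped = group_by_class(["lighterman", "thames", "wapping", "signoff"])
print(grouped)
```

Rejecting tags that are not in the ontology keeps crowdsourced tagging consistent across contributors, at the cost of needing a process for adding new values.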

The class tag signoff would contain four primary component or base tags. They would be

The ontology of the place of residence class tag is more complex, since in our case it needs to allow for

Because Signs of Literacy intends to be a comparative multi-country user driven community with multiple archival contributors of images and a mixture of general public and academic users, we need to think carefully about the conceptual design of the above.

We also need to think through the relationship between tags and annotations. As we engage with the IIIF consortium, with the help of IIIF technical coordinator Glen Robson and with the help of the Recogito annotation platform team, we hope to develop this thinking.

We would love to involve LibCrowds in this discussion and to learn from your experience and experimentation with tag driven IIIF manifests.

I would be delighted to include tag driven IIIF manifests as a specific discussion point on the agenda of our technical tools for exploring historical literacy workshop on June 5th 2018 in Amsterdam.

Addaci commented 6 years ago

I am pasting below two helpful responses from the @LibCrowds team and my further responses to their comments.

LibCrowds comments in LibCrowds Issues

Add user tags for tasks to enable shared image albums #631

Hi @Addaci, I'm waiting to have a conversation about this internally when people are back in the office, so apologies for the slow reply, I will get back to this!

It sounds like there could be some interesting opportunities for collaboration but ultimately it's going to come down to a matter of resources. We may not be able to commit that much time to developing this feature - in terms of development work it's only me working on this, as you've probably noticed!

However, I do hope to build this in a generic enough way that it could be useful for other projects. As mentioned before, the solution here probably involves generating IIIF Annotation lists for each tag, or set of tags. A reference to these lists could then be included in the original manifests.
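A rough sketch of what per-tag annotation lists could look like under the IIIF Presentation API 2.x, where a canvas points to its lists through `otherContent`. This is not the LibCrowds implementation: the helper names, the `example.org` URIs and the input tuples are assumptions for illustration.

```python
from urllib.parse import quote


def annotation_lists_by_tag(base_uri, tagged_regions):
    """Build one sc:AnnotationList per tag.

    tagged_regions: iterable of (canvas_id, (x, y, w, h), tag) tuples.
    """
    lists = {}
    for canvas_id, (x, y, w, h), tag in tagged_regions:
        anno_list = lists.setdefault(tag, {
            "@context": "http://iiif.io/api/presentation/2/context.json",
            "@id": f"{base_uri}/list/{quote(tag)}",
            "@type": "sc:AnnotationList",
            "resources": [],
        })
        anno_list["resources"].append({
            "@type": "oa:Annotation",
            "motivation": "oa:tagging",
            "resource": {"@type": "oa:Tag", "chars": tag},
            # Target the tagged rectangle on the canvas.
            "on": f"{canvas_id}#xywh={x},{y},{w},{h}",
        })
    return lists


def link_lists_to_canvas(canvas, lists):
    """Reference, from the canvas, every list holding an annotation on it."""
    prefix = canvas["@id"] + "#"
    for anno_list in lists.values():
        if any(a["on"].startswith(prefix) for a in anno_list["resources"]):
            canvas.setdefault("otherContent", []).append(
                {"@id": anno_list["@id"], "@type": "sc:AnnotationList"})
    return canvas


CANVAS_1 = "https://example.org/manifests/demo/canvas/1"
CANVAS_2 = "https://example.org/manifests/demo/canvas/2"
regions = [
    (CANVAS_1, (1200, 2500, 500, 250), "signoff"),
    (CANVAS_1, (1200, 2500, 500, 250), "mariner"),
    (CANVAS_2, (900, 2100, 400, 200), "signoff"),
]
lists = annotation_lists_by_tag("https://example.org/annotations", regions)
canvas = link_lists_to_canvas({"@id": CANVAS_1, "@type": "sc:Canvas"}, lists)
print(canvas["otherContent"])
```

Only the references live in the manifest; the lists themselves can be served and updated separately as new tags are contributed.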

Actually, we're already part of the way to this being implemented, in some form. We already have a way of creating one specific type of tag per project (e.g. see our 'Mark the titles' projects). These tags are serialised as Web Annotations but the plan is to also make them available as IIIF Annotation lists. It might be interesting to look at the LibCrowds data model.
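For orientation, a tag on an image region could be serialised as a W3C Web Annotation along the following lines. This is a generic sketch of the Web Annotation Data Model, not the LibCrowds data model; the function name, the `example.org` identifiers and the optional `creator` value are assumptions.

```python
import json


def tag_as_web_annotation(anno_id, canvas_id, box, tag, creator=None):
    """Serialise one tag on a rectangular region as a Web Annotation dict."""
    x, y, w, h = box
    annotation = {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "tagging",
        "body": {"type": "TextualBody", "purpose": "tagging", "value": tag},
        "target": {
            "source": canvas_id,
            # Media-fragment selector for the tagged rectangle.
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }
    if creator is not None:
        annotation["creator"] = creator  # who contributed the tag
    return annotation


annotation = tag_as_web_annotation(
    "https://example.org/annotations/1",
    "https://example.org/manifests/demo/canvas/1",
    (1200, 2500, 500, 250),
    "signoff",
    creator="https://example.org/users/example-user",
)
print(json.dumps(annotation, indent=2))
```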

If you were to use your own server, it should hopefully be pretty easy to get an instance of LibCrowds up and running. But I'll also check with others about the possibility of setting up a collection microsite on ours.

Addaci's comments in LibCrowds Issues

No problem. Feel free to call me to chat about it, or I'm happy to come into the British Library. I put the idea forward to the LibCrowds team for discussion, and even if there is no internal interest, I am still very happy to contribute further to the development of tagging/annotation capability tied to user creation of IIIF manifests. As further context, I am beginning to work informally with the Pelagios Commons team, including @rsimon, their technical director, who is driving the development of Recogito. This is not a formal partnership, but we will see where it goes. Pelagios/Recogito are interested in getting closer to GLAMs and to IIIF. I have introduced Rainer Simon to @Glenrobson, technical coordinator of IIIF, and they are now in discussion. I will also be giving a paper at the IIIF Washington DC conference on 'Creating an IIIF/Transkribus/Recogito enabled manuscript community to explore C17th literacy'. We are having a @signsofliteracy discussion on overlapping ecosystems, including IIIF and Glen Robson's response (issue 5), and on user tag driven IIIF manifests (issue 8).

@alexandermendes @mialondon Many thanks for this further response. Fully understand resource issues. The collaboration idea(s) is a medium to long term idea, kicking off in 2019, and would include funding from Chronoscopic Education (which is applying end May 2018 for charitable incorporated status). In the short term, we have tech constraints. Potentially, I could ask Rainer Simon, lead developer at Pelagios Commons, if he could help us install a IIIF server as a favour - presumably we need an annotation server as well? Or I could pay Digirati to do this. I could even try Klokan Technologies through their IIIF server service to see if they would help, presumably again paid. Ideal, though, since I only want to do a tiny demo (max 40 images), would be to be able to use a collection microsite - I would put up (say) twenty English High Court of Admiralty images and twenty Alle Amsterdamse Akten (Amsterdam notarial archives) images, and show the functionality in principle. I would ideally demo this at the IIIF conference in Washington, May 21-25, 2018, with full credit to LibCrowds. If that timing is too quick, I would demo it for the first time at our June 5th workshop at the Stadsarchief Amsterdam. I am at an early stage of getting the Huygens ING institute and the Digital Humanities Lab KNAW Humanities Cluster interested in collaboration in 2019 in one or more projects to be structured and grant money to be raised under the Signs of Literacy umbrella, and I am sure they would also be interested in this tag driven IIIF manifest functionality.