YaleDHLab / voynich

Analyzing the Voynich Manuscript with computer vision
https://github.com/YaleDHLab/voynich/projects/1

draft workflow for comment #12

Closed by chirila 5 years ago

chirila commented 5 years ago

@duhaime I've put together a mindmap of the components of the project. The points in bold are the ones I think are a reasonable aim for a 3-week RPG. Could you comment? How much image metadata will need to be manually compiled?

Voynich Workflow.pdf

duhaime commented 5 years ago

This looks great!

Yes, I think we should be able to move through these issues in the allotted time. We might be able to dive more or less deeply into particular areas depending on how the data collection goes. I think I'll start moving those operations to the compute cluster to expedite things.

In terms of starting places once we have the data, I wanted to propose an alternative to pixplot. The DHLab has been working on a viewer that uses the same computer vision foundation as pixplot, but that makes it easier to navigate through the results in an analytic fashion. We're calling this project neural neighbors. There's a demo app running here if you're curious: https://s3-us-west-2.amazonaws.com/lab-apps/neural-neighbors-redux/index.html (It's not optimized for performance yet, but that will be easy to do as it just involves shrinking images).
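To sketch what that computer vision foundation looks like in practice, here's a minimal example of extracting one pooled InceptionV3 vector per image with Keras. The specific model and preprocessing choices are my assumptions, not pixplot's exact pipeline:

```python
# Minimal sketch of the feature extraction a pixplot-style tool relies on:
# one pooled InceptionV3 activation vector per image. Model and preprocessing
# choices here are assumptions, not pixplot's exact pipeline.
import numpy as np
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image

# include_top=False + pooling='avg' yields a 2048-d vector per image
model = InceptionV3(weights='imagenet', include_top=False, pooling='avg')

def image_vector(path):
    """Return a 2048-d feature vector for a single image file."""
    img = image.load_img(path, target_size=(299, 299))
    arr = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return model.predict(arr, verbose=0)[0]
```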

I thought we could add a new layout to that app that would allow users to proceed with the core use case for the Voynich images. Essentially, I think we want to analyze the Voynich + images from all other repositories we can grab, then ask: which images from the Voynich are quite similar to non-Voynich images?
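As a rough sketch of that comparison, assuming feature vectors like the ones above have already been extracted and saved (the `.npy` file names and the distance cutoff below are placeholders, not real project artifacts):

```python
# Sketch: find Voynich pages whose closest non-Voynich neighbor is very near
# in feature space. File names and the threshold value are placeholders.
import numpy as np
from sklearn.neighbors import NearestNeighbors

voynich_vecs = np.load('voynich_features.npy')       # shape (n_voynich, d)
other_vecs = np.load('other_repo_features.npy')      # shape (n_other, d)
other_paths = np.load('other_repo_paths.npy', allow_pickle=True)

# index the non-Voynich images, then query with every Voynich image
nn = NearestNeighbors(n_neighbors=1, metric='cosine').fit(other_vecs)
dists, idxs = nn.kneighbors(voynich_vecs)

# report Voynich pages whose best outside match falls under a distance cutoff
threshold = 0.2
for page, (dist, match) in enumerate(zip(dists[:, 0], idxs[:, 0])):
    if dist < threshold:
        print(f'voynich page {page} ~ {other_paths[match]} (cosine distance {dist:.3f})')
```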

I recently needed to perform essentially this exact task, asking: which images by publisher X are quite similar to images by any publisher except X? I used a crazy little web application that looked like this to browse my data (the first image in each row was by publisher X; each subsequent image was by a different publisher):

[screenshot: press-piracy-image-surfer interface]

I thought it would be worthwhile to code up an interface like this for neural neighbors and use it to scour through the results of the image processing. This interface would let users adjust toggles that specify the image similarity metric (e.g. color profile similarity, Wasserstein distance, perceptual hash distance, Inception vector distance), set a minimum threshold of image similarity to display, filter which Voynich ms images to show, and so forth. We could come up with a full specification of the features that would be useful, wireframe them with @mongmedia, and then build the interface.
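For concreteness, here's a sketch of how the metrics named above might be computed for a single image pair; the library choices (Pillow, imagehash, scipy) are my assumptions, and the real interface would precompute these across all pairs:

```python
# Illustrative implementations of the similarity metrics listed above for a
# single image pair; library choices (Pillow, imagehash, scipy) are assumptions.
import numpy as np
import imagehash
from PIL import Image
from scipy.stats import wasserstein_distance
from scipy.spatial.distance import cosine

def perceptual_hash_distance(path_a, path_b):
    """Hamming distance between perceptual hashes (0 = visually identical)."""
    return imagehash.phash(Image.open(path_a)) - imagehash.phash(Image.open(path_b))

def color_profile_distance(path_a, path_b, bins=64):
    """Wasserstein distance between grayscale intensity histograms."""
    def hist(path):
        pixels = np.asarray(Image.open(path).convert('L')).ravel()
        return np.histogram(pixels, bins=bins, range=(0, 255))[0].astype(float)
    centers = np.linspace(0, 255, bins)
    return wasserstein_distance(centers, centers, hist(path_a), hist(path_b))

def inception_vector_distance(vec_a, vec_b):
    """Cosine distance between two precomputed CNN feature vectors."""
    return cosine(vec_a, vec_b)
```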

I can see this interface actually performing many of the analytic functions outlined in the analytic branch of the workflow document you attached, but I wanted to ask: does this sound like a good direction @chirila?

chirila commented 5 years ago

Sounds great! I've made those changes to the draft application.

duhaime commented 5 years ago

Awesome! Once we get the paperwork finalized, we'll kick this project into high gear!

duhaime commented 5 years ago

I'm going to close this early sketch, as we've matured into a fuller vision.