viking-sudo-rm / voynich2vec

Applying word2vec embeddings to the problem of deciphering the Voynich manuscript.

Self alignment in known languages #6

Open viking-sudo-rm opened 6 years ago

viking-sudo-rm commented 6 years ago

Quoting @chirila in #4:

Can you do a bunch of "self" alignments for these languages? If we end up with consistent patterns in what types of morphology align, that would be a way to make some guesses about Voynich morphology.

@chirila what do you have in mind for this analysis? Should we just run it and look for patterns in the data manually, or is there some kind of more sophisticated analysis to do?

chirila commented 6 years ago

I think most of it would need to be manual (unless we have access to parsers for enough of the languages that we can identify the morphology that's contributing). Some things that would be useful to quantify:

- Levenshtein distance for nearest neighbors.

- Is the closeness of the match comparable from text to text? That is, does a similarity of 0.001 in one text mean the same thing as in another? If so, it would be helpful to know the profiles of the top matches for various thresholds.

- I wonder if we can build a profile of what pronouns look like. This is probably not going to work, but could we take a word, look at its profile of neighbors, and use that to train a classifier? For example, pronouns should have lots of neighbors that aren't otherwise very close to one another.
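To make the quantities above concrete, here is a rough sketch of how they might be computed per word. It is only an illustration, not code from this repo: `toy_vectors`, `nearest_neighbors`, and `neighbor_profile` are hypothetical names, the two-dimensional vectors are made up, and in practice the vectors would come from the trained embeddings for each text. `neighbor_coherence` is one possible way to measure whether a word's neighbors are close to one another, which is the property the pronoun idea relies on.

```python
import math
from statistics import mean


def levenshtein(a, b):
    """Edit distance between two strings (two-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(word, vectors, k=3):
    """Return the k most similar other words as (word, similarity) pairs."""
    sims = [(other, cosine(vectors[word], vec))
            for other, vec in vectors.items() if other != word]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:k]


def neighbor_profile(word, vectors, k=3):
    """Summarize a word's neighborhood with three hypothetical features."""
    nbrs = nearest_neighbors(word, vectors, k)
    names = [w for w, _ in nbrs]
    # Similarity among the neighbors themselves, excluding the query word.
    pair_sims = [cosine(vectors[a], vectors[b])
                 for i, a in enumerate(names) for b in names[i + 1:]]
    return {
        "mean_similarity": mean([s for _, s in nbrs]),
        "mean_edit_distance": mean([levenshtein(word, w) for w in names]),
        "neighbor_coherence": mean(pair_sims) if pair_sims else 0.0,
    }


# Made-up 2-d vectors, purely for illustration.
toy_vectors = {
    "walk": [1.0, 0.0],
    "walks": [0.9, 0.1],
    "walked": [0.8, 0.2],
    "it": [0.5, 0.5],
    "of": [0.0, 1.0],
}

for w in toy_vectors:
    print(w, neighbor_profile(w, toy_vectors, k=2))
```

Comparing the distribution of `mean_similarity` across texts would be one way to check whether a given similarity value means the same thing from text to text. The profile dictionaries could also serve as feature vectors for a classifier, if labeled examples are available for the known languages.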

On Tue, Jun 5, 2018 at 9:58 PM, Will Merrill notifications@github.com wrote:

--

Claire Bowern Professor, Director of Graduate Studies Chair: Yale Women Faculty Forum (wff.yale.edu) Department of Linguistics New Haven, CT 06511