viking-sudo-rm / voynich2vec

Applying word2vec embeddings to the problem of deciphering the Voynich manuscript.

Semi-supervised (vs. unsupervised) embedding alignment #10

Open viking-sudo-rm opened 6 years ago

viking-sudo-rm commented 6 years ago

I've come across some papers that approach the same task as MUSE (aligning word embeddings in two languages), except that they start with a small vocabulary of aligned words and try to bootstrap the rest of the alignment from that.

This could be relevant for Voynich because the semi-supervised task is probably much less data-hungry than the unsupervised task MUSE was approaching, which is meant for large-scale datasets for information retrieval systems.

viking-sudo-rm commented 6 years ago

For example, Artetxe et al. (2017) bootstrap from a seed dictionary of numerals only. So a small seed like the month names might be enough to work with in Voynich.
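
To make the bootstrapping idea concrete, here is a rough sketch of a self-learning loop: fit an orthogonal map on the seed pairs, re-induce a dictionary by nearest neighbour, and repeat. It is only in the spirit of that approach, not the paper's actual implementation (which has its own normalization and stopping criterion). The function names, the toy data, and the fixed iteration count are all my own assumptions for illustration.

```python
import numpy as np

def normalize_rows(M):
    # Scale every embedding to unit length so dot products are cosine similarities.
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def procrustes(X, Y):
    # Orthogonal W minimizing ||X W - Y||_F (orthogonal Procrustes solution).
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def self_learning_align(src, tgt, seed_pairs, n_iter=5):
    # src, tgt: (n_words, dim) embedding matrices.
    # seed_pairs: list of (src_index, tgt_index) known translations.
    src, tgt = normalize_rows(src), normalize_rows(tgt)
    pairs = list(seed_pairs)
    for _ in range(n_iter):
        s_idx = [i for i, _ in pairs]
        t_idx = [j for _, j in pairs]
        # Fit the mapping on the current dictionary.
        W = procrustes(src[s_idx], tgt[t_idx])
        # Re-induce the dictionary: nearest target word for every source word.
        sims = (src @ W) @ tgt.T
        pairs = [(i, int(j)) for i, j in enumerate(sims.argmax(axis=1))]
    return W, pairs

# Toy, noise-free check: the "source language" is a rotated copy of the target.
rng = np.random.default_rng(0)
dim, n_words = 5, 50
tgt_emb = rng.standard_normal((n_words, dim))
Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # random orthogonal matrix
src_emb = tgt_emb @ Q
seed = [(i, i) for i in range(10)]  # pretend only 10 word pairs are known
W, induced = self_learning_align(src_emb, tgt_emb, seed)
```

For Voynich, `seed` would be the hypothesized month-name pairs, and real embeddings will be far noisier than this toy rotation, so whether the loop converges to anything sensible is an open question.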