The basic algorithms can of course be used, as WSD (word-sense disambiguation) is not applied there (though it can optionally be applied).
Similarity between two sentences
Longest common string etc.
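
To illustrate the kind of basic, WSD-free similarity measure meant here, below is a minimal sketch of a longest-common-substring score between two sentences. It is not jProcessing's actual API: the function names `longest_common_substring` and `lcs_similarity` are hypothetical, and reading "longest common string" as "longest common substring" is an assumption. It works on raw characters, so it needs no tokenizer and should behave the same on classical text.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous run of characters shared by a and b."""
    best_len = 0   # length of the best match found so far
    best_end = 0   # index in `a` just past the end of that match
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def lcs_similarity(a: str, b: str) -> float:
    """Hypothetical score: shared substring length over the longer sentence length."""
    if not a or not b:
        return 0.0
    return len(longest_common_substring(a, b)) / max(len(a), len(b))


if __name__ == "__main__":
    print(longest_common_substring("私は学生です", "彼は学生です"))
    print(lcs_similarity("私は学生です", "彼は学生です"))
```

Normalizing by the longer sentence is only one possible choice; dividing by the shorter one would reward containment instead.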
Morphological Analyzer
jProcessing uses CaboCha. If your target is ancient Japanese text, you should be able to train CaboCha separately on your own training data
and then call it as-is via this Python library.
Finding parallel example sentences from Edict
There is no Japanese SentiWordNet.
So I used the English SentiWordNet, mapped the WordNet IDs, and then prepared polarity scores for the Japanese lexicon (and for the entries in Edict).
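
A rough sketch of that mapping idea follows. It is not the actual jProcessing code: the function name, the `(positive, negative)` score layout, the averaging rule, and all synset IDs and scores are made-up assumptions for illustration only.

```python
from statistics import mean


def build_japanese_polarity(sentiment_by_synset, synsets_by_lemma):
    """Project per-synset sentiment scores onto Japanese lemmas.

    sentiment_by_synset: synset ID -> (positive, negative) score pair,
        as one might load from the English SentiWordNet (assumed layout).
    synsets_by_lemma: Japanese lemma -> list of synset IDs it maps to
        (assumed to come from a WordNet ID mapping).
    Returns lemma -> polarity, the mean of (positive - negative) over the
    lemma's synsets. Lemmas with no scored synset are left out.
    """
    polarity = {}
    for lemma, synset_ids in synsets_by_lemma.items():
        scores = [
            sentiment_by_synset[sid][0] - sentiment_by_synset[sid][1]
            for sid in synset_ids
            if sid in sentiment_by_synset
        ]
        if scores:
            polarity[lemma] = mean(scores)
    return polarity


if __name__ == "__main__":
    # Toy data: these IDs and scores are invented, not real SentiWordNet entries.
    toy_sentiment = {
        "syn-good": (0.75, 0.0),
        "syn-fine": (0.5, 0.0),
        "syn-bad": (0.0, 0.625),
    }
    toy_mapping = {
        "良い": ["syn-good", "syn-fine"],
        "悪い": ["syn-bad"],
        "机": ["syn-desk"],  # no sentiment entry, so it is skipped
    }
    print(build_japanese_polarity(toy_sentiment, toy_mapping))
```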
We cannot use the sentiment analysis work (though it does look interesting). CaboCha looks interesting; however, do you know of any treebanks for Classical Japanese?
Hello,
Thank you for your wonderful library.
I run an open source project, the Classical Language Toolkit, which helps researchers do NLP in ancient and classical languages.
One of our contributors found your software and is interested in porting some of it for our users.
Because I do not know Japanese, I would like to learn whether jProcessing is suitable for old Japanese texts (say, up until the year AD 1600).
Thanks again for sharing your software with the world. Feel free to be in touch with me directly at kyle@kyle-p-johnson.com if you prefer!