biblissima / collatinus

Sources of Collatinus software - Latin lemmatizer, morphological analyzer and scansion
http://outils.biblissima.fr/en/collatinus
GNU General Public License v3.0

Collatinus 11

The currently available version is 11.2, from the "Medieval" branch.

This repository holds the sources of Collatinus, a Latin lemmatizer, morphological analyzer and scansion tool. The sources of version 10 are available in collatinus-10-src.

Collatinus is free, open-source, cross-platform software (Mac, Windows and Debian GNU/Linux) that is easy to install and use.

Download page on the Biblissima website: http://outils.biblissima.fr/en/collatinus/ (binaries available for Mac OS, GNU/Linux and Windows).

Collatinus is both a lemmatiser and a morphological analyser for Latin texts: if a conjugated or declined form of a word is entered, it is capable of finding the correct root word to search for in the dictionary and then displaying its translation into another language, its different meanings, and any other information usually found in dictionaries.

In practice, Collatinus will be useful mostly for Latin teachers and professors who can quickly generate a complete lexical aid for any text and distribute it to their students. Students often use Collatinus as a reference when reading Latin texts, as they develop their vocabulary and language skills.

Main features

Project History

Originally, Collatinus was meant to produce printed documents, and it is still used for this purpose. Further improvements and adjustments were made when it became apparent that many people were using it for other purposes:

  1. as a lexical and morphological reference when reading a Latin text,
  2. for lexical and stylistic searches,
  3. to provide students with exercises based on Latin texts.

How it Works

Unlike the majority of lemmatisers, which use lists of inflected forms, Collatinus uses a lexicon containing the lemmas and all the information needed to inflect them. The advantage of this approach is that Collatinus, with its 11,000 lemmas, is capable of recognising over half a million forms. Adding lemmas with spelling variants (medieval spellings, for example) would make it possible to recognise all of their inflected forms as well.
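
The idea can be illustrated with a minimal Python sketch. It is not the Collatinus implementation (which is written in C++) and does not use its lexicon format: the `STEMS` and `ENDINGS` tables, the paradigm name and the `lemmatize` function are all hypothetical, and only the first declension is covered. The point is that a form is analysed by splitting it into a stem and an ending and looking up both, so no list of inflected forms is stored.

```python
# Hypothetical toy data, NOT the Collatinus lexicon format.
# Endings of the first declension, each mapped to the analyses it allows.
ENDINGS = {
    "first_declension": {
        "a": ["nom. sg.", "voc. sg.", "abl. sg."],
        "ae": ["gen. sg.", "dat. sg.", "nom. pl.", "voc. pl."],
        "am": ["acc. sg."],
        "arum": ["gen. pl."],
        "is": ["dat. pl.", "abl. pl."],
        "as": ["acc. pl."],
    },
}

# Stem -> list of (lemma, paradigm) pairs that use this stem.
STEMS = {
    "ros": [("rosa", "first_declension")],
    "puell": [("puella", "first_declension")],
}


def lemmatize(form):
    """Return every (lemma, analysis) pair compatible with the given form."""
    form = form.lower()
    analyses = []
    # Try every split of the form into a non-empty stem and a non-empty ending.
    for cut in range(1, len(form)):
        stem, ending = form[:cut], form[cut:]
        for lemma, paradigm in STEMS.get(stem, []):
            for tag in ENDINGS[paradigm].get(ending, []):
                analyses.append((lemma, tag))
    return analyses


print(lemmatize("rosarum"))
print(lemmatize("Puellae"))
```

In this sketch, adding one entry to `STEMS` makes every form of the new lemma recognisable, which is the property described above for spelling variants.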

Starting from a lemma and its associated inflectional endings, Collatinus can also display the corresponding inflection tables, which Latin learners may find useful.
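
Generating a table is the reverse operation: the stem is combined with each ending of its paradigm. The following sketch assumes a hypothetical layout (`CASES`, `FIRST_DECLENSION`, `inflection_table`, `format_table`) chosen for illustration only; it does not reflect how Collatinus stores or renders its tables.

```python
# Hypothetical toy data: first-declension endings, listed in the order of CASES.
CASES = ["nominative", "vocative", "accusative", "genitive", "dative", "ablative"]
FIRST_DECLENSION = {
    "singular": ["a", "a", "am", "ae", "ae", "a"],
    "plural": ["ae", "ae", "as", "arum", "is", "is"],
}


def inflection_table(stem, endings):
    """Return one (case, singular form, plural form) row per case."""
    return [
        (case, stem + endings["singular"][i], stem + endings["plural"][i])
        for i, case in enumerate(CASES)
    ]


def format_table(rows):
    """Render the rows as aligned plain-text columns."""
    lines = [f"{'case':<12}{'singular':<10}plural"]
    for case, singular, plural in rows:
        lines.append(f"{case:<12}{singular:<10}{plural}")
    return "\n".join(lines)


print(format_table(inflection_table("ros", FIRST_DECLENSION)))
```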

Finally, when syllable quantities are known for a given lemma, Collatinus can scan the word and even the entire text. When scanning a text, Collatinus applies the usual rules of elision and hiatus.
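
As a rough illustration of the elision rule only, the sketch below flags word pairs where a word ending in a vowel, or in a vowel followed by m, comes before a word beginning with a vowel or with h. It is a deliberately naive approximation with hypothetical function names: it does not model syllable quantities or hiatus, it treats every initial i and u as a vowel (so consonantal i, as in "iam", would be misjudged), and it says nothing about how Collatinus itself scans a text.

```python
import string

VOWELS = set("aeiouy")


def ends_elidable(word):
    """True if the word ends in a vowel or in a vowel followed by 'm'."""
    w = word.lower()
    if w[-1] in VOWELS:
        return True
    return len(w) >= 2 and w[-1] == "m" and w[-2] in VOWELS


def starts_with_vowel(word):
    """True if the word begins with a vowel, ignoring an initial 'h'."""
    w = word.lower()
    if w.startswith("h"):
        w = w[1:]
    return bool(w) and w[0] in VOWELS


def find_elisions(line):
    """Return the pairs of adjacent words between which elision would apply."""
    # Strip surrounding punctuation and drop tokens that become empty.
    words = [w.strip(string.punctuation) for w in line.split()]
    words = [w for w in words if w]
    return [
        (first, second)
        for first, second in zip(words, words[1:])
        if ends_elidable(first) and starts_with_vowel(second)
    ]


# Aeneid 1.3: elision occurs in "multum ille" and in "ille et".
print(find_elisions("multum ille et terris iactatus et alto"))
```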

Documentation

The help pages of Collatinus are also available on the web site of Biblissima.

The technical documentation, generated with Doxygen, can be found on the web site of Biblissima. Developers can also generate it themselves from the sources.

Licence

Collatinus is developed and maintained by Yves Ouvrard and Philippe Verkerk. It is made available under the GNU GPL v3 licence.