CIRCSE / LEMLAT3

Morphological analyzer and lemmatizer for Latin.
http://www.lemlat3.eu/

Provide the lemma DB raw data? #16

Closed PonteIneptique closed 5 years ago

PonteIneptique commented 5 years ago

Hi there :) I have come across the project multiple times, but one thing that troubles me is that I cannot find the list of lemmas in a raw format such as CSV/TSV. As a user of the application, I think it would be very helpful to have access to such a list.

Cheers !

gersh0m commented 5 years ago

Hi, if you mean the list of all lemmas that Lemlat can generate, they are all stored in the table 'lemmario', which you can find in the lemlat_db dump file. They are not actually in 'raw' format, but it should be quite easy to extract them in the desired format (the dump file is just plain, standard SQL). If you mean the list of lemmas 'linked' to a specific list of forms, you can get them directly in raw format by giving the correct parameters to the provided application (see usage on the home page).
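For readers who want to try the extraction described above, here is a minimal Python sketch. It is not part of LEMLAT. It assumes a mysqldump-style file in which each `INSERT INTO ... VALUES (...),(...);` statement sits on a single line. The function names (`parse_values`, `rows_for_table`, `write_csv`) are hypothetical, and the column layout of 'lemmario' is not assumed: every column is written out as-is.

```python
import csv
import re

# Start of an INSERT statement, with an optional column list (assumed dump layout).
INSERT_RE = re.compile(
    r"^INSERT INTO `?(\w+)`?(?:\s*\([^)]*\))?\s+VALUES\s*", re.IGNORECASE
)


def _finish(chars, quoted):
    """Turn collected characters into a field; an unquoted NULL becomes ''."""
    value = "".join(chars)
    return "" if (not quoted and value.upper() == "NULL") else value


def parse_values(text):
    """Yield one list of string fields per parenthesised tuple in a VALUES clause."""
    row, field = [], []
    in_str = in_row = quoted = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_str:
            if ch == "\\" and i + 1 < len(text):
                # Keep the escaped character literally (\' -> ', \\ -> \).
                # Escapes such as \n are NOT decoded in this sketch.
                field.append(text[i + 1])
                i += 2
                continue
            if ch == "'":
                if text[i + 1:i + 2] == "'":  # doubled quote inside a string
                    field.append("'")
                    i += 2
                    continue
                in_str = False
            else:
                field.append(ch)
        elif ch == "'" and in_row:
            in_str = quoted = True
        elif ch == "(" and not in_row:
            in_row, row, field, quoted = True, [], [], False
        elif ch == "," and in_row:
            row.append(_finish(field, quoted))
            field, quoted = [], False
        elif ch == ")" and in_row:
            row.append(_finish(field, quoted))
            yield row
            in_row = False
        elif in_row and not ch.isspace():
            field.append(ch)
        i += 1


def rows_for_table(lines, table):
    """Yield rows from every INSERT statement that targets `table`."""
    for line in lines:
        match = INSERT_RE.match(line)
        if match and match.group(1).lower() == table.lower():
            yield from parse_values(line[match.end():])


def write_csv(rows, out_file, delimiter=","):
    """Write rows to an open text file; pass delimiter='\t' for TSV."""
    writer = csv.writer(out_file, delimiter=delimiter, lineterminator="\n")
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count
```

To use it, open the dump file as text, pass the file object to `rows_for_table(dump, "lemmario")`, and hand the result to `write_csv` together with an output file opened with `newline=""`. The dump's actual file name and encoding are not stated in this thread, so check them before running.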

gfranzini commented 5 years ago

@PonteIneptique csv and tsv versions of the lemmario are now available here: https://github.com/CIRCSE/LEMLAT3/tree/master/lemlat_workspace/LemLat_Data

PonteIneptique commented 5 years ago

As we say in French: un prêté pour un rendu (one good turn deserves another)! Or simply: thanks!

gfranzini commented 5 years ago

De rien (you're welcome)! :)