morfologik / polimorfologik

Scripts for preprocessing morfologik data.

Where to get polimorfologik.txt? #3

Closed (dweiss closed 9 years ago)

dweiss commented 9 years ago

After the move from SourceForge, the prebuilt packages of morfologik source data are no longer available.

To get the source of the Polish dictionary (formerly polimorfologik.txt) you can run:

git clone --depth 1 git@github.com:morfologik/morfologik-stemming.git
java -jar lib/morfologik-tools-2.0.1.jar dict_decompile -i morfologik-stemming/morfologik-polish/src/main/resources/morfologik/stemming/polish/pl.dict
morfologik-stemming/morfologik-polish/src/main/resources/morfologik/stemming/polish/pl.input
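
For anyone without an SSH key registered on GitHub, the same steps can be sketched as a small shell function. This is only an illustration: the HTTPS clone URL and the function name decompile_polish_dict are assumptions that are not part of the original comment, while the jar location and the dict_decompile arguments are copied from the commands above.

```shell
#!/bin/sh
# Sketch only: same steps as above, but cloning over HTTPS instead of SSH
# (an assumption, not something the original comment prescribes).
REPO_URL="https://github.com/morfologik/morfologik-stemming.git"
# Jar location exactly as written in the comment above.
TOOLS_JAR="lib/morfologik-tools-2.0.1.jar"
# Directory that holds pl.dict inside the cloned repository.
DICT_DIR="morfologik-stemming/morfologik-polish/src/main/resources/morfologik/stemming/polish"

# Hypothetical helper name; nothing runs until it is called.
decompile_polish_dict() {
    git clone --depth 1 "$REPO_URL" &&
    java -jar "$TOOLS_JAR" dict_decompile -i "$DICT_DIR/pl.dict" &&
    # List the directory to see which files the tool produced.
    ls -l "$DICT_DIR"
}
```

Call decompile_polish_dict from a directory that contains lib/morfologik-tools-2.0.1.jar; the function stops at the first step that fails.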

I will also attach the latest binaries of the "synth" dictionary to this project.

dweiss commented 9 years ago

I also added polimorfologik.txt (the source of the dictionary) to this repo.

dweiss commented 9 years ago

Couldn't do it: there is a file size limit of 100 MB and the large files extension didn't work for me. Adding a binary release.

dweiss commented 9 years ago

https://github.com/morfologik/polimorfologik/releases/download/archival%2Fsourceforge%2F2.0.0/morfologik.zip
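
A minimal sketch of fetching that archive from the command line, assuming curl and unzip are installed. The function name fetch_release is hypothetical; the URL is the one posted above, and nothing is assumed about what the archive contains.

```shell
#!/bin/sh
# Release asset URL, copied verbatim from the comment above.
RELEASE_URL="https://github.com/morfologik/polimorfologik/releases/download/archival%2Fsourceforge%2F2.0.0/morfologik.zip"
# Local file name: everything after the last "/" in the URL.
ARCHIVE="${RELEASE_URL##*/}"

# Hypothetical helper name; nothing is downloaded until it is called.
fetch_release() {
    # -L follows redirects, -f makes curl fail on HTTP errors.
    curl -fL -o "$ARCHIVE" "$RELEASE_URL" &&
    # List the archive contents without extracting anything.
    unzip -l "$ARCHIVE"
}
```

Running fetch_release saves the archive in the current directory and prints its file listing.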

tomsien commented 8 years ago

Where can I find the new dict_synth? I tried to rebuild it, but after executing the command

java -jar morfologik-tools-2.0.1\lib\morfologik-tools-2.0.1.jar tab2morph -nw -i synt.txt -o synt_in.txt

I got:

Invalid argument: com.beust.jcommander.MissingCommandException: Expected a command, got tab2morph

I only have the fsa_build, fsa_dump, fsa_info, dict_compile, and dict_decompile commands available.

dweiss commented 8 years ago

Apologies for kicking in late. The synthesis dictionary (for Polish) is available as part of polimorfologik's distribution now.