Open eliso7 opened 2 years ago
Windows is only supported by other Windows users. But are you trying the latest version of the code from this repository? There was a problem earlier that caused segfaults. Also try a memory setting like 2.5G in case there's some 32-bit weirdness.
Why is this happening?
python3 generate_lm.py --input_txt data.txt --output_dir . --top_k 2 --kenlm_bins /mnt/c/Users/eliso/speech2text/STT/kenlm/build/bin/ --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie
Converting to lowercase and counting word occurrences ...
| |# | 198 Elapsed Time: 0:00:00
Saving top 2 words ...
Calculating word statistics ...
  Your text file has 398 words in total
  It has 3 unique words
  Your top-2 words are 85.1759 percent of all words
  Your most common word "sentence" occurred 199 times
  The least common word in your top-k is "another" with 140 times
  The first word with 199 occurrences is "sentence" at place 0
Creating ARPA file ...
=== 1/5 Counting and sorting n-grams ===
Reading /mnt/c/Users/eliso/speech2text/STT/data/lm/lower.txt.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Traceback (most recent call last):
  File "generate_lm.py", line 232, in <module>
    main()
  File "generate_lm.py", line 216, in main
    build_lm(args, data_lower, vocab_str)
  File "generate_lm.py", line 99, in build_lm
    subprocess.check_call(subargs)
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/mnt/c/Users/eliso/speech2text/STT/kenlm/build/bin/lmplz', '--order', '5', '--temp_prefix', '.', '--memory', '85%', '--text', './lower.txt.gz', '--arpa', './lm.arpa', '--discount_fallback', '--prune', '0', '0', '1']' died with <Signals.SIGSEGV: 11>.
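To try the memory suggestion from the reply without going through the whole of generate_lm.py, the failing lmplz call from the traceback can be rebuilt with a fixed memory size in place of "85%". This is only a rough sketch, not what generate_lm.py does internally: the helper names build_lmplz_cmd, describe_exit and run_lmplz are made up for illustration, and the flags are copied from the command in the traceback above. It also assumes a POSIX environment such as WSL, where a negative return code means the child process was killed by a signal.

```python
import os
import signal
import subprocess


def build_lmplz_cmd(kenlm_bins, text, arpa, memory="2.5G", order=5,
                    prune=("0", "0", "1")):
    """Assemble the lmplz call seen in the traceback, with a fixed memory size."""
    return [
        os.path.join(kenlm_bins, "lmplz"),
        "--order", str(order),
        "--temp_prefix", ".",
        "--memory", memory,  # fixed size instead of the original "85%"
        "--text", text,
        "--arpa", arpa,
        "--discount_fallback",
        "--prune", *prune,
    ]


def describe_exit(returncode):
    """Turn a subprocess return code into a readable status string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_lmplz(cmd):
    """Run lmplz without raising, so the exit status can be inspected."""
    result = subprocess.run(cmd)
    return describe_exit(result.returncode)
```

Calling run_lmplz(build_lmplz_cmd("/mnt/c/Users/eliso/speech2text/STT/kenlm/build/bin/", "./lower.txt.gz", "./lm.arpa")) from the output directory would then report whether lmplz still dies with SIGSEGV at 2.5G. The same setting can presumably be passed to the original script as --max_arpa_memory 2.5G, since the traceback shows that value being forwarded to --memory.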