kpu / kenlm

KenLM: Faster and Smaller Language Model Queries
http://kheafield.com/code/kenlm/

Segmentation fault (core dumped) when inserting into trie (in wav2letter script) #313

Open wahyubram82 opened 3 years ago

wahyubram82 commented 3 years ago

Hello, I'm wondering how to solve a segmentation fault that occurs when inserting spelling_idxs, word_idx, and score into the trie.

The script is here, line 156.

What I did is described here.

The script reads the lexicon file, in which each line contains a word followed by its spelling (the letters of the word), for example:

BETTER B E T T E R |
GOOD G O O D |
ANOTHER A N O T H E R |
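
To make the format concrete, here is a minimal sketch of how lines like the ones above could be parsed and checked before anything is inserted into the trie. The function names and the token set are hypothetical and only for illustration; they are not part of KenLM or wav2letter. The check assumes that every spelling token should also exist in the token set used by the acoustic model, which is my assumption and not something I have confirmed to be the cause of the crash.

```python
from typing import Dict, Iterable, List, Set, Tuple


def parse_lexicon_line(line: str) -> Tuple[str, List[str]]:
    """Split one lexicon line into (word, spelling tokens)."""
    parts = line.split()
    if len(parts) < 2:
        raise ValueError(f"malformed lexicon line: {line!r}")
    return parts[0], parts[1:]


def load_lexicon(lines: Iterable[str]) -> Dict[str, List[List[str]]]:
    """Map each word to the list of its spellings, skipping blank lines."""
    lexicon: Dict[str, List[List[str]]] = {}
    for raw in lines:
        if not raw.strip():
            continue
        word, spelling = parse_lexicon_line(raw)
        lexicon.setdefault(word, []).append(spelling)
    return lexicon


def find_unknown_tokens(
    lexicon: Dict[str, List[List[str]]], tokens: Set[str]
) -> Dict[str, List[str]]:
    """Return, per word, the spelling tokens missing from the token set."""
    unknown: Dict[str, List[str]] = {}
    for word, spellings in lexicon.items():
        missing = sorted({t for s in spellings for t in s if t not in tokens})
        if missing:
            unknown[word] = missing
    return unknown


# Hypothetical token set; a real one would come from the model's tokens file.
example_tokens = set("ABDEGHNORT") | {"|"}
example_lexicon = load_lexicon(
    ["BETTER B E T T E R |", "GOOD G O O D |", "ANOTHER A N O T H E R |"]
)
```

Running `find_unknown_tokens(example_lexicon, example_tokens)` on the three example lines reports no missing tokens. A non-empty result would point at words whose spelling cannot be mapped to valid indices.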

When iterating over that lexicon list (each word and its spelling), my script always fails with a segmentation fault (core dumped).

Can somebody from the KenLM devs give us a clue on how to solve it?

By the way, I built the binary from cleaned text: no special characters or anything else, and even the numbers have already been converted to words. It contains only words and spaces, and I created the binary from that.
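
A minimal sketch of how that cleaning could be verified, assuming the cleaned corpus should contain only ASCII letters and spaces (that alphabet is an assumption; adjust the pattern for other languages). The function name is hypothetical and not part of KenLM.

```python
import re
from typing import Iterable, List, Tuple

# Anything that is not an ASCII letter or a space counts as "dirty" here.
_DISALLOWED = re.compile(r"[^A-Za-z ]")


def find_dirty_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, offending characters) for dirty lines."""
    dirty: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        # Drop the line terminator so it is not reported as an offender.
        bad = sorted(set(_DISALLOWED.findall(line.rstrip("\r\n"))))
        if bad:
            dirty.append((number, "".join(bad)))
    return dirty
```

An empty result means every line passes this check. Any reported line would need to be cleaned before rebuilding the binary.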

I did not create a trie file like in DeepSpeech, because wav2letter / fairseq does not ask me to do that.

Thanks in advance, I really appreciate any clue...