Closed by alecristia 6 years ago
OK, this is a bug. Thank you for reporting.
Hmm... this is working for me. What exact commands did you use? From bash, the option is -P/--probability; the -p option is for the phone separator.
(wordseg) mathieu@deaftone:~/dev/wordseg$ head -5 test/data/prepared.txt | wordseg-baseline -P 1
ay m iy n dh ax k aa p s aa r jh ah s t l uh k ax ng f ao r p iy p ax l dh ae t l uh k y ah ng g er
t eh n p iy p ax l k ao l s ow sh iy z l ay k ih t s iy z iy sh iy z l ay k ay g eh t p ey d t ax
v eh r iy ae k t ax v ax n ah m
m iy n y ae dh eh r w aa z ax t ay m ih t w aa z ax p aa r t ah v aw er k ah l ch er w iy d ih d n iy d t ax hh ah n t t ax iy t
m uw v t ax ax s ax b er b ax n eh r iy ax ax n t ih l k ax l ah m b ax s g eh t s dh eh r ae k t t ax g eh dh er
(wordseg) mathieu@deaftone:~/dev/wordseg$ head -5 test/data/prepared.txt | wordseg-baseline -P 0
aymiyndhaxkaapsaarjhahstluhkaxngfaorpiypaxldhaetluhkyahngger
tehnpiypaxlkaolsowshiyzlaykihtsiyziyshiyzlaykaygehtpeydtax
vehriyaektaxvaxnahm
miynyaedhehrwaazaxtaymihtwaazaxpaartahvawerkahlcherwiydihdniydtaxhhahnttaxiyt
muwvtaxaxsaxberbaxnehriyaxaxntihlkaxlahmbaxsgehtsdhehraekttaxgehdher
Sorry, you're right. Closing this now.
p=0 should treat each utterance as a single word; p=1 should cut at every phone (or syllable).
p=0 sample
y uwaar g aa nax hh ay ddh ax bl aak ih nmay shert aar yuw owyuwkiy p kl ahng k ax ngy ao r hh ehd ay thih ngk y uw aa r geht axngt ay erd ow yae
p=0.5
y uwaar g aan axhhay ddh axb laakih nmay sher taar y uw ow yuw kiypk l ah ngk ax ng yaor hheh d ay th ih ng k yuw aar geh t axng t ay erd owy ae
p=1
y uw aargaan axhhayd dhaxb l aakihnmay sh ert aary uw owyuwk iyp kl ahng kaxngy ao rhheh d ay thihngky uw aa rg eh tax ngtay er d ow y ae
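
For reference, the expected behaviour (p=0 keeps each utterance as one word, p=1 cuts after every phone or syllable) can be sketched in a few lines of Python. This is only an illustration of the idea, not wordseg's actual implementation: the function name `segment_random`, its parameters, and the choice of joining units with an empty string are all assumptions made for the example.

```python
import random


def segment_random(units, probability, rng=None):
    """Hypothetical sketch of a random baseline segmenter.

    `units` is a list of phones (or syllables) for one utterance.
    A word boundary is placed after each unit with the given probability.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if rng is None:
        rng = random.Random()

    words, current = [], []
    for unit in units:
        current.append(unit)
        # rng.random() lies in [0, 1), so probability=0 never cuts
        # and probability=1 cuts after every unit.
        if rng.random() < probability:
            words.append("".join(current))
            current = []
    if current:
        # Flush the last, unterminated word.
        words.append("".join(current))
    return " ".join(words)


if __name__ == "__main__":
    utterance = "v eh r iy ae k t ax v ax n ah m".split()
    print(segment_random(utterance, 0))    # one word per utterance
    print(segment_random(utterance, 1))    # one word per phone
    print(segment_random(utterance, 0.5))  # random cuts in between
```

With this sketch, p=0 and p=1 are deterministic, which makes them convenient sanity checks; any value in between depends on the random generator, so pass a seeded `random.Random` if reproducible output is needed.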