bootphon / wordseg

A Python toolbox for text based word segmentation
https://docs.cognitive-ml.fr/wordseg
GNU General Public License v3.0

change data for test, add tests for syllables #27

Closed (alecristia closed 6 years ago)

alecristia commented 6 years ago

Can we please use https://github.com/bootphon/wordseg/blob/new-docs/test/data/orthographic.txt and https://github.com/bootphon/wordseg/blob/new-docs/test/data/tagged.txt for the test?

And can we add tests to catch problems specific to syllables (like #25)?

mmmaat commented 6 years ago

300 lines is really big for a test. Can you select a subsample? Or should we restrict those 300 lines to the syllable test of TP (#25)?

alecristia commented 6 years ago

Sure, but the current test has 1k lines:

   5 ag_testeng.yld
   1 ag_testeng1.yld
   0 ag_testeng2.yld
  28 ag_testengger.lt

mmmaat commented 6 years ago

This is a very special test, used only once or twice. Usually the tests run on 10 utterances defined in test/__init__.py: https://github.com/bootphon/wordseg/blob/65aa776eb203119169a216cbb93af03968eb7fa1/test/__init__.py#L9

alecristia commented 6 years ago

I see! In that case, could we use the first 10 lines of that document (tagged.txt) for the test?

I'll still use the whole thing for the tutorial.

mmmaat commented 6 years ago

For the syllable test, I made a two-utterance text that reproduces the bug. And I will replace the old phonologic.txt with your new tagged.txt in the tests.
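
For illustration only, here is a minimal sketch of how a 10-utterance subsample could be taken from a larger file such as tagged.txt. The helper name `first_utterances` and the throwaway demo file are assumptions made for this example. They are not part of wordseg, and this is not necessarily how the project's test fixtures are built.

```python
import itertools
import os
import tempfile


def first_utterances(path, n=10):
    """Return the first `n` non-empty lines of `path`, stripped of whitespace.

    Hypothetical helper, not part of the wordseg API.
    """
    with open(path, encoding="utf-8") as handle:
        stripped = (line.strip() for line in handle)
        non_empty = (line for line in stripped if line)
        return list(itertools.islice(non_empty, n))


def _demo():
    """Build a 300-line stand-in for tagged.txt and take its first 10 lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tagged.txt")
        with open(path, "w", encoding="utf-8") as handle:
            for i in range(300):
                # Placeholder content, not real tagged utterances
                handle.write(f"utterance {i}\n")
        return first_utterances(path)


SAMPLE = _demo()
```

Reading lazily with `itertools.islice` stops after the requested number of utterances, so the full 300-line file can stay available for the tutorial while the tests only load a small sample.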