MycroftAI / padatious

A neural network intent parser
http://padatious.readthedocs.io
Apache License 2.0

test_train_timeout_subprocess fail randomly #15

Open pgajdos opened 5 years ago

pgajdos commented 5 years ago

Hello,

The test in question fails with about 30% probability in our build system. I have extracted the following test case:

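import random
from time import monotonic

from padatious.intent_container import IntentContainer

# Two intents with 300 random sentences each (five single-letter "words"),
# intended to make training take longer than the short timeout below.
cont = IntentContainer('temp')
cont.add_intent('a',
                [' '.join(random.choice('abcdefghijklmnopqrstuvwxyz') for _ in range(5))
                 for __ in range(300)])
cont.add_intent('b',
                [' '.join(random.choice('abcdefghijklmnopqrstuvwxyz') for _ in range(5))
                 for __ in range(300)])

for x in range(10):
    start = monotonic()
    # The test expects a falsy return value, i.e. training hit the timeout.
    assert not cont.train_subprocess(timeout=0.1)
    end = monotonic()
    print(end - start)  # wall-clock seconds spent in train_subprocess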

When I run it, I get, for example:

 0.47674093791283667
 0.5609202678315341
 0.5488572919275612
 6.474134984891862
 0.4769664751365781
 0.45290810498408973
 0.470392829971388
 0.4690805918071419
 0.46847033803351223
 0.4608854129910469
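
For comparison, the elapsed time around any call that runs a child process under a timeout will exceed the timeout by however long startup and cleanup take. The following is a minimal standard-library sketch of that measurement, not padatious code: `run_with_timeout` is a hypothetical helper, the child is a plain `sleep`, and whether this kind of overhead explains the numbers above is an assumption.

```python
import subprocess
import sys
from time import monotonic


def run_with_timeout(timeout, child_sleep=5.0):
    """Start a child Python process that sleeps and enforce `timeout` seconds.

    Returns (timed_out, elapsed_seconds). Hypothetical helper for illustration.
    """
    cmd = [sys.executable, "-c", f"import time; time.sleep({child_sleep})"]
    start = monotonic()
    try:
        subprocess.run(cmd, timeout=timeout)
        timed_out = False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before re-raising,
        # so the elapsed time includes that cleanup.
        timed_out = True
    return timed_out, monotonic() - start


if __name__ == "__main__":
    for _ in range(3):
        timed_out, elapsed = run_with_timeout(0.1)
        # Print how far the wall-clock time overshoots the requested timeout.
        print(timed_out, elapsed - 0.1)
```

The overshoot printed by this sketch depends on the machine and its load, which is why a fixed upper bound on elapsed time can be fragile on shared build workers.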

MatthewScholefield commented 5 years ago

Sorry for the late reply. Curious, what platform is this build system running on?

pgajdos commented 5 years ago

It is 32-bit or 64-bit Linux. I do not remember much more; when I run it on a live system, I get verbatim:

$ python3 test.py
Some objects timed out while training
Took too long to train a
Took too long to train b
0.46342682399972546
Some objects timed out while training
0.5481992479999462
Some objects timed out while training
0.474773013000231
Some objects timed out while training
0.5743695310002295
Some objects timed out while training
0.4770706409999548
Some objects timed out while training
0.5607837820007262
Some objects timed out while training
Regenerated b.
Regenerated a.
6.3757639979994565
0.45383565700012696
0.4491451869998855
0.46560020400011126
$

Unfortunately, I do not understand the module or neural networks well enough to be sure I am not doing anything wrong. But the fact that the test fails in a certain percentage of runs seems to be correct. Currently, we are just skipping the test.

See build logs: 32-bit, 64-bit

Note that in the build log the measured value is only slightly more than 1 s, so this might be a different issue from the one above. The build system may be slower than my live system. We can either assign a better worker to this task or skip the test entirely.
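
The "skip" option could be made conditional, so the test still runs on faster machines. Below is a rough sketch using only `unittest`; the environment variable `SKIP_TIMING_TESTS`, the test class, and the 2-second budget are all assumptions for illustration and are not part of padatious or of our build system.

```python
import os
import unittest
from time import monotonic, sleep

# Hypothetical switch: this variable name is an assumption for illustration,
# not something padatious or any build system defines.
ON_SLOW_WORKER = os.environ.get("SKIP_TIMING_TESTS") == "1"


class TimingSensitiveTest(unittest.TestCase):
    @unittest.skipIf(ON_SLOW_WORKER, "timing-sensitive test skipped on slow workers")
    def test_work_stays_within_budget(self):
        start = monotonic()
        sleep(0.05)  # stand-in for the real work under test
        elapsed = monotonic() - start
        # Generous upper bound so a loaded machine does not fail spuriously.
        self.assertLess(elapsed, 2.0)


def run_suite():
    """Load and run the single test case, returning the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TimingSensitiveTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print("skipped:", len(result.skipped), "failures:", len(result.failures))
```

With the variable set to `1` the test is reported as skipped instead of failing; without it the test runs as usual.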

What do you suggest?