MetPX / sarracenia

https://MetPX.github.io/sarracenia
GNU General Public License v2.0

Flow Test Fails (Testing on master branch) #164

Closed ghost closed 4 years ago

ghost commented 5 years ago

Commit 5c91d54a69a6f0035ae723df40f8392a4edaf953

TEST RESULTS

| dd.weather routing |
test  1 success: sr_shovel   (3809) t_dd1 should have the same number of items as t_dd2  (3804)
test  2 FAILURE: sr_winnow   (5843) should have the same of the number of items of shovels   (7613)
test  3 success: sr_sarra    (2877) should have the same number of items as winnows'post     (3030)
test  4 success: sr_subscribe    (2875) should have the same number of items as sarra        (2877)
                 | watch      routing |
test  5 FAILURE: sr_watch        (2356) should be 4 times subscribe t_f30         (2875)
test  6 success: sr_sender       (2351) should have the same number of items as sr_watch  (2356)
test  7 success: sr_subscribe u_sftp_f60 (2234) should have the same number of items as sr_sender (2351)
test  8 success: sr_subscribe cp_f61     (2217) should have the same number of items as sr_sender (2351)
                 | poll       routing |
test  9 FAILURE: sr_poll test1_f62   (1686) should have half the same number of items of sr_sender   (2351)
test 10 success: sr_subscribe q_f71  (1685) should have the same number of items as sr_poll test1_f62 (1686)
                 | flow_post  routing |
test 11 FAILURE: sr_post test2_f61   (1607) should have half the same number of items of sr_sender   (2351)
test 12 success: sr_subscribe ftp_f70    (1388) should have the same number of items as sr_post test2_f61 (1607)
test 13 FAILURE: sr_post test2_f61   (1607) should have about the same number of items as shim_f63   (1250)
                 | py infos   routing |
test 14 FAILURE: sr_shovel pclean_f90 (15) should have the same number of watched items winnows' post    (3030)
test 15 FAILURE: sr_shovel pclean_f92 (2) should have the same number of removed items winnows' post     (3030)
test 16 success: 0 messages received that we don't know what happened.
test 17 success: count of truncated headers (2876) and subscribed messages (2876) should have about the same number of items
                 | C          routing |
test 18 success: cpump both pelles (c shovel) should receive about the same number of messages (2250) (2251)
test 19 success: cdnld_f21 subscribe downloaded (656) the same number of files that was published by both van_14 and van_15 (656)
test 20 success: veille_f34 should post twice as many files (1312) as subscribe cdnld_f21 downloaded (656)
test 21 success: veille_f34 should post twice as many files (1312) as subscribe cfile_f44 downloaded (656)
test 22 FAILURE: Overall 14 of 21 passed (sample size: 3028) !

ghost commented 5 years ago

Commit 5c91d54

TEST RESULTS

| dd.weather routing |
test  1 success: sr_shovel   (4328) t_dd1 should have the same number of items as t_dd2  (4323)
test  2 success: sr_winnow   (8651) should have the same of the number of items of shovels   (8651)
test  3 success: sr_sarra    (4067) should have the same number of items as winnows'post     (4318)
test  4 success: sr_subscribe    (4063) should have the same number of items as sarra        (4067)
                 | watch      routing |
test  5 FAILURE: sr_watch        (3193) should be 4 times subscribe t_f30         (4063)
test  6 success: sr_sender       (3193) should have the same number of items as sr_watch  (3193)
test  7 success: sr_subscribe u_sftp_f60 (3193) should have the same number of items as sr_sender (3193)
test  8 success: sr_subscribe cp_f61     (3193) should have the same number of items as sr_sender (3193)
                 | poll       routing |
test  9 FAILURE: sr_poll test1_f62   (2457) should have half the same number of items of sr_sender   (3193)
test 10 success: sr_subscribe q_f71  (2457) should have the same number of items as sr_poll test1_f62 (2457)
                 | flow_post  routing |
test 11 FAILURE: sr_post test2_f61   (2455) should have half the same number of items of sr_sender   (3193)
test 12 success: sr_subscribe ftp_f70    (2455) should have the same number of items as sr_post test2_f61 (2455)
test 13 success: sr_post test2_f61   (2455) should have about the same number of items as shim_f63   (2455)
                 | py infos   routing |
test 14 FAILURE: sr_shovel pclean_f90 (1454) should have the same number of watched items winnows' post  (4318)
test 15 FAILURE: sr_shovel pclean_f92 (246) should have the same number of removed items winnows' post   (4318)
test 16 success: 0 messages received that we don't know what happened.
test 17 success: count of truncated headers (4067) and subscribed messages (4067) should have about the same number of items
                 | C          routing |
test 18 success: cpump both pelles (c shovel) should receive about the same number of messages (6767) (6774)
test 19 success: cdnld_f21 subscribe downloaded (1002) the same number of files that was published by both van_14 and van_15 (1002)
test 20 success: veille_f34 should post twice as many files (2004) as subscribe cdnld_f21 downloaded (1002)
test 21 success: veille_f34 should post twice as many files (2004) as subscribe cfile_f44 downloaded (1002)
test 22 FAILURE: Overall 16 of 21 passed (sample size: 4318) !
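
When comparing runs like the two above, it can help to tally the result lines mechanically rather than by eye. The following is a minimal sketch, not part of sarracenia: the `tally` function and the regular expression are hypothetical helpers that assume only the line format visible in the pasted reports (`test N success:` or `test N FAILURE:`), and they skip the final `Overall` summary line.

```python
import re

# Assumed line format, taken from the pasted reports, e.g.:
#   "test  2 FAILURE: sr_winnow   (5843) should have ..."
RESULT_LINE = re.compile(r"^test\s+(\d+)\s+(success|FAILURE):\s*(.*)$")


def tally(report: str):
    """Return (passed, failed) lists of test numbers, skipping the summary line."""
    passed, failed = [], []
    for line in report.splitlines():
        match = RESULT_LINE.match(line.strip())
        if not match:
            continue  # routing headers such as "| watch routing |"
        number = int(match.group(1))
        status = match.group(2)
        text = match.group(3)
        if text.startswith("Overall"):
            continue  # the last line is a summary, not an individual test
        (passed if status == "success" else failed).append(number)
    return passed, failed


# A few lines copied from the first report above.
sample = """\
test  1 success: sr_shovel   (3809) t_dd1 should have the same number of items as t_dd2  (3804)
test  2 FAILURE: sr_winnow   (5843) should have the same of the number of items of shovels   (7613)
                 | watch      routing |
test  5 FAILURE: sr_watch        (2356) should be 4 times subscribe t_f30         (2875)
test 22 FAILURE: Overall 14 of 21 passed (sample size: 3028) !
"""

passed, failed = tally(sample)
print(f"passed={passed} failed={failed}")
```

Running `tally` on two full reports and comparing the `failed` lists shows which tests fail in both runs (here 5, 9, 11, 14 and 15) and which fail in only one (2 and 13).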

petersilva commented 5 years ago

The flow_test kind of sucks... it's flaky, but it's the best we have. For your specific complaint: it's fine for me. The commit you are pointing to just comments out some debug lines, which has no effect; the flow test passes for me before and after that patch is applied. But yeah, this happens to me sometimes too. The flow_test is super finicky and breaks for lots of reasons, so you need to look into it. One sample reason: even though it is supposed to ignore other configurations, the result check sometimes mistakenly grabs log files from other configurations, which throws off the results. You can try deleting all the irrelevant log files. Have a chat with Ben... he has been wrestling with such things recently.

When this happens, try checking out a revision that used to work, and see if it still does. If it doesn't, then for sure something in your environment changed. There may be something you changed in default.conf or admin.conf, or an account problem. These are possibilities, but it can take a while to figure out which one applies.

The other dumb thing to try: just run ./flow_cleanup.sh and run the test again. Sometimes the data from the datamart has a pattern that upsets the test. In general, it takes a few tries to confirm that something is really wrong.
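
The "it takes a few tries" rule can be written down as a small loop. This is only a sketch under stated assumptions: `confirm_failure` and `run_once` are hypothetical names, and `run_once` stands for whatever cleans up (for example by calling ./flow_cleanup.sh), reruns the flow test, and reports whether it passed. The commands for the rerun itself are not given in this thread, so a stand-in is used here.

```python
from typing import Callable


def confirm_failure(run_once: Callable[[], bool], attempts: int = 3) -> bool:
    """Return True only if every attempt fails; stop at the first pass."""
    for attempt in range(1, attempts + 1):
        if run_once():
            print(f"attempt {attempt}: passed, earlier failures look like flakiness")
            return False
        print(f"attempt {attempt}: failed")
    return True


# Stand-in for real runs (hypothetical data): fails twice, then passes.
outcomes = iter([False, False, True])
really_broken = confirm_failure(lambda: next(outcomes))
print("really broken:", really_broken)
```

Stopping at the first pass keeps the total runtime down, which matters when a single flow test run takes a while.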

So, that's an overview of the kinds of things that can upset the flow_test. Check them out: it may be one of those or, if you're really lucky, it may be something new...

good luck!

ghost commented 5 years ago

The mentioned commit is just the last commit on master at the time of testing; I didn't mean to indicate that it was responsible for the observed results.

Thanks for clearing those points up. Noureddine told me something somewhat contradictory on Friday, which was the motivation for this issue in the first place. Let's have a chat at some point to clear things up!

petersilva commented 4 years ago

Closing this as moot now... the flow tests have changed a lot. They have now moved into travis-ci and run on every commit. They still fail sporadically, but there is nothing to be gained from this particular report.