gap-packages / io

GAP package IO to do input and output
https://gap-packages.github.io/io/

Intermittent failure in `sendstringbackground.tst` in GitHub Actions macOS job #105

Closed wilfwilson closed 1 year ago

wilfwilson commented 2 years ago

Example 1, Example 2:

+ /Users/runner/gap/bin/gap.sh -l '/Users/runner/work/io/io/gaproot;' --quitonbreak --cover coverage/mMgjCe.coverage tst/testall.g
 ┌───────┐   GAP 4.12dev built on 2022-01-25 16:44:56+0000
 │  GAP  │   https://www.gap-system.org
 └───────┘   Architecture: x86_64-apple-darwin20.6.0-default64-kv8
 Configuration:  gmp 6.2.1, GASMAN
 Loading the library and packages ...
 Packages:   AClib 1.3.2, Alnuth 3.1.2, AtlasRep 2.1.0, AutoDoc 2020.08.11, 
             AutPGrp 1.10.2, CRISP 1.4.5, Cryst 4.1.24, CrystCat 1.1.9, 
             CTblLib 1.3.2, FactInt 1.6.3, FGA 1.4.0, Forms 1.2.6, 
             GAPDoc 1.6.4, genss 1.6.6, IO 4.7.2, IRREDSOL 1.4.3, 
             LAGUNA 3.9.3, orb 4.8.4, Polenta 1.3.9, Polycyclic 2.16, 
             PrimGrp 3.4.1, RadiRoot 2.8, recog 1.3.2, ResClasses 4.7.2, 
             SmallGrp 1.4.2, Sophus 1.24, SpinSym 1.5.2, TomLib 1.2.9, 
             TransGrp 3.3, utils 0.72
 Try '??help' for help. See also '?copyright', '?cite' and '?authors'
Architecture: x86_64-apple-darwin20.6.0-default64-kv8

testing: /Users/runner/work/io/io/gaproot/pkg/io/tst/all.tst
       6 ms (0 ms GC) and 198KB allocated for all.tst
testing: /Users/runner/work/io/io/gaproot/pkg/io/tst/bugfix.tst
     431 ms (12 ms GC) and 24.1MB allocated for bugfix.tst
testing: /Users/runner/work/io/io/gaproot/pkg/io/tst/children.tst
     555 ms (0 ms GC) and 4.25MB allocated for children.tst
testing: /Users/runner/work/io/io/gaproot/pkg/io/tst/funcs.tst
       3 ms (0 ms GC) and 65.9KB allocated for funcs.tst
testing: /Users/runner/work/io/io/gaproot/pkg/io/tst/sendstringbackground.tst
########> Diff in /Users/runner/work/io/io/gaproot/pkg/io/tst/sendstringb\
ackground.tst:2
# Input is:
IsBound(HPCGAP) or ForAll([1..3000], x -> IO_SendStringBackground(f, "cheese")\
);
# Expected output:
true
# But found:
#E Overflow in table of ignored processes#E Overflow in table of ign\
ored proce\
sses#E Overflow in table of ignored processes#E Overflow in table of ignored p\
\
rocesses#E Overflow in table of ignored processes#E Overflow in table of ignor\
\
ed processes#E Overflow in table of ignored processes#E Overflow in table of i\
\
gnored processes#E Overflow in table of ignored processes#E Overflow in table \
\
of ignored processes#E Overflow in table of ignored processes#E Overflow in ta\
\
ble of ignored processes#E Overflow in table of ignored processes#E Overflow i\
\
n table of ignored processes#E Overflow in table of ignored processes#E Overfl\
\
ow in table of ignored processes#E Overflow in table of ignored processes#E Ov\
\
erflow in table of ignored processes#E Overflow in table of ignored processes#\
\
E Overflow in table of ignored processes#E Overflow in table of ignored proces\
\
ses#E Overflow in table of ignored processes#E Overflow in table of ignored pr\
\
ocesses#E Overflow in table of ignored processes#E Overflow in table of ignore\
\
d processes#E Overflow in table of ignored processes#E Overflow in table of ig\
\
nored processes#E Overflow in table of ignored processes#E Overflow in table o\
\
f ignored processes#E Overflow in table of ignored processes#E Overflow in tab\
\
le of ignored processes#E Overflow in table of ignored processes#E Overflow in\
\
 table of ignored processes#E Overflow in table of ignored processes#E Overflo\
\
w in table of ignored processes#E Overflow in table of ignored processes#E Ove\
\

[...etc, omitted...]

########
     198 ms (0 ms GC) and 288KB allocated for sendstringbackground.tst
testing: /Users/runner/work/io/io/gaproot/pkg/io/tst/testgap.tst
    9827 ms (562 ms GC) and 1.26GB allocated for testgap.tst
testing: /Users/runner/work/io/io/gaproot/pkg/io/tst/timeout.tst
      85 ms (79 ms GC) and 111KB allocated for timeout.tst
testing: /Users/runner/work/io/io/gaproot/pkg/io/tst/waitpid.tst
      74 ms (72 ms GC) and 55.5KB allocated for waitpid.tst
-----------------------------------
total     11179 ms (725 ms GC) and 1.29GB allocated
              1 failures in 1 of 8 files

#I  Errors detected while testing

Error: Process completed with exit code 1.

wilfwilson commented 2 years ago

The GitHub Actions job uses macOS 11. On my own computer, which runs macOS 12, I do not have any problem.

fingolfin commented 1 year ago

This basically fails 90% of the time right now :-(.

@ChrisJefferson any idea what might be causing this?

fingolfin commented 1 year ago

Ohhhh, OK, so we do `ForAll([1..3000], x -> IO_SendStringBackground(f, "cheese"));` where `f` refers to `/dev/null`. And `IO_SendStringBackground` forks, and then does `IO_IgnorePid` -- but we can only "ignore" 1024 PIDs at a time. If the processes terminate quickly enough, that's fine, but if we spawn them faster than they die off, it is easy to see how this test could overflow that table.
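To make the race concrete, here is a toy model in Python. It is only a sketch, not the IO package's code: it assumes one child is spawned per "tick", every child lives for the same number of ticks, and a spawn that finds the table full is counted as an overflow and not tracked. The function name `count_overflows` and its parameters are made up for illustration; only the 1024 capacity and the 3000 spawns come from this thread.

```python
from collections import deque


def count_overflows(spawns: int, capacity: int, lifetime: int) -> int:
    """Count spawns that find the fixed-size table full (toy model)."""
    tracked = deque()  # exit ticks of the children currently in the table
    overflows = 0
    for tick in range(spawns):
        # Reap: free the slot of every child whose exit tick has passed.
        while tracked and tracked[0] <= tick:
            tracked.popleft()
        if len(tracked) >= capacity:
            overflows += 1  # no free slot: this is the "Overflow" case
        else:
            tracked.append(tick + lifetime)
    return overflows


# Children that die quickly never fill the table ...
print(count_overflows(spawns=3000, capacity=1024, lifetime=10))
# ... but children that outlive the whole loop overflow it after 1024 spawns.
print(count_overflows(spawns=3000, capacity=1024, lifetime=5000))
```

In this model the outcome depends only on how the child lifetime compares with the spawn rate, which would fit the failure showing up on a loaded CI machine and not on a faster local one.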

So we could just reduce the 3000 to 1024 (or a bit smaller, to be safe). But perhaps @ChrisJefferson intentionally wanted to "overflow" the 'ignore pid' list? But how is that supposed to work?

We could also sleep a little bit after each call to `IO_SendStringBackground` in that test, to increase the chance that the forked jobs can finish in time. But on a machine under heavy load (such as a CI VM...) this would still have a chance of breaking...

ChrisJefferson commented 1 year ago

Looking back, I think various tests were designed to check we "clean up" properly -- of course if we go too fast I can see how things won't get cleaned up properly.

I'd prefer fixing the failures; we could hypothetically make another test to check if things are "cleaned up" properly, which would probably have to have a bunch of 'waits' in it, to try to make sure the OS is given the chance to clean up finished processes.
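One way such a test with 'waits' might look, sketched in Python and not in GAP: block on the oldest child whenever too many are outstanding, so the number of unreaped processes stays bounded no matter how slow the machine is. This is an illustration under assumptions, not a proposal for the actual test file; `spawn_bounded` and `max_outstanding` are hypothetical names, and the children here are trivial Python processes standing in for the forked senders.

```python
import subprocess
import sys


def spawn_bounded(total: int, max_outstanding: int) -> int:
    """Spawn `total` short-lived children, never leaving more than
    `max_outstanding` unreaped. Returns the peak number outstanding."""
    running = []
    peak = 0
    for _ in range(total):
        if len(running) >= max_outstanding:
            # Block until the oldest child has exited and been reaped.
            running.pop(0).wait()
        running.append(subprocess.Popen([sys.executable, "-c", "pass"]))
        peak = max(peak, len(running))
    # Reap whatever is still running before returning.
    for child in running:
        child.wait()
    return peak


if __name__ == "__main__":
    print(spawn_bounded(total=20, max_outstanding=4))
```

A blocking wait does not depend on timing the way a fixed sleep does, so it should behave the same on a heavily loaded CI VM, at the cost of no longer exercising the "spawn faster than they die" path.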