metrumresearchgroup / bbi

Next generation modeling platform
https://metrumresearchgroup.github.io/bbi/docs
12 stars 2 forks source link

TestBBIExpandsWithoutPrefix failing intermittently #218

Open seth127 opened 3 years ago

seth127 commented 3 years ago

TestBBIExpandsWithoutPrefix is failing roughly 10% of the time that it runs in drone. Additionally, TestBBIExpandsWithPrefix fails occasionally too (<5% of the time).

I reproduced this on a 2 vCPU workflow, by running the test 100x in a bash loop. Interestingly, when run by itself, TestBBIExpandsWithoutPrefix failed only 2 of 100 times, but when I ran all three tests in that file, it failed 9 of 100 times and TestBBIExpandsWithPrefix failed an additional 2 times.

Running only TestBBIExpandsWithoutPrefix

export MPIEXEC_PATH="/usr/bin/mpiexec"
export NONMEMROOT="/opt/NONMEM"
export NMVERSION="nm74gf"
export SGE="true"
export POST_EXECUTION="true"
export NMQUAL="true"
export LOCAL="true"
export ROOT_EXECUTION_DIR="/tmp/bbisethtest/"

for i in $(seq 1 100)
do
    echo "@@@@@@ ${i} @@@@@@"
    go test -v -run TestBBIExpandsWithoutPrefix
done

Got 2 failures

Running TestBBIExpandsWithoutPrefix, TestBBIExpandsWithPrefix, and TestBBIExpandsWithPrefixToPartialMatch

export MPIEXEC_PATH="/usr/bin/mpiexec"
export NONMEMROOT="/opt/NONMEM"
export NMVERSION="nm74gf"
export SGE="true"
export POST_EXECUTION="true"
export NMQUAL="true"
export LOCAL="true"
export ROOT_EXECUTION_DIR="/tmp/bbisethtest/"

for i in $(seq 1 100)
do
    echo "@@@@@@ ${i} @@@@@@"
    go test -v -run TestBBIExpandsWithoutPrefix
done

Got 11 total failures.

Results of these tests (containing the failure messages) are in the comment below

seth127 commented 3 years ago

here are the full logs from the 100 tests ExpandsWitSTAR.log ExpandsWithoutPrefix.log

seth127 commented 3 years ago

Here is an example of one of the failures. The relevant error seems to be [102] output details were: File 102.ctl does not exist. Not sure why this file doesn't exist (I assume it means the one in the child dir that is being targeted by the shell script) but my guess is that it's some kind of race condition where the 102.sh gets invoked before the 102.ctl has been copied through. Need to look into it further.

@@@@@@ 1 @@@@@@
=== RUN   TestBBIExpandsWithoutPrefix
time="2021-05-26T11:18:19-04:00" level=info msg="Beginning work with /tmp/bbisethtest/working as the root"
time="2021-05-26T11:18:55-04:00" level=info msg="Command was 'bbi' while arguments were -d --threads 2 nonmem run local --nm_version nm74gf /tmp/bbisethtest/working/bbi_expansion/model/10[1:5].ctl"
time="2021-05-26T11:18:55-04:00" level=error msg="An error occurred trying to execute model. Error details are : exit status 1"
time="2021-05-26T11:18:55-04:00" level=error msg="Exit code was 1, details were exit status 1"
time="2021-05-26T11:18:55-04:00" level=error msg="output details were: time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"Successfully loaded default configuration from /data/home/sethg/bbi_testing/bbitest/bbi.yaml\"
time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"Beginning Local Path\"
time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"Setting logging to DEBUG\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Locating models from arguments\"
time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"expanding model pattern: /tmp/bbisethtest/working/bbi_expansion/model/10[1:5].ctl \\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"expanded models: [/tmp/bbisethtest/working/bbi_expansion/model/101.ctl /tmp/bbisethtest/working/bbi_expansion/model/102.ctl /tmp/bbisethtest/working/bbi_expansion/model/103.ctl /tmp/bbisethtest/working/bbi_expansion/model/104.ctl /tmp/bbisethtest/working/bbi_expansion/model/105.ctl] \\"
time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"A total of 5 models have completed the initial preparation phase\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Building turnstile manager\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Beginning turnstile execution\"
time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"Manager Details: Working: 0, Errors: 0, Completed: 0, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[101] Beginning local preparation phase\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[101] Overwrite is currently set to true\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[101] Writing initial gitignore file\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Requested save of config file /tmp/bbisethtest/working/bbi_expansion/model/101\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[101] Creating local execution script\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[101] beginning script command generation. NMQual is set to false\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Generated command template for local execution is: #!/bin/bash\\#$ -wd /tmp/bbisethtest/working/bbi_expansion/model/101\\/opt/NONMEM/nm74gf/run/nmfe74 101.ctl  101.lst \\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[101] Writing script to file\"
time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"[101] Beginning local work phase\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Output directory is currently set to /tmp/bbisethtest/working/bbi_expansion/model/101\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Script location is pegged at /tmp/bbisethtest/working/bbi_expansion/model/101/101.sh\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[101] Generated command was: /tmp/bbisethtest/working/bbi_expansion/model/101/101.sh\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[102] Beginning local preparation phase\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[102] Overwrite is currently set to true\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[102] Writing initial gitignore file\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Requested save of config file /tmp/bbisethtest/working/bbi_expansion/model/102\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[102] Creating local execution script\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[102] beginning script command generation. NMQual is set to false\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Generated command template for local execution is: #!/bin/bash\\#$ -wd /tmp/bbisethtest/working/bbi_expansion/model/102\\/opt/NONMEM/nm74gf/run/nmfe74 102.ctl  102.lst \\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[102] Writing script to file\"
time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"[102] Beginning local work phase\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Output directory is currently set to /tmp/bbisethtest/working/bbi_expansion/model/102\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Script location is pegged at /tmp/bbisethtest/working/bbi_expansion/model/102/102.sh\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[102] Generated command was: /tmp/bbisethtest/working/bbi_expansion/model/102/102.sh\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[103] Beginning local preparation phase\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[103] Overwrite is currently set to true\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[103] Writing initial gitignore file\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Requested save of config file /tmp/bbisethtest/working/bbi_expansion/model/103\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[103] Creating local execution script\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[103] beginning script command generation. NMQual is set to false\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Generated command template for local execution is: #!/bin/bash\\#$ -wd /tmp/bbisethtest/working/bbi_expansion/model/103\\/opt/NONMEM/nm74gf/run/nmfe74 103.ctl  103.lst \\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[103] Writing script to file\"
time=\"2021-05-26T11:18:19-04:00\" level=info msg=\"[103] Beginning local work phase\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Output directory is currently set to /tmp/bbisethtest/working/bbi_expansion/model/103\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"Script location is pegged at /tmp/bbisethtest/working/bbi_expansion/model/103/103.sh\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"[103] Generated command was: /tmp/bbisethtest/working/bbi_expansion/model/103/103.sh\"
time=\"2021-05-26T11:18:19-04:00\" level=debug msg=\"exit status 106\"
time=\"2021-05-26T11:18:19-04:00\" level=error msg=\"[102] Exit code was 106, details were exit status 106\"
time=\"2021-05-26T11:18:19-04:00\" level=error msg=\"[102] output details were: File 102.ctl does not exist.\\"
time=\"2021-05-26T11:18:22-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 1, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:25-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 1, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:28-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 1, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:31-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 1, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:34-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 1, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:37-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 1, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:37-04:00\" level=info msg=\"[103] Beginning cleanup phase\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[103] Beginning hash calculation operations for data file\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[103] Beginning selection of cleanable / copiable files\"
time=\"2021-05-26T11:18:37-04:00\" level=info msg=\"[103] Attempting to locate copiable files. Provided path is /tmp/bbisethtest/working/bbi_expansion/model/103\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[103] Beginning selection of copiable files \"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[103] Beginning local cleanup operations\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[103] Writing finalized gitignore file\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[103] Writing out configuration as json into /tmp/bbisethtest/working/bbi_expansion/model/103\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"Post Work Executable set as \"
time=\"2021-05-26T11:18:37-04:00\" level=info msg=\"[103] Cleanup completed\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[104] Beginning local preparation phase\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[104] Overwrite is currently set to true\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[104] Writing initial gitignore file\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"Requested save of config file /tmp/bbisethtest/working/bbi_expansion/model/104\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[104] Creating local execution script\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[104] beginning script command generation. NMQual is set to false\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"Generated command template for local execution is: #!/bin/bash\\#$ -wd /tmp/bbisethtest/working/bbi_expansion/model/104\\/opt/NONMEM/nm74gf/run/nmfe74 104.ctl  104.lst \\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[104] Writing script to file\"
time=\"2021-05-26T11:18:37-04:00\" level=info msg=\"[104] Beginning local work phase\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"Output directory is currently set to /tmp/bbisethtest/working/bbi_expansion/model/104\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"Script location is pegged at /tmp/bbisethtest/working/bbi_expansion/model/104/104.sh\"
time=\"2021-05-26T11:18:37-04:00\" level=debug msg=\"[104] Generated command was: /tmp/bbisethtest/working/bbi_expansion/model/104/104.sh\"
time=\"2021-05-26T11:18:38-04:00\" level=info msg=\"[101] Beginning cleanup phase\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[101] Beginning hash calculation operations for data file\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[101] Beginning selection of cleanable / copiable files\"
time=\"2021-05-26T11:18:38-04:00\" level=info msg=\"[101] Attempting to locate copiable files. Provided path is /tmp/bbisethtest/working/bbi_expansion/model/101\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[101] Beginning selection of copiable files \"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[101] Beginning local cleanup operations\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[101] Writing finalized gitignore file\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[101] Writing out configuration as json into /tmp/bbisethtest/working/bbi_expansion/model/101\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"Post Work Executable set as \"
time=\"2021-05-26T11:18:38-04:00\" level=info msg=\"[101] Cleanup completed\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[105] Beginning local preparation phase\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[105] Overwrite is currently set to true\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[105] Writing initial gitignore file\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"Requested save of config file /tmp/bbisethtest/working/bbi_expansion/model/105\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[105] Creating local execution script\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[105] beginning script command generation. NMQual is set to false\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"Generated command template for local execution is: #!/bin/bash\\#$ -wd /tmp/bbisethtest/working/bbi_expansion/model/105\\/opt/NONMEM/nm74gf/run/nmfe74 105.ctl  105.lst \\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[105] Writing script to file\"
time=\"2021-05-26T11:18:38-04:00\" level=info msg=\"[105] Beginning local work phase\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"Output directory is currently set to /tmp/bbisethtest/working/bbi_expansion/model/105\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"Script location is pegged at /tmp/bbisethtest/working/bbi_expansion/model/105/105.sh\"
time=\"2021-05-26T11:18:38-04:00\" level=debug msg=\"[105] Generated command was: /tmp/bbisethtest/working/bbi_expansion/model/105/105.sh\"
time=\"2021-05-26T11:18:40-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 3, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:43-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 3, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:46-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 3, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:49-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 3, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:52-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 3, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"Manager Details: Working: 2, Errors: 1, Completed: 3, Concurrency: 2, Iterations: 5\"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"[104] Beginning cleanup phase\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[104] Beginning hash calculation operations for data file\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[104] Beginning selection of cleanable / copiable files\"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"[104] Attempting to locate copiable files. Provided path is /tmp/bbisethtest/working/bbi_expansion/model/104\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[104] Beginning selection of copiable files \"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[104] Beginning local cleanup operations\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[104] Writing finalized gitignore file\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[104] Writing out configuration as json into /tmp/bbisethtest/working/bbi_expansion/model/104\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"Post Work Executable set as \"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"[104] Cleanup completed\"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"[105] Beginning cleanup phase\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[105] Beginning hash calculation operations for data file\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[105] Beginning selection of cleanable / copiable files\"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"[105] Attempting to locate copiable files. Provided path is /tmp/bbisethtest/working/bbi_expansion/model/105\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[105] Beginning selection of copiable files \"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[105] Beginning local cleanup operations\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[105] Writing finalized gitignore file\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"[105] Writing out configuration as json into /tmp/bbisethtest/working/bbi_expansion/model/105\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"Post Work Executable set as \"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"[105] Cleanup completed\"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"Waiting for any post execution hooks to finish\"
time=\"2021-05-26T11:18:55-04:00\" level=debug msg=\"Work has completed. Beginning detail display via console\"
time=\"2021-05-26T11:18:55-04:00\" level=error msg=\"1 errors were experienced during the run\"
time=\"2021-05-26T11:18:55-04:00\" level=error msg=\"Errors were experienced while running model 102.ctl. Details are Running the programmatic shell script caused an error\"
time=\"2021-05-26T11:18:55-04:00\" level=info msg=\"\\r5 models completed in 36.674844163s\"

"
    bbi_expansion_test.go:40: 
            Error Trace:    bbi_expansion_test.go:40
            Error:          Expected nil, but got: &exec.ExitError{ProcessState:(*os.ProcessState)(0xc0000b9520), Stderr:[]uint8(nil)}
            Test:           TestBBIExpandsWithoutPrefix
--- FAIL: TestBBIExpandsWithoutPrefix (36.71s)
=== RUN   TestBBIExpandsWithPrefix
time="2021-05-26T11:18:55-04:00" level=info msg="Beginning work with /tmp/bbisethtest/working as the root"
--- PASS: TestBBIExpandsWithPrefix (27.46s)
=== RUN   TestBBIExpandsWithPrefixToPartialMatch
time="2021-05-26T11:19:23-04:00" level=info msg="Beginning work with /tmp/bbisethtest/working as the root"
--- PASS: TestBBIExpandsWithPrefixToPartialMatch (18.45s)
FAIL
exit status 1
FAIL    bbitest 82.626s