Weeks-UNC / shapemapper2

Public repository for ShapeMapper 2 releases

OS/pipeline troubles #20

Open benoahb opened 3 years ago

benoahb commented 3 years ago

Hi there,

I would like to report that I've been trying to set up the pipeline on different Linux distributions (Ubuntu 20.04, 18.04, and 16.04, and Debian 10), and only the last one, Debian, worked without failures when running ./run_example.sh.

See below for details:


    [==========] Sequence variant correction
    [----------]
    [ RUN      ] Sequence variant 0: unchanged
    [       OK ] Sequence variant 0: unchanged
    [ RUN      ] Sequence variant 1: ambig del G near 5-prime end
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 1: ambig del G near 5-prime end
    [ RUN      ] Sequence variant 2: ambig del G near 5-prime at different location
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 2: ambig del G near 5-prime at different location
    [ RUN      ] Sequence variant 3: ambig del G at high-background position
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 3: ambig del G at high-background position
    [ RUN      ] Sequence variant 4: unambig del C near 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 4: unambig del C near 5-prime
    [ RUN      ] Sequence variant 5: ambig del T near 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 5: ambig del T near 5-prime
    [ RUN      ] Sequence variant 6: mismatch G->T near 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGUGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 6: mismatch G->T near 5-prime
    [ RUN      ] Sequence variant 7: double mismatch GC->AT near 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGAUCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 7: double mismatch GC->AT near 5-prime
    [ RUN      ] Sequence variant 8: insert CCA towards 5-prime end
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGGCCAUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 8: insert CCA towards 5-prime end
    [ RUN      ] Sequence variant 9: GT->C toward 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGCGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 9: GT->C toward 5-prime
    [ RUN      ] Sequence variant 10: GT->C toward 5-prime, del T nearby, GT->CA toward 3-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGCGCCCUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGCAUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 10: GT->C toward 5-prime, del T nearby, GT->CA toward 3-prime
    [ RUN      ] Sequence variant 11: mismatch G->T next to del A
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGUCUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 11: mismatch G->T next to del A
    [ RUN      ] Sequence variant 12: mismatch T->C next to insert A
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGGCAGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 12: mismatch T->C next to insert A
    [----------]
    [==========]
    [  PASSED  ] 1 tests for sequence variant correction.
    [  FAILED  ] 12 tests.
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Running area under ROC curve tests . . .

    [==========] ROC tests
    [----------]
    [ RUN      ] ROC test 0: 2-sample rRNAs, bowtie2
    Run time: 58s
    small subunit AUC: 0.732
    large subunit AUC: 0.701
    [  FAILED  ] ROC test 0: 2-sample rRNAs, bowtie2
    [ RUN      ] ROC test 1: 2-sample rRNAs, STAR
    Run time: 45s
    small subunit AUC: 0.732
    large subunit AUC: 0.699
    [  FAILED  ] ROC test 1: 2-sample rRNAs, STAR
    [----------]
    [==========]
    [  PASSED  ] 0 tests for end-to-end pipeline success.
    [  FAILED  ] 2 tests.
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    0 / 98  c++ unit test(s) failed.
    0 / 29  end-to-end success test(s) failed.
    0 / 63  module failure detection test(s) failed.
    12 / 13 sequence variant correction test(s) failed.
    2 / 2   area under ROC curve test(s) failed.
    ----------
    14 / 205    total test(s) failed.
    FAILURE

Running sequence variant correction tests . . .

    [==========] Sequence variant correction
    [----------]
    [ RUN      ] Sequence variant 0: unchanged
    [       OK ] Sequence variant 0: unchanged
    [ RUN      ] Sequence variant 1: ambig del G near 5-prime end
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 1: ambig del G near 5-prime end
    [ RUN      ] Sequence variant 2: ambig del G near 5-prime at different location
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 2: ambig del G near 5-prime at different location
    [ RUN      ] Sequence variant 3: ambig del G at high-background position
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 3: ambig del G at high-background position
    [ RUN      ] Sequence variant 4: unambig del C near 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 4: unambig del C near 5-prime
    [ RUN      ] Sequence variant 5: ambig del T near 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 5: ambig del T near 5-prime
    [ RUN      ] Sequence variant 6: mismatch G->T near 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGUGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 6: mismatch G->T near 5-prime
    [ RUN      ] Sequence variant 7: double mismatch GC->AT near 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGAUCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 7: double mismatch GC->AT near 5-prime
    [ RUN      ] Sequence variant 8: insert CCA towards 5-prime end
    [       OK ] Sequence variant 8: insert CCA towards 5-prime end
    [ RUN      ] Sequence variant 9: GT->C toward 5-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGCGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 9: GT->C toward 5-prime
    [ RUN      ] Sequence variant 10: GT->C toward 5-prime, del T nearby, GT->CA toward 3-prime
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGCGCCCUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGCAUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 10: GT->C toward 5-prime, del T nearby, GT->CA toward 3-prime
    [ RUN      ] Sequence variant 11: mismatch G->T next to del A
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGUCUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 11: mismatch G->T next to del A
    [ RUN      ] Sequence variant 12: mismatch T->C next to insert A
    ERROR: Corrected sequence does not match original sequence. GGCCUUCGGGCCAAGGACUCGGGGUGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC GGCCUUCGGGCCAAGGACUCGGGGCAGCCCUUCUGCGUGAAGGCUGAGAAAUACCCGUAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUUCUCGAUCCGGUUCGCCGGAUCCAAAUCGGGCUUCGGUCCGGUUC
    [  FAILED  ] Sequence variant 12: mismatch T->C next to insert A
    [----------]
    [==========]
    [  PASSED  ] 2 tests for sequence variant correction.
    [  FAILED  ] 11 tests.
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Running area under ROC curve tests . . .

    [==========] ROC tests
    [----------]
    [ RUN      ] ROC test 0: 2-sample rRNAs, bowtie2
    Run time: 127s
    small subunit AUC: 0.732
    large subunit AUC: 0.705
    [  FAILED  ] ROC test 0: 2-sample rRNAs, bowtie2
    [ RUN      ] ROC test 1: 2-sample rRNAs, STAR
    Run time: 82s
    small subunit AUC: 0.732
    large subunit AUC: 0.699
    [  FAILED  ] ROC test 1: 2-sample rRNAs, STAR
    [----------]
    [==========]
    [  PASSED  ] 0 tests for end-to-end pipeline success.
    [  FAILED  ] 2 tests.
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    0 / 98  c++ unit test(s) failed.
    0 / 29  end-to-end success test(s) failed.
    0 / 63  module failure detection test(s) failed.
    11 / 13 sequence variant correction test(s) failed.
    2 / 2   area under ROC curve test(s) failed.
    ----------
    13 / 205    total test(s) failed.
    FAILURE

    All tests passed
    SUCCESS

shapemapper commented 3 years ago

Interesting. I know I didn't set up conda environment activation quite correctly, so I can believe that on some platforms system libraries could creep in and create conflicts, especially with the test scripts. I also seem to recall possible issues locating the shapemapper base directory when running inside a Docker container, but I thought I had resolved that. I don't have the bandwidth to explore this much myself, but if you are able, I would be interested to see the output of the following commands. Add these lines to internals/test/variant_correction_tests.sh after line 27, then run just that script in one of the Ubuntu containers.

    echo "BASE_DIR: $BASE_DIR"
    echo "PYTHONPATH: $PYTHONPATH"
    echo 'which python3: '
    which python3
    echo 'which shapemapper: '
    which shapemapper
    echo 'which STAR: '
    which STAR
    exit

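
A compact variant of the same diagnostic, as a rough sketch only. It assumes the bundled tools are expected to resolve somewhere under BASE_DIR, which this thread does not confirm, and `check_tool` is a hypothetical helper name, not something that exists in the ShapeMapper scripts.

```shell
# Hypothetical helper (not part of ShapeMapper): report where a tool resolves
# on PATH and whether that location lies under the given prefix.
# Returns 0 if inside the prefix, 1 if the tool is not found, 2 if outside.
check_tool() {
    local tool="$1" prefix="$2" path
    # command -v prints the resolved path and fails if the tool is missing
    path="$(command -v "$tool" 2>/dev/null)" || { echo "$tool: not found"; return 1; }
    case "$path" in
        "$prefix"/*) echo "$tool: $path (inside $prefix)" ;;
        *)           echo "$tool: $path (OUTSIDE $prefix)"; return 2 ;;
    esac
}
```

Calling `check_tool python3 "$BASE_DIR"`, and likewise for `shapemapper` and `STAR`, would flag any tool picked up from the system rather than from the shapemapper directory.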
shapemapper commented 3 years ago

Ah, okay, now I'm seeing my note in line 26 of shapemapper, where I hacked around the problem of locating the base directory inside a Docker container by passing it in as an environment variable at build/test time. That's probably at least part of the issue you're encountering. If you need to get it working in an Ubuntu Docker container, the quickest hack is probably to hardcode a path in all of the shell scripts that try to locate THIS_DIR and/or BASE_DIR, or to add a line that passes it in as an environment variable.
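
One way the environment-variable approach could look, as a minimal sketch rather than the actual ShapeMapper code. `resolve_base_dir` is a hypothetical helper, and the two-levels-up fallback is an assumption based only on the script living at internals/test/ relative to the base directory.

```shell
# Hypothetical helper (not part of ShapeMapper): print the base directory.
# Prefers a BASE_DIR already set in the environment (for example, passed in
# when starting the container); otherwise falls back to two levels above
# the directory containing the given script (assumed layout: internals/test/).
resolve_base_dir() {
    local script_path="$1" this_dir
    if [ -n "${BASE_DIR:-}" ]; then
        printf '%s\n' "$BASE_DIR"
        return 0
    fi
    this_dir="$(cd "$(dirname -- "$script_path")" && pwd)" || return 1
    (cd "$this_dir/../.." && pwd)
}

# Example use near the top of a test script
BASE_DIR="$(resolve_base_dir "$0")"
export BASE_DIR
echo "BASE_DIR: $BASE_DIR"
```

With this pattern, setting BASE_DIR when launching the container (for example with Docker's `-e` option) overrides the location guess without editing each script by hand.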