hillerlab / REforge

Regulatory Element Forward Genomics to associate transcription factor binding site divergence in regulatory elements with phenotypic differences between species
MIT License

Stubb issues? #8

Open jacobscott7071 opened 3 weeks ago

jacobscott7071 commented 3 weeks ago

Hello --

I've been trying to run REforge on your example data, but I keep running into the same error: WARNING:root:Error in score_sequence_with_stubb: Command 'stubb_noPseudoCount TACCCACACGTACTTTGAAACAATTCGGACGTCACTTTTAATCGAGGGACCGCGAGGAAGAGGTGCATAGTTAAATTCCTCAAACAAAGCAACTTCCGAACTGGGGTAAAAGGCGACAACCTAAAAACGGGATATCCCTCAAAGGTGAGGGCCTCGAGAATTTGGATCGATTTATAATGACTCCTAAAGTGCGGACTGTACTACGATCCGATTAATTGGCGACTCGTGTCACTATAGTTACAACCGGGGGACGCGATTATAGTAAGCGTTATATATGAGACTCTTGTTTAGGGGACTATTGCCACATGCACCGGTTGGGATATGCTAACATTCGATGGCCCTGACCACGGTTTTTATTACATTTCCGTGCGTTACGAATGGGACCGAGATGAGTAATTCTTGCTAGTCAATATATAATAAATTTCTAGTTCAACTTTTCTTTAATGAGTAGCTTTCGGGCACCCTAGCTAAGGATATTTACAATAGAGCCGCTAAAGGTTCTAAGTCCTGTGTTTTTCAAACGTCATATGCTCTATGTGGGGCGTTGCATTGCTACTACCTGATTGACATAGTGATCTGCTGTAGTTCTGTGGTTAAATCTTAATTCGCAAATCTTGATTTATTCGTAGAGCAGAAGACCCTAGGCCTGAAGCATCATGTCTCTTCAAGAATCATACTATACGTAGTACAATTTTCCTTATGGGTTTATCAGTCAACAAATCGCTATTTTAAATAGAGATTCATCTTCTACAGATATCACCTGTTATGGTGAAGCGGTCTACTGGCATCTACGAA motifs 500.0 1 -brief -seq' died with <Signals.SIGKILL: 9>.

We were able to narrow the problem down to something going wrong with Stubb. The Stubb link on the REforge page no longer works, so we installed it from here: https://github.com/UIUCSinhaLab/Stubb/

Do you happen to know what could be causing this? Alternatively, would you be able to share your version of Stubb with us so that we could test if the software version is part of the problem?

MichaelHiller commented 3 weeks ago

Hi,

thanks for letting us know that the Stubb link expired. It would be important to apply the Stubb.patch, as Bjoern changed the code to make it work for REforge.

@bjorn.langer@crg.eu Where did you download the original Stubb code? We should update the link. And can you reproduce this issue with this sequence?

Thx

jacobscott7071 commented 3 weeks ago

Thanks for the quick reply. We get this error after applying your Stubb patch. Your installation instructions were very clear, which is partly why we think the issue may be related to the Stubb version. I should also clarify that we get this same error for every single sequence in the data set.

MichaelHiller commented 3 weeks ago

Thx, so a systematic error.

@bjorn.langer@crg.eu Could you pls send the Stubb src? And maybe we upload this as a tar.gz since there are problems fetching the original Stubb code.

bjlang commented 3 weeks ago

Hi, thanks for the notification about the Stubb download link. However, AFAICT the Stubb code is still identical to the one we based our patch on, so unfortunately I cannot reproduce the error you're encountering. Calling REforge_branch_scoring.py with -d will print all individual stubb calls. Maybe running one of these commands manually helps pinpoint the problem.

Did you also test the example data with the motif file provided in the example folder? (I'm asking because I noticed that the motif file in your error message is missing the .wtmx extension.)

jacobscott7071 commented 3 weeks ago

Yes, I tested the example data with the motif file provided. The error above was from an attempt with another motif file, but both tries gave the same problem. Per your advice, I tried running the example with -d. I'm not sure that I was able to learn anything new from it, but perhaps your more experienced eyes would be able to interpret something useful?

INFO:root:branch scoring of element_sequences/el137_10000.fa with data/motifs.wtmx
DEBUG:root:tree_doctor data/tree_simulation.nwk -antN -P bosTau7,canFam3,cavPor3,choHof1,choHof1_loss,equCab2,equCab2_loss,eriEur1,felCat5,felCat5-canFam3,felCat5-equCab2,felCat5-eriEur1,hg19,hg19-micMur1,hg19-panTro4,hg19-rheMac3,hg19-tupBel1,loxAfr3,loxAfr3-choHof1,loxAfr3-triMan1,micMur1,mm10,mm10-cavPor3,mm10-hg19,mm10-loxAfr3,mm10-oryCun2,mm10-rn5,mm10-susScr3,oryCun2,oryCun2_loss,panTro4,papHam1,rheMac3,rheMac3-papHam1,rn5,susScr3,susScr3-felCat5,susScr3-turTru2,triMan1,tupBel1,turTru2,turTru2-bosTau7
INFO:root:Branch scoring: use background/ as background and 200 as window size
INFO:root:checking mm10-loxAfr3 with Stubb, scrCrrMthd: stubb
DEBUG:root:background /blue/cohn/ja.scott/REforge/REforge/example/background//gc0.49/gc0.49.fa will be used
DEBUG:root:Iterative score correction
DEBUG:root:stubb_noPseudoCount AGTGGTAGTAGGCTGTCAAGGCAATATGAGGACATTCGGGACCGTTGTCGGACTCCCGCCTGTACGGAGCTCGACTGGAGTTAGGGAGCATCAAGATATAAATTCAGGATCACTCTGAACCTATGAATGGAGCTTAGAAGCGGCACATGATCGAGTCTTCGTAAATCTAGGCCAGCGTTAAAGTTCAGTGCGTCTCACTGG data/motifs.wtmx 200.0 1 -b /blue/cohn/ja.scott/REforge/REforge/example/background//gc0.49/gc0.49.fa -brief -seq
WARNING:root:Error in score_sequence_with_stubb: Command 'stubb_noPseudoCount AGTGGTAGTAGGCTGTCAAGGCAATATGAGGACATTCGGGACCGTTGTCGGACTCCCGCCTGTACGGAGCTCGACTGGAGTTAGGGAGCATCAAGATATAAATTCAGGATCACTCTGAACCTATGAATGGAGCTTAGAAGCGGCACATGATCGAGTCTTCGTAAATCTAGGCCAGCGTTAAAGTTCAGTGCGTCTCACTGG data/motifs.wtmx 200.0 1 -b /blue/cohn/ja.scott/REforge/REforge/example/background//gc0.49/gc0.49.fa -brief -seq' died with <Signals.SIGKILL: 9>.
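For anyone who wants to rerun the individual stubb calls from such a -d log, here is a minimal Python sketch that pulls them out. It assumes the log has one record per line and that each call is logged as `DEBUG:root:stubb_noPseudoCount ...`, as in the output above. The function name `extract_stubb_commands` and the shortened sample log (with `ACGT` as a placeholder sequence) are made up for illustration and are not part of REforge.

```python
import shlex

LOG_PREFIX = "DEBUG:root:"


def extract_stubb_commands(log_text, tool="stubb_noPseudoCount"):
    """Return the argument list of every stubb call found in a debug log.

    Assumes one log record per line, formatted as 'DEBUG:root:<command>'.
    """
    wanted = LOG_PREFIX + tool + " "
    commands = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(wanted):
            # Drop the logging prefix and split the rest like a shell would.
            commands.append(shlex.split(line[len(LOG_PREFIX):]))
    return commands


# Shortened, hypothetical log excerpt; the real sequence is much longer.
sample_log = """INFO:root:checking mm10-loxAfr3 with Stubb, scrCrrMthd: stubb
DEBUG:root:Iterative score correction
DEBUG:root:stubb_noPseudoCount ACGT data/motifs.wtmx 200.0 1 -b background/gc0.49/gc0.49.fa -brief -seq
"""

commands = extract_stubb_commands(sample_log)
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each returned list can be passed to `subprocess.run` or pasted into a shell to reproduce a single call outside of REforge.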

bjlang commented 2 weeks ago

You're right, this output does not really help. If you run the failing stubb command directly, you may get some more hints on why your system is killing the process:

stubb_noPseudoCount AGTGGTAGTAGGCTGTCAAGGCAATATGAGGACATTCGGGACCGTTGTCGGACTCCCGCCTGTACGGAGCTCGACTGGAGTTAGGGAGCATCAAGATATAAATTCAGGATCACTCTGAACCTATGAATGGAGCTTAGAAGCGGCACATGATCGAGTCTTCGTAAATCTAGGCCAGCGTTAAAGTTCAGTGCGTCTCACTGG data/motifs.wtmx 200.0 1 -b /blue/cohn/ja.scott/REforge/REforge/example/background//gc0.49/gc0.49.fa -brief -seq
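As a rough illustration of running a command directly and seeing how it ended, the Python sketch below wraps `subprocess.run` and decodes the return code. On POSIX systems a negative return code -N means the child was terminated by signal N, which is where the `<Signals.SIGKILL: 9>` in the warning comes from. The helper name `run_and_report` is hypothetical, and a small Python child that kills itself stands in for the real stubb call.

```python
import signal
import subprocess
import sys


def run_and_report(argv):
    """Run argv and describe whether it exited normally or was killed by a signal."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode < 0:
        # POSIX convention: return code -N means "terminated by signal N".
        name = signal.Signals(-result.returncode).name
        return f"killed by {name}"
    return f"exited with status {result.returncode}"


# Stand-in child process that sends SIGKILL to itself (not the real stubb binary).
self_killing_child = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
]

print(run_and_report(self_killing_child))
```

A SIGKILL with no error text from the program itself usually means something outside the process ended it, so the system or job scheduler logs are the next place to look.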

cribe78 commented 2 weeks ago

Hi, I am a research facilitator at UF who has been helping Jacob with the installation of this software. I ran a newly built stubb_noPseudoCount (with the patches) and observed it use all 900 GB of memory available to it before being killed. There was no output from the command other than "Killed" at the end.

The command I ran was ./stubb_noPseudoCount AGTGGTAGTAGGCTGTCAAGGCAATATGAGGACATTCGGGACCGTTGTCGGACTCCCGCCTGTACGGAGCTCGACTGGAGTTAGGGAGCATCAAGATATAAATTCAGGATCACTCTGAACCTATGAATGGAGCTTAGAAGCGGCACATGATCGAGTCTTCGTAAATCTAGGCCAGCGTTAAAGTTCAGTGCGTCTCACTGG motifs.wtmx 200.0 1 -b gc0.49.fa -brief -seq
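When debugging a runaway allocation like this, it can help to cap the memory of the child process so it fails within seconds instead of filling the whole node. Below is a minimal Python sketch, assuming Linux, where `RLIMIT_AS` is enforced. The helper `run_with_memory_cap` is hypothetical, and a small Python process that tries to allocate 2 GiB stands in for stubb_noPseudoCount. The 1 GiB cap is an arbitrary illustrative value.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(argv, max_bytes, timeout=60):
    """Run argv with its virtual address space limited to max_bytes (Linux)."""

    def apply_limit():
        # Runs in the child just before exec; only the child is limited.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(
        argv,
        preexec_fn=apply_limit,
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Stand-in for the real command: a child that tries to allocate 2 GiB.
memory_hog = [sys.executable, "-c", "x = bytearray(2 * 1024**3)"]

result = run_with_memory_cap(memory_hog, max_bytes=1024**3)  # cap at 1 GiB
print("return code:", result.returncode)
print(result.stderr.strip().splitlines()[-1] if result.stderr.strip() else "(no stderr)")
```

With a cap in place, a failing allocation is reported to the program itself rather than ending in a silent kill, which may make it easier to see where the memory is being requested.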