Open vinitamehlawat opened 3 years ago
Hi Vinita,
could you take a look in work_dir/2021-11-17_05/smsao/runs/epa_runs/0/ and see if there's any error message in the file eval.raxml.log?
Pierre
Hi @Pbdas
I am attaching this eval.raxml.log for your reference, please have a look
Thank you very much
Vinita eval.raxml.log
@Pbdas
Maybe we could just add --force
in both raxml launcher functions? (here https://github.com/BenoitMorel/covid19_cme_analysis/blob/master/scripts/raxml_launcher.py)
This thread check is not that important in this context anyway
Yes I agree, only data with ~30k sites will make it through the filters anyway. I just pushed the change to the master branch, so @vinitamehlawat you should be able to update by running git pull in the folder of the repository. Let us know if this works.
As for the rootdigger stage, I'm not sure. @computations any idea?
An even better solution would be upgrading to raxml-ng v1.0.x
and using --threads auto
or --threads auto{4}
;)
Hi @Pbdas
Here I am working with my subset data, which consists of 2059 sequences.
I updated this repo with git pull and again ran ./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-19_00/fmsan/
This time I encountered the following error:
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 86, in <module>
hist_csv_file = placement.gappa_examine_lwr( os.path.join( epa_out_dir, "*/*.jplace" ), result_dir )
File "/home/vinita/covid19_cme_analysis/scripts/placement.py", line 60, in gappa_examine_lwr
sub.check_call(cmd)
File "/home/vinita/miniconda3/lib/python3.9/subprocess.py", line 373, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/vinita/covid19_cme_analysis/software/gappa/bin/gappa', 'examine', 'lwr', '--jplace-path', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/34/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/55/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/66/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/18/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/42/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/70/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/2/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/26/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/14/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/53/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/52/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/62/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/8/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/57/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/1/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/56/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/21/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/6/epa_result.jplace', 
'/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/5/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/41/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/0/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/13/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/67/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/59/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/10/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/16/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/50/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/64/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/29/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/7/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/43/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/71/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/20/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/37/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/38/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/47/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/51/epa_result.jplace', 
'/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/3/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/12/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/44/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/23/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/68/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/24/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/31/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/35/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/4/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/49/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/61/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/17/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/39/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/45/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/36/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/33/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/9/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/40/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/11/epa_result.jplace', 
'/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/69/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/48/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/25/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/22/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/28/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/19/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/54/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/60/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/27/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/30/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/46/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/58/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/32/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/15/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/65/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/63/epa_result.jplace', '--no-list-file', '--no-compat-check', '--allow-file-overwriting', '--histogram-bins', '20', '--out-dir', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/results/epa_rooting']' returned non-zero exit status 109.
I also checked work_dir/2021-11-19_00/fmsan/runs/epa_runs/0/, but this time there is no error in the eval.raxml.log file there, and I have around 71 folders for my subset data.
I am attaching the eval.raxml.log file for your further reference. Please have a look at it and let me know how I should proceed.
eval.raxml.log
Thank you very much! Vinita
Hi Vinita,
the log file from RAxML-NG looks good now! The error seems to be with gappa this time. Same procedure as last time: git pull, and re-run stage 5. Then, under results/epa_rooting/ there should be a file called gappa_examine_lwr.log that should tell us what's going wrong. (Again, apologies for the bad error messages.)
Pierre
Hi @Pbdas
After git pull I ran the 5th stage again, but unfortunately I don't have gappa_examine_lwr.log in results/epa_rooting/. I only have one .txt file there, outgroup_check.txt.
Hi @vinitamehlawat , just letting you know I think I figured out the current issue, and I'm working on a fix
Ok, so I'm 99% sure the issue was that the call to gappa examine lwr was simply too long for the command line to handle (more than 5k characters) due to the number of trees, and the paths being full, non-relative paths. I've made the paths relative to a working directory now, so that should be sufficient to handle it. Please pull and give it a try!
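For anyone reading along, here is a minimal sketch of the idea, not the actual code in scripts/placement.py: collect the .jplace files, express them relative to a working directory, and let gappa run with that directory as its cwd. The helper names relative_jplace_args and build_gappa_cmd are made up for illustration; the gappa flags are copied from the traceback earlier in this thread.

```python
import glob
import os


def relative_jplace_args(epa_out_dir, workdir):
    """Return the sorted .jplace paths under epa_out_dir, relative to workdir."""
    pattern = os.path.join(epa_out_dir, "*", "*.jplace")
    return sorted(os.path.relpath(p, start=workdir) for p in glob.glob(pattern))


def build_gappa_cmd(gappa_bin, jplace_args, out_dir):
    """Assemble the 'gappa examine lwr' call (flags taken from the traceback)."""
    return [
        gappa_bin, "examine", "lwr",
        "--jplace-path", *jplace_args,
        "--no-list-file",
        "--no-compat-check",
        "--allow-file-overwriting",
        "--histogram-bins", "20",
        "--out-dir", out_dir,
    ]
```

The command would then be launched with something like subprocess.check_call(cmd, cwd=workdir), so that the shorter relative paths resolve correctly.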
Also, it could be that the next issue will be related to the visualization using R, meaning that it may be necessary to install some packages. Note however that this visualization is not strictly necessary and you can repeat it later by simply calling the script directly, like so:
scripts/lwr_hist.r work_dir/<correct work dir>/<smsao/smsan/... etc>/results/epa_rooting/lwr_histogram.csv work_dir/<correct work dir>/<smsao/smsan/... etc>/results/epa_rooting/lwr_histogram.pdf
(fill in the correct paths)
the necessary R packages are:
ggplot2
readr
tidyr
dplyr
stringr
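If it helps, the package list above can be checked before running stage 5. This is only a sketch under the assumption that Rscript is on the PATH; the helper names missing_packages_expr and find_missing are hypothetical and not part of the pipeline.

```python
import shutil
import subprocess

R_PACKAGES = ["ggplot2", "readr", "tidyr", "dplyr", "stringr"]


def missing_packages_expr(pkgs):
    """Build an R expression that prints the packages that cannot be loaded."""
    quoted = ", ".join(f'"{p}"' for p in pkgs)
    return (
        f"pkgs <- c({quoted}); "
        "ok <- sapply(pkgs, requireNamespace, quietly = TRUE); "
        "cat(pkgs[!ok])"
    )


def find_missing(pkgs, rscript="Rscript"):
    """Return the list of missing R packages, or None if Rscript is not found."""
    exe = shutil.which(rscript)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "-e", missing_packages_expr(pkgs)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()
```

Calling find_missing(R_PACKAGES) should return an empty list once everything is installed; any names it returns can be installed from within R using install.packages.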
Hi @Pbdas
After git pull I re-ran the 5th step and encountered the following error message:
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 86, in
This time in fmsan/results/epa_rooting I have the gappa_examine_lwr.log file, which I am attaching for you to look at. Please have a look.
Thanks Vinita gappa_examine_lwr.log
Hi Vinita, before I keep making you try fixes, could you tell us what kind of operating system you're using?
Also please run this command in your terminal and paste the result here:
/bin/sh --version
Pierre
Hi @Pbdas I am pasting my terminal output, Please have a look
$SHELL --version
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software; you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
/bin/sh =--version
/bin/sh: 0: Illegal option --
Hi @vinitamehlawat,
sorry it took a while, but I could finally reproduce the issue on my end!
Here is what to do:
In the terminal, go to software/gappa, then run these commands:
make clean
git checkout 7398c1cdf5162fe195c9c9fafe999f15e7d5012b
git submodule update --init --recursive
make -j
Now you can try stage 5 again.
Hi @Pbdas
Thank you, I made the changes as per your suggestions. This time it shows an error for the R packages, like this:
Error in library(ggplot2) : there is no package called ‘ggplot2’
Execution halted
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 87, in
If this error only concerns R, then I will run the script as you mentioned earlier in this thread: scripts/lwr_hist.r work_dir/<correct work dir>/<smsao/smsan/... etc>/results/epa_rooting/lwr_histogram.csv work_dir/<correct work dir>/<smsao/smsan/... etc>/results/epa_rooting/lwr_histogram.pdf
But could you please look at the error in the 6th step? I am still getting the same error there:
./pipeline/6_root_digger_rooting.py work_dir/2021-11-19_00/fmsan/
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 84, in
Thank you very much for your time and effort
Vinita
Dear Vinita,
It looks like I introduced a bug that we haven't detected. It should be fixed now. Please try to run 'git pull' and to start the analysis again. Let us know if that fixes the issue
Best, Benoit
I am a little confused: after git pull, should I run the 5th step again or the whole analysis from step 1?
Sorry, step 5 should be enough.
I replied too fast. My fix only fixes step 6. I don't think it depends on step 5. So you should rerun step 6 :-)
Hi @BenoitMorel
After git pull I re-ran the 6th script and the following error popped up on my terminal. Please have a look:
./pipeline/6_root_digger_rooting.py work_dir/2021-11-19_00/fmsan/
running 8 iterations
['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--treefile', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.tree', '--early-stop']
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 64, in
Hi @Pbdas
After installing all R packages I re-ran the 5th script and again got an error. This time I am not sure whether it is a bug in your script or just an error in the code. Please have a look (this time I pasted the whole terminal output after running the 5th script):
./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-19_00/fmsan/
hmmbuild :: profile HMM construction from multiple sequence alignments
HMMER 3.3.2 (Nov 2020); http://hmmer.org/
Copyright (C) 2020 Howard Hughes Medical Institute.
Freely distributed under the BSD open source license.
input alignment file: /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta
output HMM file: reference.hmm
number of worker threads: 48
idx name nseq alen mlen W eff_nseq re/pos description
1 covid_edited 1807 27987 27920 29776 1.70 0.619
CPU time: 15.89u 0.32s 00:00:16.21 Elapsed: 00:00:16.20
INFO Splitting files based on reference: /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta
WARN The query alignment file '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/hmmer_runs/both.afa' appears to have an alignment width that differs from the reference (29966 vs. 27987). This is likely due to the alignment tool stripping gap-only columns, or adding columns to the reference. Please consider using the produced 'reference.fasta' during placement!
0 / 71
No protocol specified
No protocol specified
1 / 71
No protocol specified
No protocol specified
. . . . . .
71 / 71
No protocol specified
No protocol specified
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Warning message:
funs() was deprecated in dplyr 0.8.0.
Please use a list of either functions or lambdas:
Simple named list: list(mean = mean, median = median)
Auto named with tibble::lst(): tibble::lst(mean, median)
Using lambdas: list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
This warning is displayed once every 8 hours.
Call lifecycle::last_lifecycle_warnings() to see where this warning was generated.
Saving 7 x 7 in image
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 95, in
Thank you Vinita
Hi Vinita,
the part that fails now is just a copy of the result files from runs/epa_runs to results/epa_rooting, so it's very optional. I wouldn't re-run the script just for that. Just know that the files that are not in epa_rooting will be in epa_runs instead. Nevertheless, I just pushed a fix such that it should work correctly next time.
As for the rootdigger error, in the rootdigger_rooting directory, there should be a file called root_digger.log, could you share that one? Then @computations will be able to help I think
Cheers, Pierre
Hi @Pbdas
I checked results/rootdigger_rooting, but there is no root_digger.log. There is only one .csv file, root_digger_lwr.csv, which is empty.
Thanks Vinita
Hi Vinita,
that makes debugging a lot harder. One thing I can think of right now is that MPI may not be installed on your machine, could it be? you can check by running mpiexec --version
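The same check can be done from Python before launching anything. This is a small sketch, assuming only that the launcher is called mpiexec as in the pipeline output above; the function name mpi_launcher_version is made up for illustration.

```python
import shutil
import subprocess


def mpi_launcher_version(launcher="mpiexec"):
    """Return the first line of '<launcher> --version', or None if not on PATH."""
    path = shutil.which(launcher)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some launchers print their version to stderr rather than stdout.
    text = (result.stdout or result.stderr).strip()
    return text.splitlines()[0] if text else ""
```

A return value of None would mean that no MPI launcher is visible in the current environment (for example because a conda environment hides it).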
Hi @Pbdas
(base) mpiexec --version
mpiexec (OpenRTE) 4.0.3
Report bugs to http://www.open-mpi.org/community/help/
it is unfortunate that there is no log file, but there are things to try regardless. If you see a file called something.ckp
, please upload that here, delete it, then rerun step 6.
Ah, I found it (or at least what I think is going on). I pushed a fix to github for rootdigger. You will need to pull the new version and rebuild the program, and then you should be able to run the script successfully.
Hi @computations
Here I am again a little confused. Could you please clarify: after git pull, should I re-run only the 6th step, or the whole analysis from ./setup.sh through each script?
Ah, sorry, I should be more clear.
In the top level directory, there should be a directory software/root_digger. cd into that, and run git pull && make -j mpi. This should update and rebuild RootDigger, which will fix the bug. From there you should be able to rerun just step 6.
Hi @computations
I did as you described above and the software updated successfully, but after re-running the 6th script I am getting the same error:
./pipeline/6_root_digger_rooting.py work_dir/2021-11-19_00/fmsan/
running 8 iterations
['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--treefile', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.tree', '--early-stop']
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 64, in
Alright, I think I managed to find the problem. There was a change in interface for rootdigger that didn't get updated in this pipeline. I have pushed a change to the pipeline, it should be good to just git pull
and run step 6 again.
Hi @computations
I did the git pull and re-ran the 6th script, but this time the result is the same:
running 8 iterations
['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--prefix', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0', '--early-stop']
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 66, in
And there are still no logs at this time in results/rootdigger_rooting
? If so, can you just run the command manually like so:
mpiexec -np 48 /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd --tree /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree --msa /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta --model GTR+FO+G4 --exhaustive --prefix /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0 --early-stop
Hi @computations
Yes, I still don't have a log file in results/rootdigger_rooting, and I ran the command manually as you described above.
Please have a look at the attached .txt file; I copied this error from the terminal after running the command.
manul_mpiexec_error.txt
Hi @Pbdas
Thank you very much! Now I am able to run the 5th script without any error and got my outputs for this script.
Again thank you for your time and efforts.
Vinita
It looks like the checkpoints might be corrupted. Remove any files in runs/root_digger_runs with the .ckp extension and see if that works.
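A minimal sketch of that cleanup, assuming the checkpoints are the *.ckp files directly inside the run directory (the function name remove_checkpoints is hypothetical). It defaults to a dry run so nothing is deleted by accident.

```python
from pathlib import Path


def remove_checkpoints(run_dir, dry_run=True):
    """List (and, unless dry_run, delete) the *.ckp files directly in run_dir."""
    found = []
    for ckp in sorted(Path(run_dir).glob("*.ckp")):
        if not dry_run:
            ckp.unlink()  # delete the possibly corrupted checkpoint
        found.append(ckp.name)
    return found
```

For the dataset in this thread the directory would be work_dir/2021-11-19_00/fmsan/runs/root_digger_runs; call the function once with the default dry_run=True to see what would be removed, then again with dry_run=False.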
Hi @computations
I removed the 0.ckp file from runs/root_digger_runs and re-ran the manual command above. Please have a look at the attached .txt file.
Hi @Pbdas
I apologise for saying that the 5th script worked, but I got outputs for my three datasets fmsan, fmsao, and smsan, but NOT for smsao
. For smsao
data, I received the following error:
(base) ./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-19_00/smsao/
0 / 71
No protocol specified
No protocol specified
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 74, in
Sorry for bothering you yet again Vinita
@vinitamehlawat what happens when you remove mpiexec -np 48
and add --threads 48
to the end?
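Applied to the argument list that the pipeline prints, that change could look like the sketch below. The function name mpi_to_threads is made up for illustration; only the mpiexec -np prefix and the --threads option come from this thread.

```python
def mpi_to_threads(cmd, threads):
    """Drop a leading 'mpiexec -np N' and append '--threads <threads>'."""
    cmd = list(cmd)  # work on a copy so the caller's list is not modified
    if len(cmd) >= 3 and cmd[0] == "mpiexec" and cmd[1] == "-np":
        cmd = cmd[3:]
    return cmd + ["--threads", str(threads)]
```

Joining the returned list with spaces gives a command that can be pasted into the terminal.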
@computations the terminal now looks like this:
(base) /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd --tree /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree --msa /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta --model GTR+FO+G4 --exhaustive --prefix /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0 --threads 48
No protocol specified
No protocol specified
[0.91] [Warning] Running MPI version with only 1 process,
[0.91] [Warning] Loading options from the checkpoint file
terminate called after throwing an instance of 'checkpoint_read_success_failure'
[balaji:3699665] Process received signal
[balaji:3699665] Signal: Aborted (6)
[balaji:3699665] Signal code: (-6)
[balaji:3699665] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fc0bac513c0]
[balaji:3699665] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fc0baa9018b]
[balaji:3699665] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fc0baa6f859]
[balaji:3699665] [ 3] /lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911)[0x7fc0baeab911]
[balaji:3699665] [ 4] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c)[0x7fc0baeb738c]
[balaji:3699665] [ 5] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7)[0x7fc0baeb73f7]
[balaji:3699665] [ 6] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa6a9)[0x7fc0baeb76a9]
[balaji:3699665] [ 7] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_Z17read_with_successI13cli_optionstEmiRT+0x1ac9)[0x55b8c21d88e9]
[balaji:3699665] [ 8] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_ZN12checkpoint_t12load_optionsER13cli_options_t+0x40)[0x55b8c21d3ac0]
[balaji:3699665] [ 9] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_Z24merge_options_checkpointR13cli_options_tR12checkpoint_t+0x80)[0x55b8c21c9970]
[balaji:3699665] [10] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_Z12wrapped_mainiPPc+0xdd)[0x55b8c21cb4dd]
[balaji:3699665] [11] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(main+0x2a)[0x55b8c21c818a]
[balaji:3699665] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fc0baa710b3]
[balaji:3699665] [13] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_start+0x2e)[0x55b8c21c891e]
[balaji:3699665] End of error message
Aborted (core dumped)
ok, and now try removing the checkpoint file and see what happens with that same command?
@computations :) it worked
It is still running, but I am not sure how much time it will take to complete.
(base) /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd --tree /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree --msa /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta --model GTR+FO+G4 --exhaustive --prefix /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0 --threads 48
No protocol specified
No protocol specified
[0.85] [Warning] Running MPI version with only 1 process,
[0.85] Running Root Digger
[0.85] Version: v1.7.0-14-g5f23473-mpi
[0.85] Build Commit: 5f234738b7e75848d737092a39155565205aa386
[0.85] Build Date: 2021-11-26 23:10:41
[0.85] Started: 2021-11-27 00:20:21
[0.85] Seed: 4028654047
[0.85] Number of threads per proc: 48
[0.85] Number of procs 1
[0.85] Command: /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd --tree /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree --msa /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta --model GTR+FO+G4 --exhaustive --prefix /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0 --threads 48
[0.85] Please report any bugs to https://groups.google.com/forum/#!forum/raxml
[0.85] [Warning] Ignoring subst matrix GTR for model from command line. Currently only UNREST is supported
[1.59] Starting exhaustive search
[65.11] Step 1 / 3611, ETC: 65.29h
[153.64] Step 2 / 3611, ETC: 77.01h
[225.30] Step 3 / 3611, ETC: 75.27h
[289.90] Step 4 / 3611, ETC: 72.62h
[342.85] Step 5 / 3611, ETC: 68.68h
. . .
Thanks for being patient. There is an estimated runtime there, and I find it to be approximately accurate.
One thing to note, this is one of the 8 trees that would have been run. I am going to push a change to the script that fixes this so that you can just run the script. But, this will take a while, you have a very large tree.
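As a side note on reading those ETC values: the numbers in the log above appear consistent with a plain linear extrapolation, that is, the average time per completed step multiplied by the number of remaining steps. This is inferred from the printed values only and is not taken from the RootDigger source; the function name etc_hours is made up for illustration.

```python
def etc_hours(elapsed_seconds, step, total_steps):
    """Linear estimate of the remaining runtime in hours."""
    per_step = elapsed_seconds / step          # average seconds per finished step
    remaining_steps = total_steps - step
    return per_step * remaining_steps / 3600.0
```

For example, round(etc_hours(342.85, 5, 3611), 2) gives 68.68, matching the "Step 5 / 3611" line, so the estimate should settle as more steps complete.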
Hi @Pbdas
I apologise for saying that the 5th script worked, but I got outputs for my three datasets fmsan, fmsao, and smsan, but NOT for smsao. For smsao data, I received the following error:
(base) ./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-19_00/smsao/
0 / 71
No protocol specified
No protocol specified
Traceback (most recent call last):
File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 74, in
placement.launch_epa( tree_file, cur_modelfile, ref_msa, query_msa, cur_outdir, thorough=True )
File "/home/vinita/covid19_cme_analysis/scripts/placement.py", line 116, in launch_epa
sub.check_call(cmd, stdout=sub.DEVNULL)
File "/home/vinita/miniconda3/lib/python3.9/subprocess.py", line 373, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/vinita/covid19_cme_analysis/software/epa-ng/bin/epa-ng', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/runs/epa_runs/0/tree.newick', '--model', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/runs/epa_runs/0/eval.raxml.bestModel', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/data/covid_edited.fasta', '--query', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/data/covid_outgroups.fasta', '--threads', '48', '--no-heur', '--filter-max', '50', '--filter-acc-lwr', '1.0', '--out-dir', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/runs/epa_runs/0', '--redo', '--verbose']' returned non-zero exit status 1.
Sorry for bothering you yet again Vinita
Hi Vinita,
there should be a file called epa_info.log
in the runs/epa_runs directory, please share it here.
Hi Vinita,
since you mentioned in #4 that for now you're only interested in getting a tree out, I think we can shelve this error for now. Placement is only relevant here if you want to try to see if it could find a better outgroup/root placement of the tree.
Let me know if in the future you want this kind of analysis, then I'll have another look!
Pierre
Hi @Pbdas
Thank you very much. Could you please suggest exactly which scripts I need to run to get a thinned tree for my data? Also, as you mentioned placement is enough, is that scripts/placement.py or ./pipeline/wuhan_placement.py?
Vinita
Hi Vinita,
I will help you with the thinning. I am updating the wiki page to explain how it should be run, but I need some time to read the code and remember how to use it properly. I will come back to you as soon as possible
Benoit
Hi Vinita,
I'm not sure I understand your question about placement, do you want to use it after all? scripts/placement.py contains functions that are used by pipeline stages. pipeline/wuhan_placement.py is a separate placement-based stage that tries to place the original Wuhan SARS-CoV-2 genome into the tree. If you're just interested in building a tree, you don't need placement (or rootdigger for that matter) at all.
Pierre
Hi @Pbdas
The thing is that before using your pipeline I was trying to construct a phylogenetic tree with two different programs, RAxML and IQ-TREE, but I didn't get good branch support in either tree. Then I read your paper, which I found extremely helpful, and followed your pipeline because you clearly explained how difficult it is to do phylogeny for SARS data given the low number of mutations in the sequences.
So my question is: which scripts in your pipeline are useful for my data (like 0_get_data.py, 1_preprocess_data.py, 2_pargenes.py, ...), so that I can run only those specific scripts on my SARS data to study its phylogeny and get a phylogenetic tree with good branch support, which I am currently struggling with?
I hope I was able to get my question across. Vinita
Hi @idaios
I prepared my dataset from scratch with 10 high-quality SARS-CoV-2 genomes and 2 outgroups, so there are 12 sequences in total. I ran all scripts again, and this time I am stuck at the 5th script.
The first time I ran the 5th script, the error was:
Then I ran the 7th script, 7_iqtree_tests.py, first and ran the 5th script again, which gives the following error:
./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-17_05/smsao
I then tried the 6th script:
./pipeline/6_root_digger_rooting.py work_dir/2021-11-17_05/fmsan/
But somehow 8_tree_thinning.py, 9_mptp_on_all_trees.py, compare_llhs, and extract_thinned_dataset.py worked on all 4 datasets (FMSAO, SMSAO, FMSAN and SMSAN).
For wuhan_placement.py I also get an error:
./pipeline/wuhan_placement.py work_dir/2021-11-17_05/smsao/
Kindly help me understand these issues: whether they are linked to my data, or whether something is missing from my data and that is why root_digger_lwr.csv is empty in smsao/results/rootdigger_rooting.
It would be great if you could look at these errors and suggest how I should solve them.
Thank you very much Vinita