BenoitMorel / covid19_cme_analysis

GNU Affero General Public License v3.0

returned non-zero exit status 255 for 5_epa_outgroup_rooting.py and IndexError: list index out of range for 6_root_digger_rooting.py #2

Open vinitamehlawat opened 2 years ago

vinitamehlawat commented 2 years ago

Hi @idaios

I prepared my dataset from scratch, with 10 high-quality SARS-CoV-2 genomes and 2 outgroups, so there are 12 sequences in total. I ran all the scripts again, and this time I am stuck at the 5th script.

The first time I ran the 5th script, the error was:

ERROR: Must run iqtree_tests stage of pipeline first
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 23, in <module>
    raise e
  File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 20, in <module>
    util.expect_file_exists( paths.raxml_credible_ml_trees )
  File "/home/vinita/covid19_cme_analysis/scripts/util.py", line 99, in expect_file_exists
    raise RuntimeError( "File doesn't exist: " + file_path )
RuntimeError: File doesn't exist: /home/vinita/covid19_cme_analysis/work_dir/2021-11-17_05/smsao/results/trees/credible_ml_trees.newick

Then I ran the 7th script (7_iqtree_tests.py) first and ran the 5th script again, which gives the following error:

./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-17_05/smsao

0  /  13
No protocol specified
No protocol specified
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode -1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 71, in <module>
    cur_modelfile = raxml_launcher.evaluate(tree_file, ref_msa, cur_outdir)
  File "/home/vinita/covid19_cme_analysis/scripts/raxml_launcher.py", line 75, in evaluate
    sub.check_call(cmd, cwd=out_dir, stdout=sub.DEVNULL)
  File "/home/vinita/miniconda3/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/vinita/covid19_cme_analysis/software/raxml-ng/bin/raxml-ng-mpi', '--evaluate', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-17_05/smsao/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-17_05/smsao/runs/epa_runs/0/tree.newick', '--prefix', 'eval', '--threads', '4', '--blopt', 'nr_safe', '--redo', '--blmin', '0.000000001']' returned non-zero exit status 255.

I also tried the 6th script:

./pipeline/6_root_digger_rooting.py work_dir/2021-11-17_05/fmsan/

Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 84, in <module>
    writer = csv.DictWriter(csv_file, fieldnames=results[0].keys())
IndexError: list index out of range
(name_of_my_env) ./pipeline/6_root_digger_rooting.py work_dir/2021-11-17_05/smsao/
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 84, in <module>
    writer = csv.DictWriter(csv_file, fieldnames=results[0].keys())
IndexError: list index out of range

But somehow 8_tree_thinning.py, 9_mptp_on_all_trees.py, compare_llhs, and extract_thinned_dataset.py worked on all 4 datasets (FMSAO, SMSAO, FMSAN and SMSAN).

For wuhan_placement.py I also get an error:

./pipeline/wuhan_placement.py work_dir/2021-11-17_05/smsao/

ERROR: Must run placement stage of pipeline first
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/wuhan_placement.py", line 18, in <module>
    raise e
  File "/home/vinita/covid19_cme_analysis/./pipeline/wuhan_placement.py", line 15, in <module>
    util.expect_dir_exists( paths.epa_rooting_dir )
  File "/home/vinita/covid19_cme_analysis/scripts/util.py", line 95, in expect_dir_exists
    raise RuntimeError( "Directory doesn't exist: " + dir_path )
RuntimeError: Directory doesn't exist: /home/vinita/covid19_cme_analysis/work_dir/2021-11-17_05/smsao/results/epa_rooting

Kindly help me understand whether these issues are related to my data, or to something that is missing from my data, and whether that is why root_digger_lwr.csv is empty in smsao/results/rootdigger_rooting.

It would be great if you could look at these errors and suggest how I should solve them.

Thank you very much Vinita

pierrebarbera commented 2 years ago

Hi Vinita,

could you take a look in work_dir/2021-11-17_05/smsao/runs/epa_runs/0/ and see if there's any error message in the file eval.raxml.log?

Pierre
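The traceback above shows the launcher calling `check_call` with `stdout=sub.DEVNULL`, so the Python error only carries the exit status, and the actual message has to be read from the tool's own log. A minimal sketch of a wrapper that keeps the output in a log file and prints its tail on failure follows. The name `run_logged` and its parameters are hypothetical and not part of the pipeline.

```python
import subprocess
import sys
from pathlib import Path


def run_logged(cmd, log_path, cwd=None, tail_lines=20):
    """Run cmd, write stdout and stderr to log_path, and return the exit code.

    On failure, the last `tail_lines` lines of the log are printed to stderr
    so the real error message is visible next to the Python traceback.
    """
    log_path = Path(log_path)
    with log_path.open("w") as log:
        proc = subprocess.run(cmd, cwd=cwd, stdout=log, stderr=subprocess.STDOUT)
    if proc.returncode != 0:
        tail = log_path.read_text(errors="replace").splitlines()[-tail_lines:]
        print(f"{cmd[0]} exited with status {proc.returncode}; see {log_path}",
              file=sys.stderr)
        print("\n".join(tail), file=sys.stderr)
    return proc.returncode
```

The caller can still raise on a non-zero return value; the point is only that the message is no longer discarded.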

vinitamehlawat commented 2 years ago

Hi @Pbdas

I am attaching eval.raxml.log for your reference; please have a look.

Thank you very much

Vinita eval.raxml.log

BenoitMorel commented 2 years ago

@Pbdas

Maybe we could just add --force in both raxml launcher functions? (here https://github.com/BenoitMorel/covid19_cme_analysis/blob/master/scripts/raxml_launcher.py)

This thread check is not that important in this context anyway

pierrebarbera commented 2 years ago

Yes, I agree; only data with ~30k sites will make it through the filters anyway. I just pushed the change to the master branch, so @vinitamehlawat you should be able to update by running git pull in the folder of the repository. Let us know if this works.

As for the rootdigger stage, I'm not sure. @computations any idea?

amkozlov commented 2 years ago

An even better solution would be upgrading to raxml-ng v1.0.x and using --threads auto or --threads auto{4} ;)

vinitamehlawat commented 2 years ago

Hi @Pbdas

Here I am working with my subset data, which consists of 2059 sequences.

I updated the repo with git pull and ran ./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-19_00/fmsan/ again.

This time I encountered the following error:

Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 86, in <module>
    hist_csv_file = placement.gappa_examine_lwr( os.path.join( epa_out_dir, "*/*.jplace" ), result_dir )
  File "/home/vinita/covid19_cme_analysis/scripts/placement.py", line 60, in gappa_examine_lwr
    sub.check_call(cmd)
  File "/home/vinita/miniconda3/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/vinita/covid19_cme_analysis/software/gappa/bin/gappa', 'examine', 'lwr', '--jplace-path', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/34/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/55/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/66/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/18/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/42/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/70/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/2/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/26/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/14/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/53/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/52/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/62/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/8/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/57/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/1/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/56/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/21/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/6/epa_result.jplace', 
'/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/5/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/41/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/0/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/13/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/67/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/59/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/10/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/16/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/50/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/64/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/29/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/7/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/43/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/71/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/20/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/37/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/38/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/47/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/51/epa_result.jplace', 
'/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/3/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/12/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/44/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/23/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/68/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/24/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/31/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/35/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/4/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/49/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/61/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/17/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/39/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/45/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/36/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/33/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/9/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/40/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/11/epa_result.jplace', 
'/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/69/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/48/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/25/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/22/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/28/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/19/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/54/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/60/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/27/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/30/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/46/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/58/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/32/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/15/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/65/epa_result.jplace', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/epa_runs/63/epa_result.jplace', '--no-list-file', '--no-compat-check', '--allow-file-overwriting', '--histogram-bins', '20', '--out-dir', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/results/epa_rooting']' returned non-zero exit status 109.

I also checked work_dir/2021-11-19_00/fmsan/runs/epa_runs/0/, but this time there is no error in the log file. I also have around 71 folders for my subset data.

I am attaching the eval.raxml.log file for your reference. Please have a look and let me know how I should proceed. eval.raxml.log

Thank you very much! Vinita

pierrebarbera commented 2 years ago

Hi Vinita,

the log file from RAxML-NG looks good now! The error seems to be with gappa this time. Same procedure as last time: git pull, and re-run stage 5. Then, under results/epa_rooting/ there should be a file called gappa_examine_lwr.log that should tell us what's going wrong. (Again, apologies for the bad error messages.)

Pierre

vinitamehlawat commented 2 years ago

Hi @Pbdas

After git pull I ran the 5th stage again, but unfortunately I don't have gappa_examine_lwr.log in results/epa_rooting/; there is only one .txt file, outgroup_check.txt.

pierrebarbera commented 2 years ago

Hi @vinitamehlawat , just letting you know I think I figured out the current issue, and I'm working on a fix

pierrebarbera commented 2 years ago

Ok, so I'm 99% sure the issue was that the call to gappa examine lwr was simply too long for the command line to handle (more than 5k characters) due to the number of trees, and the paths being full, non-relative paths. I've made the paths relative to a working directory now, so that should be sufficient to handle it. Please pull and give it a try!
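The effect of switching to relative paths can be sketched as follows: collect the `epa_result.jplace` files relative to the runs directory and compare the resulting command length with the absolute-path variant. The helper names are hypothetical, and whether command length was the real cause here is the hypothesis stated above, not something this sketch confirms.

```python
import glob
import os


def relative_jplace_args(runs_dir, pattern="*/epa_result.jplace"):
    """Return jplace paths relative to runs_dir, sorted for reproducibility."""
    matches = glob.glob(os.path.join(runs_dir, pattern))
    return sorted(os.path.relpath(m, start=runs_dir) for m in matches)


def command_length(args):
    """Approximate command-line length: all arguments plus separating spaces."""
    return sum(len(a) for a in args) + max(len(args) - 1, 0)
```

The relative list only resolves correctly if the child process is started inside the runs directory, for example with `subprocess.check_call(cmd, cwd=runs_dir)`.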

Also, it could be that the next issue will be related to the visualization using R, meaning that it may be necessary to install some packages. Note however that this visualization is not strictly necessary and you can repeat it later by simply calling the script directly, like so:

scripts/lwr_hist.r work_dir/<correct work dir>/<smsao/smsan/... etc>/results/epa_rooting/lwr_histogram.csv work_dir/<correct work dir>/<smsao/smsan/... etc>/results/epa_rooting/lwr_histogram.pdf

(fill in the correct paths)

the necessary R packages are:

ggplot2
readr
tidyr
dplyr
stringr

vinitamehlawat commented 2 years ago

Hi @Pbdas

After git pull I re-ran the 5th step and encountered the following error message:

Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 86, in <module>
    hist_csv_file = placement.gappa_examine_lwr( epa_out_dir, result_dir )
  File "/home/vinita/covid19_cme_analysis/scripts/placement.py", line 69, in gappa_examine_lwr
    sub.check_call(cmd, cwd=runs_dir, stdout=logfile)
  File "/home/vinita/miniconda3/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/vinita/covid19_cme_analysis/software/gappa/bin/gappa', 'examine', 'lwr', '--jplace-path', '34/epa_result.jplace', '55/epa_result.jplace', '66/epa_result.jplace', '18/epa_result.jplace', '42/epa_result.jplace', '70/epa_result.jplace', '2/epa_result.jplace', '26/epa_result.jplace', '14/epa_result.jplace', '53/epa_result.jplace', '52/epa_result.jplace', '62/epa_result.jplace', '8/epa_result.jplace', '57/epa_result.jplace', '1/epa_result.jplace', '56/epa_result.jplace', '21/epa_result.jplace', '6/epa_result.jplace', '5/epa_result.jplace', '41/epa_result.jplace', '0/epa_result.jplace', '13/epa_result.jplace', '67/epa_result.jplace', '59/epa_result.jplace', '10/epa_result.jplace', '16/epa_result.jplace', '50/epa_result.jplace', '64/epa_result.jplace', '29/epa_result.jplace', '7/epa_result.jplace', '43/epa_result.jplace', '71/epa_result.jplace', '20/epa_result.jplace', '37/epa_result.jplace', '38/epa_result.jplace', '47/epa_result.jplace', '51/epa_result.jplace', '3/epa_result.jplace', '12/epa_result.jplace', '44/epa_result.jplace', '23/epa_result.jplace', '68/epa_result.jplace', '24/epa_result.jplace', '31/epa_result.jplace', '35/epa_result.jplace', '4/epa_result.jplace', '49/epa_result.jplace', '61/epa_result.jplace', '17/epa_result.jplace', '39/epa_result.jplace', '45/epa_result.jplace', '36/epa_result.jplace', '33/epa_result.jplace', '9/epa_result.jplace', '40/epa_result.jplace', '11/epa_result.jplace', '69/epa_result.jplace', '48/epa_result.jplace', '25/epa_result.jplace', '22/epa_result.jplace', '28/epa_result.jplace', '19/epa_result.jplace', '54/epa_result.jplace', '60/epa_result.jplace', '27/epa_result.jplace', '30/epa_result.jplace', '46/epa_result.jplace', '58/epa_result.jplace', '32/epa_result.jplace', '15/epa_result.jplace', '65/epa_result.jplace', '63/epa_result.jplace', '--no-list-file', '--no-compat-check', '--allow-file-overwriting', '--histogram-bins', '20', '--out-dir', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/results/epa_rooting']' returned non-zero exit status 109.

This time I do have a gappa_examine_lwr.log file in fmsan/results/epa_rooting, which I am attaching. Please have a look.

Thanks Vinita gappa_examine_lwr.log

pierrebarbera commented 2 years ago

Hi Vinita, before I keep making you try fixes, could you tell us what kind of operating system you're using?

Also please run this command in your terminal and paste the result here: /bin/sh --version

Pierre

vinitamehlawat commented 2 years ago

Hi @Pbdas I am pasting my terminal output, Please have a look

$SHELL --version
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

/bin/sh =--version
/bin/sh: 0: Illegal option --

pierrebarbera commented 2 years ago

Hi @vinitamehlawat,

sorry it took a while, but I could finally reproduce the issue on my end!

Here is what to do: in the terminal, go to software/gappa, then run these commands:

make clean
git checkout 7398c1cdf5162fe195c9c9fafe999f15e7d5012b
git submodule update --init --recursive
make -j

Now you can try stage 5 again.

vinitamehlawat commented 2 years ago

Hi @Pbdas

Thank you, I made the changes as you suggested. This time it shows an error about R packages:

Error in library(ggplot2) : there is no package called ‘ggplot2’
Execution halted
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 87, in <module>
    placement.ggplot_lwr_histogram( hist_csv_file, result_dir)
  File "/home/vinita/covid19_cme_analysis/scripts/placement.py", line 84, in ggplot_lwr_histogram
    sub.check_call(cmd)
  File "/home/vinita/miniconda3/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/vinita/covid19_cme_analysis/scripts/lwr_hist.r', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/results/epa_rooting/lwr_histogram.csv', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/results/epa_rooting/lwr_histogram.pdf']' returned non-zero exit status 1.

If this error is only about R, then I will run the script directly, as you mentioned earlier in this thread:

scripts/lwr_hist.r work_dir/<correct work dir>/<smsao/smsan/... etc>/results/epa_rooting/lwr_histogram.csv work_dir/<correct work dir>/<smsao/smsan/... etc>/results/epa_rooting/lwr_histogram.pdf

But could you please look at the 6th step error? I am still getting the same one:

./pipeline/6_root_digger_rooting.py work_dir/2021-11-19_00/fmsan/

Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 84, in <module>
    writer = csv.DictWriter(csv_file, fieldnames=results[0].keys())
IndexError: list index out of range

Thank you very much for your time and effort

Vinita

BenoitMorel commented 2 years ago

Dear Vinita,

It looks like I introduced a bug that we haven't detected. It should be fixed now. Please try to run 'git pull' and to start the analysis again. Let us know if that fixes the issue

Best, Benoit
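The `IndexError` reported for stage 6 comes from evaluating `results[0].keys()` on an empty `results` list, which would also fit the empty root_digger_lwr.csv mentioned earlier. The thread does not show what the pushed fix changed, so the following is only a sketch of a guard against that case; `write_results_csv` is a hypothetical name.

```python
import csv


def write_results_csv(results, csv_file):
    """Write a list of dict rows to csv_file; return the number of rows written.

    An empty list leaves the file untouched instead of raising IndexError
    on results[0].
    """
    if not results:
        return 0
    writer = csv.DictWriter(csv_file, fieldnames=list(results[0].keys()))
    writer.writeheader()
    writer.writerows(results)
    return len(results)
```

A caller can treat a return value of 0 as a sign that the upstream RootDigger runs produced nothing and report that explicitly.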

vinitamehlawat commented 2 years ago

I am a little confused: after git pull, should I run the 5th step again, or the whole analysis from step 1?


BenoitMorel commented 2 years ago

Sorry, step 5 should be enough.


BenoitMorel commented 2 years ago

I replied too fast. My fix only fixes step 6. I don't think it depends on step 5. So you should rerun step 6 :-)

vinitamehlawat commented 2 years ago

Hi @BenoitMorel

After git pull I re-ran the 6th script and the following error popped up in my terminal. Please have a look:

./pipeline/6_root_digger_rooting.py work_dir/2021-11-19_00/fmsan/
running 8 iterations
['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--treefile', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.tree', '--early-stop']
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 64, in <module>
    root_digger_launcher.launch_root_digger(tmp_tree_file, alignment, model, outfile,
  File "/home/vinita/covid19_cme_analysis/scripts/root_digger_launcher.py", line 21, in launch_root_digger
    subprocess.check_call(cmd, stdout = outfile, stderr = outfile)
  File "/home/vinita/miniconda3/envs/name_of_my_env/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--treefile', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.tree', '--early-stop']' returned non-zero exit status 132.

vinitamehlawat commented 2 years ago

HI @Pbdas

After installing all the R packages I re-ran the 5th script and got another error. This time I am not sure whether it is a bug in your script or just an error in the code. Please have a look (this time I pasted the whole terminal output from running the 5th script):

./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-19_00/fmsan/

hmmbuild :: profile HMM construction from multiple sequence alignments
HMMER 3.3.2 (Nov 2020); http://hmmer.org/
Copyright (C) 2020 Howard Hughes Medical Institute.
Freely distributed under the BSD open source license.

input alignment file: /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta
output HMM file: reference.hmm
number of worker threads: 48

idx name nseq alen mlen W eff_nseq re/pos description
1 covid_edited 1807 27987 27920 29776 1.70 0.619

CPU time: 15.89u 0.32s 00:00:16.21 Elapsed: 00:00:16.20
INFO Splitting files based on reference: /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta
WARN The query alignment file '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/hmmer_runs/both.afa' appears to have an alignment width that differs from the reference (29966 vs. 27987). This is likely due to the alignment tool stripping gap-only columns, or adding columns to the reference. Please consider using the produced 'reference.fasta' during placement!
0 / 71
No protocol specified
No protocol specified
1 / 71
No protocol specified
No protocol specified
. . . . . .
71 / 71
No protocol specified
No protocol specified

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Warning message:
funs() was deprecated in dplyr 0.8.0.
Please use a list of either functions or lambdas:

Simple named list: list(mean = mean, median = median)

Auto named with tibble::lst(): tibble::lst(mean, median)

Using lambdas list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
This warning is displayed once every 8 hours.
Call lifecycle::last_lifecycle_warnings() to see where this warning was generated.
Saving 7 x 7 in image
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 95, in <module>
    util.copy_dir( d, os.path.join( result_dir, os.path.basename(d) ), [".rba", ".phy", ".startTree"] )
  File "/home/vinita/covid19_cme_analysis/scripts/util.py", line 51, in copy_dir
    ign_f = shutil.ignore_patterns(ignore)
NameError: name 'shutil' is not defined

Thank you Vinita

pierrebarbera commented 2 years ago

Hi Vinita,

the part that fails now is just a copy of the result files from runs/epa_runs to results/epa_rooting, so it's very optional. I wouldn't re-run the script just for that. Just know that the files that are not in epa_rooting will be in epa_runs instead. Nevertheless, I just pushed a fix such that it should work correctly next time.
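The `NameError` in the traceback means `shutil` was used in `util.py` without being imported. The pushed fix is not shown in this thread; the sketch below only illustrates a copy helper with the import in place. Treating the list entries as file suffixes to skip is an assumption about the intended behaviour, and this `copy_dir` signature is hypothetical.

```python
import shutil


def copy_dir(src, dst, ignore_suffixes=()):
    """Copy src to dst recursively, skipping files that end in any given suffix."""
    # ignore_patterns takes glob patterns as separate arguments, so each
    # suffix such as ".rba" becomes "*.rba" and the list is unpacked.
    patterns = ["*" + s for s in ignore_suffixes]
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*patterns),
                    dirs_exist_ok=True)
```

`dirs_exist_ok` requires Python 3.8 or newer, which matches the 3.9 and 3.10 interpreters visible in the tracebacks.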

As for the rootdigger error, in the rootdigger_rooting directory, there should be a file called root_digger.log, could you share that one? Then @computations will be able to help I think

Cheers, Pierre

vinitamehlawat commented 2 years ago

Hi @Pbdas

I checked results/rootdigger_rooting, but there is no root_digger.log; there is only one .csv file, root_digger_lwr.csv, which is empty.

Thanks Vinita

pierrebarbera commented 2 years ago

Hi Vinita,

that makes debugging a lot harder. One thing I can think of right now is that MPI may not be installed on your machine. You can check by running mpiexec --version
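The same check could be done from Python before the pipeline launches `mpiexec`. This is a small sketch with a hypothetical helper name; it only reports whether an executable is on the PATH and what it prints for `--version`, not whether RootDigger was built against that MPI installation.

```python
import shutil
import subprocess


def mpiexec_version(executable="mpiexec"):
    """Return the first line of `mpiexec --version`, or None if it is not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    out = subprocess.run([path, "--version"], capture_output=True, text=True)
    lines = (out.stdout or out.stderr).splitlines()
    return lines[0] if lines else ""
```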

vinitamehlawat commented 2 years ago

Hi @Pbdas

(base) mpiexec --version
mpiexec (OpenRTE) 4.0.3

Report bugs to http://www.open-mpi.org/community/help/

computations commented 2 years ago

it is unfortunate that there is no log file, but there are things to try regardless. If you see a file called something.ckp, please upload that here, delete it, then rerun step 6.

computations commented 2 years ago

Ah, I found it (or at least what I think is going on). I pushed a fix to github for rootdigger. You will need to pull the new version and rebuild the program, and then you should be able to run the script successfully.

vinitamehlawat commented 2 years ago

Hi @computations

Here I am a little confused again. Could you please clarify: after git pull, should I re-run only the 6th step, or the whole analysis from ./setup.sh through each script?

computations commented 2 years ago

Ah, sorry, I should be more clear.

In the top-level directory, there should be a directory software/root_digger. cd into that, and run git pull && make -j mpi. This should update and rebuild RootDigger, which will fix the bug. From there you should be able to rerun just step 6.

vinitamehlawat commented 2 years ago

Hi @computations

I did as you described above and the software updated successfully, but after re-running the 6th script I am getting the same error:

./pipeline/6_root_digger_rooting.py work_dir/2021-11-19_00/fmsan/
running 8 iterations
['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--treefile', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.tree', '--early-stop']
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 64, in <module>
    root_digger_launcher.launch_root_digger(tmp_tree_file, alignment, model, outfile,
  File "/home/vinita/covid19_cme_analysis/scripts/root_digger_launcher.py", line 21, in launch_root_digger
    subprocess.check_call(cmd, stdout = outfile, stderr = outfile)
  File "/home/vinita/miniconda3/envs/name_of_my_env/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--treefile', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.tree', '--early-stop']' returned non-zero exit status 1.

computations commented 2 years ago

Alright, I think I managed to find the problem. There was a change in interface for rootdigger that didn't get updated in this pipeline. I have pushed a change to the pipeline, it should be good to just git pull and run step 6 again.
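
Judging from the command lines posted in this thread, the change amounts to passing `--prefix <path>` where the launcher used to pass `--treefile <path>.tree`. The sketch below shows how such a command could be assembled. `build_rd_command` is a hypothetical helper, not the code in scripts/root_digger_launcher.py, and it uses only flags that appear in the printed commands.

```python
def build_rd_command(rd_binary, tree, msa, model, prefix, procs):
    """Assemble a RootDigger command line in the `--prefix` style.

    Hypothetical helper for illustration; every flag below is copied from
    the commands printed in this thread, nothing else is assumed.
    """
    return [
        "mpiexec", "-np", str(procs),
        rd_binary,
        "--tree", tree,
        "--msa", msa,
        "--model", model,
        "--exhaustive",
        # Newer interface: presumably a prefix for the files RootDigger writes,
        # replacing the older "--treefile <path>.tree" argument.
        "--prefix", prefix,
        "--early-stop",
    ]


if __name__ == "__main__":
    cmd = build_rd_command(
        rd_binary="software/root_digger/bin/rd",
        tree="runs/root_digger_runs/0.in.tree",
        msa="data/covid_edited.fasta",
        model="GTR+FO+G4",
        prefix="runs/root_digger_runs/0",
        procs=48,
    )
    print(" ".join(cmd))
```

Returning a list of arguments rather than a single string means it can be handed to `subprocess.check_call` without shell quoting.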

vinitamehlawat commented 2 years ago

Hi @computations

I did the git pull and re-ran the 6th script, but it fails again, this time with a different exit status:

running 8 iterations
['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--prefix', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0', '--early-stop']
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/6_root_digger_rooting.py", line 66, in <module>
    root_digger_launcher.launch_root_digger(tmp_tree_file, alignment, model,
  File "/home/vinita/covid19_cme_analysis/scripts/root_digger_launcher.py", line 21, in launch_root_digger
    subprocess.check_call(cmd, stdout = outfile, stderr = outfile)
  File "/home/vinita/miniconda3/envs/name_of_my_env/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['mpiexec', '-np', '48', '/home/vinita/covid19_cme_analysis/software/root_digger/bin/rd', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta', '--model', 'GTR+FO+G4', '--exhaustive', '--prefix', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0', '--early-stop']' returned non-zero exit status 134.
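
A note on reading that exit status: by a common shell convention, a status above 128 means the process was terminated by signal number (status - 128). Under that reading, 134 corresponds to signal 6 (SIGABRT), which would match the "Signal: Aborted (6)" line in the backtrace posted later in this thread. The helper below only illustrates that arithmetic. Its name is made up, it assumes a POSIX system, and it assumes the launcher follows the 128 + signal convention.

```python
import signal


def describe_exit_status(status):
    """Describe a process exit status using the 128 + signal convention."""
    if status == 0:
        return "success"
    # Only treat the status as a signal if (status - 128) is a plausible
    # signal number on this platform.
    if 128 < status < 128 + signal.NSIG:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Number in range, but no named signal known to Python
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error code {status}"


if __name__ == "__main__":
    for status in (0, 1, 134):
        print(status, "->", describe_exit_status(status))
```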

computations commented 2 years ago

And there are still no logs at this time in results/rootdigger_rooting? If so, can you just run the command manually like so:

mpiexec -np 48 /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd --tree /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree --msa /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta --model GTR+FO+G4 --exhaustive --prefix /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0 --early-stop

vinitamehlawat commented 2 years ago

Hi @computations

Yes, I still don't have a log file in results/rootdigger_rooting, and I ran the command manually as you described above. Please have a look at the attached .txt file; it contains the error I copied from the terminal after running the command.

manul_mpiexec_error.txt

vinitamehlawat commented 2 years ago

Hi @Pbdas

Thank you very much! I am now able to run the 5th script without any error, and I got the outputs for this script.

Thank you again for your time and effort.

Vinita

computations commented 2 years ago

It looks like the checkpoints might be corrupted. Remove any files in the runs/root_digger_runs with the .ckp extension and see if that works.
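
If there are many runs, the cleanup can be scripted. The following is a minimal sketch, assuming the checkpoint files sit directly in runs/root_digger_runs and end in `.ckp` as stated above. The function name and the `dry_run` default are illustrative choices, not part of the pipeline.

```python
from pathlib import Path


def remove_checkpoints(run_dir, extension=".ckp", dry_run=True):
    """Delete files with the given extension directly inside run_dir.

    Returns the list of matching paths. With dry_run=True (the default)
    the files are only listed, not deleted.
    """
    run_dir = Path(run_dir)
    matched = []
    for path in sorted(run_dir.glob(f"*{extension}")):
        if path.is_file():
            if not dry_run:
                path.unlink()  # remove the possibly corrupted checkpoint
            matched.append(path)
    return matched
```

For example, `remove_checkpoints("work_dir/2021-11-19_00/fmsan/runs/root_digger_runs")` lists the candidates first, and passing `dry_run=False` actually deletes them. Input trees such as 0.in.tree are left untouched because they do not match the extension.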

vinitamehlawat commented 2 years ago

Hi @computations

I removed the 0.cpk file from runs/root_digger_runs and re-ran the manual command above. Please have a look at the attached .txt file.

manual_mpiexec_error2.txt

vinitamehlawat commented 2 years ago

Hi @Pbdas

I apologise for saying that the 5th script worked. I got outputs for three of my datasets (fmsan, fmsao, and smsan), but NOT for smsao. For the smsao data, I received the following error:

(base) ./pipeline/5_epa_outgroup_rooting.py work_dir/2021-11-19_00/smsao/
0  /  71
No protocol specified
No protocol specified
Traceback (most recent call last):
  File "/home/vinita/covid19_cme_analysis/./pipeline/5_epa_outgroup_rooting.py", line 74, in <module>
    placement.launch_epa( tree_file, cur_modelfile, ref_msa, query_msa, cur_outdir, thorough=True )
  File "/home/vinita/covid19_cme_analysis/scripts/placement.py", line 116, in launch_epa
    sub.check_call(cmd, stdout=sub.DEVNULL)
  File "/home/vinita/miniconda3/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/vinita/covid19_cme_analysis/software/epa-ng/bin/epa-ng', '--tree', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/runs/epa_runs/0/tree.newick', '--model', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/runs/epa_runs/0/eval.raxml.bestModel', '--msa', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/data/covid_edited.fasta', '--query', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/data/covid_outgroups.fasta', '--threads', '48', '--no-heur', '--filter-max', '50', '--filter-acc-lwr', '1.0', '--out-dir', '/home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/smsao/runs/epa_runs/0', '--redo', '--verbose']' returned non-zero exit status 1.

Sorry for bothering you yet again.

Vinita

computations commented 2 years ago

@vinitamehlawat what happens when you remove mpiexec -np 48 and add --threads 48 to the end?
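
In terms of the argument list printed by the pipeline, the suggestion above amounts to dropping the leading `mpiexec -np 48` and appending `--threads 48`. Here is a small sketch of that transformation. `mpi_to_threads` is a made-up helper name, and reusing the process count as the thread count is just the choice made in this thread.

```python
def mpi_to_threads(cmd, threads=None):
    """Turn ['mpiexec', '-np', N, prog, ...] into [prog, ..., '--threads', N].

    If `threads` is None, the process count N from the MPI command is reused.
    Commands that do not start with 'mpiexec -np N' only get '--threads'
    appended, and in that case `threads` must be given explicitly.
    """
    cmd = list(cmd)  # work on a copy
    if len(cmd) >= 3 and cmd[0] == "mpiexec" and cmd[1] == "-np":
        procs, cmd = cmd[2], cmd[3:]
        if threads is None:
            threads = procs
    if threads is None:
        raise ValueError("threads must be given for a non-MPI command")
    return cmd + ["--threads", str(threads)]


if __name__ == "__main__":
    mpi_cmd = ["mpiexec", "-np", "48", "rd", "--tree", "0.in.tree", "--exhaustive"]
    print(" ".join(mpi_to_threads(mpi_cmd)))
```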

vinitamehlawat commented 2 years ago

@computations the terminal now looks like this:

(base) /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd --tree /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree --msa /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta --model GTR+FO+G4 --exhaustive --prefix /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0 --threads 48
No protocol specified
No protocol specified
[0.91] [Warning] Running MPI version with only 1 process,
[0.91] [Warning] Loading options from the checkpoint file
terminate called after throwing an instance of 'checkpoint_read_success_failure'
[balaji:3699665] Process received signal
[balaji:3699665] Signal: Aborted (6)
[balaji:3699665] Signal code: (-6)
[balaji:3699665] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fc0bac513c0]
[balaji:3699665] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fc0baa9018b]
[balaji:3699665] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fc0baa6f859]
[balaji:3699665] [ 3] /lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911)[0x7fc0baeab911]
[balaji:3699665] [ 4] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c)[0x7fc0baeb738c]
[balaji:3699665] [ 5] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7)[0x7fc0baeb73f7]
[balaji:3699665] [ 6] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa6a9)[0x7fc0baeb76a9]
[balaji:3699665] [ 7] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_Z17read_with_successI13cli_options_tEmiRT_+0x1ac9)[0x55b8c21d88e9]
[balaji:3699665] [ 8] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_ZN12checkpoint_t12load_optionsER13cli_options_t+0x40)[0x55b8c21d3ac0]
[balaji:3699665] [ 9] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_Z24merge_options_checkpointR13cli_options_tR12checkpoint_t+0x80)[0x55b8c21c9970]
[balaji:3699665] [10] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_Z12wrapped_mainiPPc+0xdd)[0x55b8c21cb4dd]
[balaji:3699665] [11] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(main+0x2a)[0x55b8c21c818a]
[balaji:3699665] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fc0baa710b3]
[balaji:3699665] [13] /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd(_start+0x2e)[0x55b8c21c891e]
[balaji:3699665] End of error message
Aborted (core dumped)

computations commented 2 years ago

ok, and now try removing the checkpoint file and see what happens with that same command?

vinitamehlawat commented 2 years ago

@computations :) it worked

It is still running, but I am not sure how long it will take to complete.

(base) /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd --tree /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree --msa /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta --model GTR+FO+G4 --exhaustive --prefix /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0 --threads 48
No protocol specified
No protocol specified
[0.85] [Warning] Running MPI version with only 1 process,
[0.85] Running Root Digger
[0.85] Version: v1.7.0-14-g5f23473-mpi
[0.85] Build Commit: 5f234738b7e75848d737092a39155565205aa386
[0.85] Build Date: 2021-11-26 23:10:41
[0.85] Started: 2021-11-27 00:20:21
[0.85] Seed: 4028654047
[0.85] Number of threads per proc: 48
[0.85] Number of procs 1
[0.85] Command: /home/vinita/covid19_cme_analysis/software/root_digger/bin/rd --tree /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0.in.tree --msa /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/data/covid_edited.fasta --model GTR+FO+G4 --exhaustive --prefix /home/vinita/covid19_cme_analysis/work_dir/2021-11-19_00/fmsan/runs/root_digger_runs/0 --threads 48
[0.85] Please report any bugs to https://groups.google.com/forum/#!forum/raxml
[0.85] [Warning] Ignoring subst matrix GTR for model from command line. Currently only UNREST is supported
[1.59] Starting exhaustive search
[65.11] Step 1 / 3611, ETC: 65.29h
[153.64] Step 2 / 3611, ETC: 77.01h
[225.30] Step 3 / 3611, ETC: 75.27h
[289.90] Step 4 / 3611, ETC: 72.62h
[342.85] Step 5 / 3611, ETC: 68.68h
. . .

computations commented 2 years ago

Thanks for being patient. There is an estimated runtime there, and I find it to be approximately accurate.

One thing to note, this is one of the 8 trees that would have been run. I am going to push a change to the script that fixes this so that you can just run the script. But, this will take a while, you have a very large tree.
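
As a rough sanity check on the estimate, the progress lines in the log above can be parsed and extrapolated by hand. The sketch below assumes the bracketed number is elapsed wall-clock time in seconds and that all steps take about equally long. Both are assumptions, and this is not necessarily how RootDigger computes its own ETC.

```python
import re

# Matches lines such as "[342.85] Step 5 / 3611, ETC: 68.68h"
STEP_RE = re.compile(
    r"\[(?P<elapsed>\d+(?:\.\d+)?)\]\s+Step\s+(?P<step>\d+)\s*/\s*(?P<total>\d+),"
    r"\s*ETC:\s*(?P<etc>\d+(?:\.\d+)?)h"
)


def parse_progress(line):
    """Return elapsed seconds, step, total steps and reported ETC, or None."""
    m = STEP_RE.search(line)
    if m is None:
        return None
    return {
        "elapsed_s": float(m.group("elapsed")),
        "step": int(m.group("step")),
        "total": int(m.group("total")),
        "etc_h": float(m.group("etc")),
    }


def naive_total_hours(elapsed_s, step, total):
    """Extrapolate total runtime in hours from the mean time per finished step."""
    return elapsed_s / step * total / 3600.0


if __name__ == "__main__":
    info = parse_progress("[342.85] Step 5 / 3611, ETC: 68.68h")
    estimate = naive_total_hours(info["elapsed_s"], info["step"], info["total"])
    print(f"reported ETC: {info['etc_h']:.2f}h, naive extrapolation: {estimate:.2f}h")
```

For the last line shown in the log, the naive extrapolation comes out at roughly 68.8 hours, close to the reported 68.68h. If the eight trees are processed one after another, the whole step could take on the order of eight times that.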

pierrebarbera commented 2 years ago

Hi Vinita,

there should be a file called epa_info.log in the runs/epa_runs directory of the smsao run; please share it here.

vinitamehlawat commented 2 years ago

Hi @Pbdas

Please find attached epa_info.log for smsao data

epa_info.log

Thank you

Vinita

pierrebarbera commented 2 years ago

Hi Vinita,

since you mentioned in #4 that you're only interested in getting a tree out for now, I think we can shelve this error. Placement is only relevant here if you want to check whether it finds a better outgroup/root placement on the tree.

Let me know if in the future you want this kind of analysis, then I'll have another look!

Pierre

vinitamehlawat commented 2 years ago

Hi @Pbdas

Thank you very much. Could you please tell me exactly which scripts I need to run to get a thinned tree for my data? Also, since you mentioned placement: is that scripts/placement.py or ./pipeline/wuhan_placement.py?

Vinita

BenoitMorel commented 2 years ago

Hi Vinita,

I will help you with the thinning. I am updating the wiki page to explain how it should be run, but I need some time to read the code and remember how to use it properly. I will get back to you as soon as possible.

Benoit

pierrebarbera commented 2 years ago

Hi Vinita,

I'm not sure I understand your question about placement. Do you want to use it after all? scripts/placement.py contains functions that are used by the pipeline stages. pipeline/wuhan_placement.py is a separate placement-based stage that tries to place the original Wuhan SARS-CoV-2 genome into the tree. If you're just interested in building a tree, you don't need placement (or rootdigger, for that matter) at all.

Pierre

vinitamehlawat commented 2 years ago

Hi @Pbdas

The thing is that before using your pipeline I tried to construct a phylogenetic tree with two different programs, RAxML and IQ-TREE, but I didn't get good branch support in either tree. Then I read your paper, which I found extremely helpful, and followed your pipeline, because you clearly explain how difficult it is to do phylogeny on SARS data given the low number of mutations in the sequences.

So my question is: which scripts in your pipeline are useful for my data (like 0_get_data.py, 1_preprocess_data.py, 2_pargenes.py ...)? I would like to run only those specific scripts on my SARS data to study the phylogeny and get a phylogenetic tree with good branch support, which is what I am currently struggling with.

I hope my question is clear.

Vinita