cambiotraining / bacterial-genomics

Course materials for "Working with Bacterial Genomes"
https://cambiotraining.github.io/bacterial-genomics/

IQ-Tree error when running 04-mask_pseudogenome.sh #10

Closed bsalehe closed 10 months ago

bsalehe commented 10 months ago

Hi @tavareshugo and @avantonder. I am continuing to test the pipeline/scripts as a participant with an M_tuberculosis isolate. I am running the script 05-run_iqtree.sh. Unfortunately, I keep getting this weird error:

terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits, std::allocator >'
ERROR: STACK TRACE FOR DEBUGGING:
ERROR: 1 funcAbort()
ERROR: 2 ()
ERROR: 3 gsignal()
ERROR: 4 abort()
ERROR: 5 gnu_cxx::verbose_terminate_handler()
ERROR: 6 ()
ERROR: 7 ()
ERROR: 8 cxa_rethrow()
ERROR: 9 ()
ERROR: 10 Alignment::addConstPatterns(char)
ERROR: 11 runPhyloAnalysis(Params&, Checkpoint, IQTree&, Alignment&)
ERROR: 12 runPhyloAnalysis(Params&, Checkpoint*)
ERROR: 13 main()
ERROR: 14 __libc_start_main()
ERROR: 15 ()
ERROR:
ERROR: IQ-TREE CRASHES WITH SIGNAL ABORTED
ERROR: For bug report please send to developers:
ERROR: Log file: results/iqtree/Nam_TB.log
ERROR: Alignment files (if possible)
scripts/05-run_iqtree.sh: line 26: 68670 Aborted (core dumped) iqtree -fconst results/snp-sites/constant_sites.txt -s results/snp-sites/aligned_pseudogenomes_masked_snps.fas --prefix results/iqtree/Nam_TB -nt AUTO -ntmax 8 -mem 8G -m GTR+F+I -bb 1000

I wonder if either of you has experienced this error. Any idea what is wrong with the inputs or the command? Below is the script I modified after successfully running the 04-mask_pseudogenome.sh script.

#!/bin/bash

#mamba deactivate
#mamba activate iqtree

# create output directory
mkdir -p results/snp-sites/
mkdir -p results/iqtree/

# extract variable sites
snp-sites results/bactmap/masked_alignment/aligned_pseudogenomes_masked.fas > results/snp-sites/aligned_pseudogenomes_masked_snps.fas

# count invariant sites
snp-sites -C results/bactmap/masked_alignment/aligned_pseudogenomes_masked.fas > results/snp-sites/constant_sites.txt

# FIX!!
# Run iqtree
iqtree \
  -fconst results/snp-sites/constant_sites.txt \
  -s results/snp-sites/aligned_pseudogenomes_masked_snps.fas \
  --prefix results/iqtree/Nam_TB \
  -nt AUTO \
  -ntmax 8 \
  -mem 8G \
  -m GTR+F+I \
  -bb 1000

Cheers!

avantonder commented 10 months ago

Two thoughts:

Is there anything in the constant_sites.txt file?

Is there sufficient memory on the machines?

bsalehe commented 10 months ago

Yes, the file has content. Running cat results/snp-sites/constant_sites.txt gives the following output:

692542,1311546,1307543,691941
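As a quick sanity check on these numbers, the four counts can be summed to get the total number of constant sites, which should come out a little below the length of the reference genome. The sketch below is only an illustration: `sum_counts` is a hypothetical helper name, and it assumes the file holds a single line of comma-separated integers (to my understanding, snp-sites -C reports them in A,C,G,T order).

```shell
# Sum a comma-separated list of integer counts, e.g. the output of `snp-sites -C`.
# sum_counts is a hypothetical helper, not part of snp-sites or IQ-TREE.
sum_counts() {
  # Turn "a,c,g,t" into "a+c+g+t" and let the shell evaluate the sum.
  echo $(( $(printf '%s' "$1" | tr ',' '+') ))
}

# Values copied from the constant_sites.txt output shown above.
sum_counts "692542,1311546,1307543,691941"
```

For the values above this gives 4003572 constant sites.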

avantonder commented 10 months ago

Hmm, does it at least read in the alignment before crashing?

tavareshugo commented 10 months ago

It could be a memory issue. We're currently running a course in the room, so these machines are potentially quite stretched. You could try running free -h to see how much free memory there is at the moment. For example, on my test instance I only have 7G free at the moment.
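To compare the free memory against the 8G requested with -mem in the script, the check can be scripted roughly as follows. This is a minimal sketch: `available_gib` is a hypothetical helper, and it assumes the column layout printed by the procps-ng version of free, where "available" is the seventh field of the Mem: row.

```shell
# Parse the "available" column (7th field of the Mem: row) from `free -b`
# output on stdin and print it in whole GiB. Assumes the procps-ng layout.
available_gib() {
  awk '/^Mem:/ { printf "%d\n", $7 / (1024 * 1024 * 1024) }'
}

# Only query the machine if `free` is installed (it is Linux-specific).
if command -v free >/dev/null 2>&1; then
  echo "Available memory: $(free -b | available_gib) GiB (script requests -mem 8G)"
fi
```

If the reported figure is below 8, lowering -mem or waiting until the machine is less busy would be the obvious things to try.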

avantonder commented 10 months ago

@bsalehe The potential cause of the problem occurred to me on my drive home. The -fconst flag expects the comma-separated counts stored in constant_sites.txt, not the path to the file, so there are two ways to edit the script:

-fconst 692542,1311546,1307543,691941 (copy and paste contents of constant_sites.txt)

-fconst $(cat results/snp-sites/constant_sites.txt) (directly input contents of constant_sites.txt)
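The second option can be made a little more defensive by checking that the file really contains four comma-separated integers before passing them on, so that a bad input fails with a readable message rather than a crash. The sketch below is one way to do that, under the assumption that the file holds a single line of counts; `read_fconst` is a hypothetical helper name, not part of IQ-TREE or snp-sites.

```shell
# Print the contents of a constant-sites file if it looks like four
# comma-separated integers; otherwise complain on stderr and return 1.
# read_fconst is a hypothetical helper for illustration only.
read_fconst() {
  fconst_file="$1"
  if [ ! -s "$fconst_file" ]; then
    echo "missing or empty file: $fconst_file" >&2
    return 1
  fi
  # Strip the trailing newline and any stray whitespace.
  fconst_counts=$(tr -d '[:space:]' < "$fconst_file")
  if printf '%s\n' "$fconst_counts" | grep -Eq '^[0-9]+(,[0-9]+){3}$'; then
    printf '%s\n' "$fconst_counts"
  else
    echo "unexpected content in $fconst_file: $fconst_counts" >&2
    return 1
  fi
}

# Intended use inside 05-run_iqtree.sh (left commented out here):
# fconst=$(read_fconst results/snp-sites/constant_sites.txt) || exit 1
# iqtree -fconst "$fconst" -s results/snp-sites/aligned_pseudogenomes_masked_snps.fas ...
```

The plain $(cat ...) form does the same job without the check; the helper only adds an early, explicit failure.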

bsalehe commented 10 months ago

Using -fconst $(cat results/snp-sites/constant_sites.txt) did the job. So, do we need to update the script for participants?

Thanks both!

avantonder commented 10 months ago

In the answer to the exercise I've used -fconst $(cat results/snp-sites/constant_sites.txt). Perhaps I should change the example code above in the course materials so it no longer hard-codes 692240,1310839,1306835,691662.

tavareshugo commented 10 months ago

I think using the numbers might be a bit easier. As I revise the materials, I can adjust the exercise to be more explicit that they are supposed to copy the contents of the file into the command, not use the path to the file.

tavareshugo commented 10 months ago

Humm... I think we will need to include $(cat results/snp-sites/constant_sites.txt) after all, because the script itself generates the file, so there's no in-between step where they calculate these values and look at them.

Maybe instead I can include a couple of FIX_INPUT_FASTA_FILE for the snp-sites input files. So, at least they can think about what should be used as input at these steps.

bsalehe commented 10 months ago

I included $(cat results/snp-sites/constant_sites.txt) when I was running the script for S_aureus. I agree.