faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

Phyluce_alignment: All UCE loci dropped while using the --no-trim flag #331

Closed. Samyuktha9624 closed this issue 3 months ago

Samyuktha9624 commented 3 months ago

Hello, I have been trying to align DNA sequences with the phyluce wrapper phyluce_align_seqcap_align, and I notice that all UCE loci are dropped even when using the --no-trim option. I intend only to align, not to trim. Below are the command I used and the log file. Any help is greatly appreciated. Thanks in advance!

phyluce_align_seqcap_align \
    --input all-taxa-run2-combined.fasta \
    --output mafft-nexus-internal-trimmed \
    --taxa 51 \
    --aligner mafft \
    --cores 18 \
    --incomplete-matrix \
    --output-format fasta \
    --no-trim \
    --log-path log1

brantfaircloth commented 3 months ago

Have you tried to align a locus or two manually (e.g. just with mafft)? It sounds like something might be going wrong during the alignment process itself. You can explode your combined fasta by locus (phyluce_align_explode_alignments), then see whether 1-2 loci will align using mafft.

Samyuktha9624 commented 3 months ago

Thank you so much for the quick response! I tried running phyluce_align_explode_alignments, which resulted in the following error: ValueError: Sequences must all be the same length

brantfaircloth commented 3 months ago

It sounds like there is something wrong with the alignment file that you are using - have you checked to see if it looks as you might expect?
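The ValueError quoted above is the kind of message a tool raises when it expects an aligned file, in which every sequence has the same length, and is given unaligned sequences instead. That reading is an assumption and is not confirmed at this point in the thread. A minimal sketch for checking whether a FASTA file looks aligned, using only the Python standard library (the function names `read_fasta`, `length_summary`, and `looks_aligned` are hypothetical helpers, not part of phyluce):

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def length_summary(records):
    """Count how many sequences occur at each length."""
    return Counter(len(seq) for _, seq in records)


def looks_aligned(records):
    """True when every sequence has the same length (gaps included)."""
    return len(length_summary(records)) <= 1


# Made-up example data: two sequences of different lengths.
example = """>uce-1_taxonA |uce-1
ACGTACGT
>uce-1_taxonB |uce-1
ACGT
"""
records = read_fasta(example)
print(looks_aligned(records))
print(dict(length_summary(records)))
```

If the check returns False for a file, that file holds unaligned sequences and would need to be aligned before any step that expects an alignment.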

Samyuktha9624 commented 3 months ago

Thanks for your reply. I checked the files and also reran the previous steps. The all-taxa-incomplete.fasta file looks correct. This file has:

uce-1062_Species_name |uce-1062
sequences
another-uce-locus_Species-name |uce-locus
.....and so on for multiple loci for all species in the dataset

I am quite unsure how to proceed.

brantfaircloth commented 3 months ago

Oops - that was my mistake - try phyluce_assembly_explode_get_fastas_file on your FASTA file. This will split your file by locus, then you can try to align manually to see what happens.
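For readers who want to see what splitting by locus amounts to, here is a rough standard-library sketch. It is not how phyluce_assembly_explode_get_fastas_file is implemented. It assumes each header looks like `>uce-1062_Species_name |uce-1062`, with the locus name after the `|`, and the helper names and the `<locus>.fasta` output naming are hypothetical choices made for illustration.

```python
import os
import tempfile
from collections import defaultdict


def locus_of(header):
    """Return the text after the last '|' in a header (assumed header format)."""
    return header.rsplit("|", 1)[-1].strip()


def group_by_locus(fasta_text):
    """Map locus name -> list of (header, sequence) from combined FASTA text."""
    groups = defaultdict(list)
    header, chunks = None, []

    def flush():
        # Store the record collected so far under its locus.
        if header is not None:
            groups[locus_of(header)].append((header, "".join(chunks)))

    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            flush()
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    flush()
    return dict(groups)


def write_locus_files(groups, out_dir):
    """Write one FASTA file per locus into out_dir and return the file paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for locus, records in sorted(groups.items()):
        path = os.path.join(out_dir, f"{locus}.fasta")  # naming is an assumption
        with open(path, "w") as handle:
            for header, seq in records:
                handle.write(f">{header}\n{seq}\n")
        paths.append(path)
    return paths


# Made-up example data with two loci.
combined = """>uce-1_taxonA |uce-1
ACGT
>uce-2_taxonA |uce-2
TTGA
>uce-1_taxonB |uce-1
ACGA
"""
groups = group_by_locus(combined)
with tempfile.TemporaryDirectory() as tmp:
    for path in write_locus_files(groups, tmp):
        print(os.path.basename(path), len(groups[locus_of(open(path).readline())]))
```

Each per-locus file can then be aligned by hand, for example with `mafft --auto uce-1.fasta > uce-1.aligned.fasta`, to see whether individual loci align cleanly.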

Samyuktha9624 commented 3 months ago

I tried that, and it works. I also noticed that when aligning all files together, 480 of the 1900 total UCE loci are dropped. Is this normal? In addition, I get really low bootstrap values when I go on to build an ML tree using this alignment.

brantfaircloth commented 3 months ago

It might be normal - it just depends on the reason why the loci are being dropped. Without trimming turned on, either there is an alignment problem for those loci or they contain too few taxa (N < 3).

Samyuktha9624 commented 3 months ago

It says they contain too few taxa (N < 3).
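The "too few taxa (N < 3)" condition can be checked before alignment by counting how many sequences each locus has in the combined FASTA. The sketch below is a standalone illustration rather than phyluce's own logic. It assumes headers of the form `>name |locus`, and `taxa_per_locus` and `loci_below_threshold` are hypothetical helper names.

```python
from collections import defaultdict

MIN_TAXA = 3  # threshold quoted in the thread: loci with N < 3 are dropped


def taxa_per_locus(fasta_text):
    """Map locus -> set of sequence names, read from '>name |locus' headers."""
    taxa = defaultdict(set)
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Split the header once at '|': left is the name, right is the locus.
            name, _, locus = line[1:].partition("|")
            taxa[locus.strip()].add(name.strip())
    return taxa


def loci_below_threshold(fasta_text, min_taxa=MIN_TAXA):
    """Return the sorted loci that have fewer than min_taxa sequences."""
    counts = taxa_per_locus(fasta_text)
    return sorted(locus for locus, names in counts.items() if len(names) < min_taxa)


# Made-up example data: uce-1 has three taxa, uce-2 has two.
combined = """>uce-1_taxonA |uce-1
ACGT
>uce-1_taxonB |uce-1
ACGA
>uce-1_taxonC |uce-1
ACGG
>uce-2_taxonA |uce-2
TTGA
>uce-2_taxonB |uce-2
TTGC
"""
print(loci_below_threshold(combined))
```

Comparing the number of loci this reports against the number dropped in the log is one way to confirm that the taxon threshold, rather than an alignment failure, accounts for the missing loci.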