faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

Problem with phyluce_align_seqcap_align #211

Closed claudiavaga closed 3 years ago

claudiavaga commented 3 years ago

Hello!

I am having an issue with the alignment step (using the --no-trim option). I am using assemblies from both whole-genome shotgun sequencing and target-capture sequencing. Roughly 2,400 UCE loci were captured. Everything up to and including 'phyluce_assembly_get_fastas_from_match_counts' worked fine.

I am using these settings:

phyluce_align_seqcap_align \
    --fasta all-taxa-incomplete.fasta \
    --output mafft-nexus-internal-trimmed \
    --taxa 61 \
    --aligner mafft \
    --cores 18 \
    --incomplete-matrix \
    --output-format fasta \
    --no-trim \
    --log-path /home/geninfo/cvaga/Deltocyathus/taxon-sets/all/log

The script seems to work at first, but then it runs for a very long time and never finishes (after several days it was still not done). The computer I am using supports that number of cores. I am wondering whether there is something wrong with the 'all-taxa-incomplete.fasta' file or whether I am doing something wrong.

I tried to attach the fasta file (all-taxa-incomplete.fasta), but even the zipped version is too big to upload here.

Thank you, any help will be really appreciated!!

claudiavaga commented 3 years ago

Hello again!

I ended up splitting the file to be able to send it! I tried multiple times without success to align this and I do not know if there is something wrong in the file or something else is going on.

all-taxa-incomplete.fasta.segmentaa.zip all-taxa-incomplete.fasta.segmentah.zip all-taxa-incomplete.fasta.segmentab.zip all-taxa-incomplete.fasta.segmentac.zip all-taxa-incomplete.fasta.segmentad.zip all-taxa-incomplete.fasta.segmentae.zip all-taxa-incomplete.fasta.segmentaf.zip all-taxa-incomplete.fasta.segmentag.zip

Thank you again!

brantfaircloth commented 3 years ago

Good afternoon. Do the example data from phyluce work? If so, the problem is likely in the fasta data you are trying to align... the program should run pretty quickly with correct data (and if everything is installed correctly).
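One way to check the fasta data before aligning is to summarize it per locus and look for loci with unusually many or unusually long sequences. The following is a minimal sketch, not part of phyluce. It assumes each header ends with the locus name after a `|` character (for example `>uce-1234_genus_species |uce-1234`); that header layout and the function names `read_fasta`, `summarize_by_locus`, and `report` are assumptions made for illustration.

```python
from collections import defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize_by_locus(records):
    """Return {locus: (number_of_sequences, longest_sequence_length)}.

    ASSUMPTION: the locus name is the text after the last '|' in the header.
    If a header has no '|', the whole header is used as the locus name.
    """
    counts = defaultdict(int)
    longest = defaultdict(int)
    for header, seq in records:
        locus = header.rsplit("|", 1)[-1].strip()
        counts[locus] += 1
        longest[locus] = max(longest[locus], len(seq))
    return {locus: (counts[locus], longest[locus]) for locus in counts}


def report(path, top=20):
    """Print the `top` loci with the longest sequences in the file at `path`."""
    with open(path) as handle:
        summary = summarize_by_locus(read_fasta(handle))
    ranked = sorted(summary.items(), key=lambda item: item[1][1], reverse=True)
    for locus, (n_seqs, max_len) in ranked[:top]:
        print(locus, n_seqs, max_len)
```

Calling `report("all-taxa-incomplete.fasta")` would list the loci with the longest sequences first, which may help show whether a few entries differ sharply from the rest.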

claudiavaga commented 3 years ago

Good afternoon! Yes, the example data worked perfectly. I was afraid that the problem was in the fasta data, but since all the programs before the alignment ('phyluce_assembly_match_contigs_to_probes', 'phyluce_assembly_get_match_counts', etc.) ran without errors, I cannot tell what went wrong. The data were assembled with different software, as some were retrieved already assembled from different websites. Could this be a problem?

Thank you for your reply!

brantfaircloth commented 3 years ago

The data look reasonably ok. The numbers associated with the UCE loci seem high, but that may be due to your bait naming. Otherwise, I cannot tell what has gone wrong. It would be reasonable to extract a smaller set of fastas for fewer individuals and try with that.
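Extracting a smaller set for fewer individuals could be sketched as follows. This is an illustrative script, not a phyluce tool. It assumes headers of the form `>uce-1234_genus_species |uce-1234`, so that the taxon is the first token with the locus prefix removed; the helper names (`read_fasta`, `taxon_from_header`, `subset_records`, `write_subset`) and the example taxon names are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def taxon_from_header(header):
    """Extract the taxon name from a header.

    ASSUMPTION: headers look like 'uce-1234_genus_species |uce-1234'.
    The first token is '<locus>_<taxon>' and the locus follows the '|'.
    """
    name = header.split()[0]
    locus = header.rsplit("|", 1)[-1].strip()
    prefix = locus + "_"
    if name.startswith(prefix):
        return name[len(prefix):]
    # Fallback: drop everything up to the first underscore, if there is one.
    return name.split("_", 1)[1] if "_" in name else name


def subset_records(records, keep_taxa):
    """Yield only the records whose taxon is in `keep_taxa`."""
    keep = set(keep_taxa)
    for header, seq in records:
        if taxon_from_header(header) in keep:
            yield header, seq


def write_subset(in_path, out_path, keep_taxa):
    """Write the records for `keep_taxa` from `in_path` to `out_path`."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq in subset_records(read_fasta(src), keep_taxa):
            dst.write(">" + header + "\n" + seq + "\n")
```

For example, `write_subset("all-taxa-incomplete.fasta", "subset.fasta", {"genus_a", "genus_b"})` would keep two (made-up) taxa. When aligning the subset, the `--taxa` value would presumably need to match the number of taxa kept rather than 61.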

claudiavaga commented 3 years ago

Ok, I will try to do that! Thank you for your help.