Closed heather340 closed 5 years ago
I'm not sure what the issue is... both of these trim correctly for me. I took your alignments, renamed them to end with .nexus, and trimmed them with:
phyluce_align_get_trimal_trimmed_alignments_from_untrimmed --alignments alignments --output trimmed --input-format nexus --output-format nexus
This produced the two outputs:
Thanks for the fast response! I just took your code (I didn't realize there was a phyluce_align_get_trimal_trimmed_alignments_from_untrimmed) and it looks like it worked beautifully. So, I'll just do it in 2 steps from now on if necessary.
Thanks again for the assistance.
Cool.
Hi again. I've been trying to do the alignment step after phasing in the tutorial with this dataset, and I'm running into issues again. I noticed that the phasing step may not work with data that has been trimmed with trimal. Would this be an issue?
The files from the multialign-phased step have N's in them; only 100 @ 50% are aligned with the no-trim option, as the loci are dropped. If I replace these N's with -, remove them entirely (though gaps are still present), or run the alignment step ignoring ambiguous characters, I get the error "No records found in handle". I'm happy to email you a file if needed.
Log from the last run:
2019-05-01 10:04:13,110 - phyluce_align_seqcap_align - INFO - ============== Starting phyluce_align_seqcap_align ==============
2019-05-01 10:04:13,110 - phyluce_align_seqcap_align - INFO - Version: git fatal: Not a git repository: '/users/PAS1390/osu10232/envs/phyluce/lib/python2.7/site-packages/.git'
2019-05-01 10:04:13,110 - phyluce_align_seqcap_align - INFO - Argument --aligner: mafft
2019-05-01 10:04:13,110 - phyluce_align_seqcap_align - INFO - Argument --ambiguous: False
2019-05-01 10:04:13,111 - phyluce_align_seqcap_align - INFO - Argument --cores: 12
2019-05-01 10:04:13,111 - phyluce_align_seqcap_align - INFO - Argument --fasta: /users/PAS1390/osu10232/UCE/taxon-sets/ingroup/phasing_step/multialign-bams-phased-reads-ingroup3/fastas/joined_allele_sequences_all_samples_removed.fasta
2019-05-01 10:04:13,111 - phyluce_align_seqcap_align - INFO - Argument --log_path: /users/PAS1390/osu10232/UCE/taxon-sets/ingroup/phasing_step/log
2019-05-01 10:04:13,111 - phyluce_align_seqcap_align - INFO - Argument --max_divergence: 0.2
2019-05-01 10:04:13,111 - phyluce_align_seqcap_align - INFO - Argument --min_length: 100
2019-05-01 10:04:13,111 - phyluce_align_seqcap_align - INFO - Argument --no_trim: True
2019-05-01 10:04:13,112 - phyluce_align_seqcap_align - INFO - Argument --notstrict: True
2019-05-01 10:04:13,112 - phyluce_align_seqcap_align - INFO - Argument --output: /users/PAS1390/osu10232/UCE/taxon-sets/ingroup/phasing_step/PHASED-DATA-mafft-nexus-aligned-removed-notrim-ingroup3
2019-05-01 10:04:13,112 - phyluce_align_seqcap_align - INFO - Argument --output_format: nexus
2019-05-01 10:04:13,112 - phyluce_align_seqcap_align - INFO - Argument --proportion: 0.65
2019-05-01 10:04:13,112 - phyluce_align_seqcap_align - INFO - Argument --taxa: 28
2019-05-01 10:04:13,112 - phyluce_align_seqcap_align - INFO - Argument --threshold: 0.65
2019-05-01 10:04:13,112 - phyluce_align_seqcap_align - INFO - Argument --verbosity: INFO
2019-05-01 10:04:13,113 - phyluce_align_seqcap_align - INFO - Argument --window: 20
2019-05-01 10:04:13,113 - phyluce_align_seqcap_align - INFO - Building the locus dictionary
2019-05-01 10:04:13,113 - phyluce_align_seqcap_align - INFO - Removing ALL sequences with ambiguous bases...
2019-05-01 10:04:17,533 - phyluce_align_seqcap_align - WARNING - DROPPED locus uce-111831. Too few taxa (N < 3).
2019-05-01 09:55:54,824 - phyluce_align_seqcap_align - INFO - Aligning with MAFFT
2019-05-01 09:55:54,827 - phyluce_align_seqcap_align - INFO - Alignment begins. 'X' indicates dropped alignments (these are reported after alignment)
...................................................................................................................................
...................................................................................................................................
...................................................................................................................................
...................................................................................................................................
...................................................................................................................................
.................................................................................
Traceback (most recent call last):
  File "/users/PAS1390/osu10232/envs/phyluce/bin/phyluce_align_seqcap_align", line 255, in
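The log above reports "Removing ALL sequences with ambiguous bases..." followed by a locus dropped for having too few taxa. One way to see how many sequences per locus would survive that kind of filter is a small check like the one below. This is only a sketch, not phyluce's code: it assumes that "ambiguous" means any character other than A, C, G, or T, and that the locus name follows a `|` in each FASTA header. The helper names (`parse_fasta`, `locus_of`, `surviving_per_locus`) and the example records are made up for illustration.

```python
from collections import defaultdict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def locus_of(header):
    # ASSUMPTION: headers look like "uce-1_taxonA_0 |uce-1"; adjust if yours differ.
    return header.split("|")[-1].strip()


def surviving_per_locus(text, allowed="ACGT"):
    """Map each locus to (sequences made only of `allowed` characters, total)."""
    allowed = set(allowed)
    counts = defaultdict(lambda: [0, 0])  # locus -> [surviving, total]
    for header, seq in parse_fasta(text):
        entry = counts[locus_of(header)]
        entry[1] += 1
        if set(seq.upper()) <= allowed:
            entry[0] += 1
    return {locus: tuple(pair) for locus, pair in counts.items()}


# Hypothetical records, not taken from the dataset in this thread.
example = """>uce-1_taxonA_0 |uce-1
ACGTNNACGT
>uce-1_taxonB_0 |uce-1
ACGTACGTAC
>uce-2_taxonA_0 |uce-2
ACGT-ACGTN
"""

for locus, (kept, total) in sorted(surviving_per_locus(example).items()):
    print(locus, kept, "of", total, "sequences have no ambiguous characters")
```

If most loci end up with fewer than three surviving sequences, that would be consistent with the "Too few taxa (N < 3)" warnings, though the exact rule phyluce applies should be confirmed in its source.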
I haven't tried to phase against alignments trimmed with trimal. I suspect this won't work very well.
Ah, OK. I suppose that brings me back to my first issue, then: not being able to edge trim the alignment before moving on to the phasing step.
Here's one of the Fasta files:
I'm not sure why your alignments are being trimmed to the degree that they are. You can run the edge trimming after alignment using phyluce_align_get_trimmed_alignments_from_untrimmed, and perhaps that will help you diagnose the issue.
That one dropped all the loci as well. Is there another setting you'd recommend I play with under the alignment settings? Or is there a way to get around that for the phasing step? I guess I'm not familiar with why the regular edge trim would work fine but trimal would not.
I tried removing all -, N, and gap characters from the multialign phased fasta file just before the final alignment, in case it would work better with raw, unaligned alleles, but the file comes out messy, with uneven lines throughout. If there is a way to straighten that file out, do you think it could be aligned with no-trim and subsequently trimmed with trimal?
Thanks!
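On straightening out a FASTA file after removing gap characters: the uneven lines are only a wrapping problem, so one option is to join each record's sequence, strip the unwanted characters, and rewrap at a fixed width. Here is a minimal sketch in plain Python. The function name `degap_fasta`, the default of dropping only `-`, and the 60-column width are my assumptions, not anything phyluce prescribes. Whether N's should also be stripped is a separate judgment call, since removing internal N's changes the sequence.

```python
def degap_fasta(text, drop="-", width=60):
    """Strip the characters in `drop` from every record and rewrap to `width`."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))

    table = str.maketrans("", "", drop)
    out = []
    for header, seq in records:
        seq = seq.translate(table)
        if not seq:
            continue  # skip records that contained nothing but dropped characters
        out.append(header)
        # Fixed-width slices give even lines, with a shorter final line.
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Hypothetical input with uneven lines and an all-gap record.
messy = ">a\nAC--GT\nNN-A\n>b\n----\n"
print(degap_fasta(messy, width=4), end="")
```

Records that become empty are skipped here on purpose, on the assumption that an empty record is not useful to an aligner.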
Hi, I have the same problem with the internal trimming step: NO RECORDS FOUND IN HANDLE. Has a solution been found since?
Hello @Ofsm and @heather340, after one year I expect you've got it working. But if anyone else has the same issue, I recommend Mr. Brown's tips in this tutorial: https://github.com/jasonleebrown/UCE_phyluce_pipeline. I followed the same procedures and they're working for me.
Hello, I'm having an issue with aligning and trimming my loci for 16 taxa. When just aligning (no-trim), there are no issues besides having too few taxa for some of the loci (N<3). However, when I turn trimming on, then all loci are dropped. I've tried adjusting the window, threshold, and proportion with no luck. I'm not sure if it's due to the large amount of missing data on both ends of the alignment; does using the sliding window remove everything towards the 3' (or 5') end after it encounters a window that does not meet the requirements? We would like to build a SNP dataset, so need the edge trimming before proceeding to the next step.
I have attached the log file, two example fasta files for an aligned, no-trim locus, and a screenshot of the summary stats for the --no-trim alignment.
Thanks in advance for your advice!
phyluce_align_seqcap_align.log
uce-2185.txt
uce-2054.txt
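On the sliding-window question in the post above, it may help to see one plausible reading of how --window, --proportion, and --threshold could interact. The sketch below is not phyluce's implementation. It assumes a column is "good" when at least `proportion` of the taxa have a base there, a window passes when at least `threshold` of its columns are good, and the kept region runs from the first passing window to the last passing window. Under that reading, a failing window in the middle does not discard everything downstream of it, but an alignment with no passing window at all is dropped entirely. The function names and the toy alignment are made up for illustration.

```python
def column_ok(alignment, col, proportion):
    """True if at least `proportion` of the rows have a base (not a gap) in `col`."""
    filled = sum(1 for row in alignment if row[col] not in "-?")
    return filled / len(alignment) >= proportion


def edge_trim(alignment, window=20, proportion=0.65, threshold=0.65):
    """Return (start, end) of the columns to keep, or None if no window passes."""
    length = len(alignment[0])
    good = [column_ok(alignment, c, proportion) for c in range(length)]

    def window_ok(start):
        cols = good[start:start + window]
        return sum(cols) / len(cols) >= threshold

    # Every start position whose full-size window meets the threshold.
    starts = [s for s in range(0, length - window + 1) if window_ok(s)]
    if not starts:
        return None  # nothing passes: the whole locus would be dropped
    return starts[0], starts[-1] + window


# Toy alignment with ragged ends (hypothetical data).
toy = [
    "--ACGTACGT--",
    "----GTACGTAC",
    "ACACGTAC----",
]
span = edge_trim(toy, window=4)
print(span)
if span is not None:
    start, end = span
    for row in toy:
        print(row[start:end])
```

With a window of 20 and long stretches of missing data at both ends, a check like this can show whether any window passes at all for a given locus. Whether phyluce scans in exactly this way should be confirmed against its source.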