faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

How to get joined_allele_sequences_all_samples.fasta #345

Open emmajochim opened 1 month ago

emmajochim commented 1 month ago

Hi Brant!

I am wondering if there is a way to generate a file like the joined_allele_sequences_all_samples.fasta file using the new phasing workflow in 1.7.3. I would like to get alignments of phased loci across individuals. I tried using cat to put all of the .0.fasta and .1.fasta files into one file and then running seqcap_align, but the sequences don't have the UCE identifiers.

Thanks in advance, Em

emmajochim commented 1 month ago

Additionally, when I tried to run match_contigs_to_probes on the .0.fasta and .1.fasta files, I got this error message:

Traceback (most recent call last):
  File "/share/apps/conda/environments/phyluce-1.7.3/bin/phyluce_assembly_match_contigs_to_probes", line 421, in <module>
    main()
  File "/share/apps/conda/environments/phyluce-1.7.3/bin/phyluce_assembly_match_contigs_to_probes", line 354, in main
    contig_name = get_contig_name(lz.name1)
  File "/share/apps/conda/environments/phyluce-1.7.3/bin/phyluce_assembly_match_contigs_to_probes", line 279, in get_contig_name
    return match.groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'

brantfaircloth commented 1 month ago

Howdy,

If the contigs that you are mapping reads to in the mapping workflow are the UCE contigs, then they should have the UCE contig identifier in them. So, you might find the UCE loci with phyluce_assembly_match_contigs_to_probes, then explode those loci by taxon with phyluce_assembly_explode_get_fastas_file. The input file for the mapping workflow then becomes the taxon-specific file of UCE contigs (rather than just the regular old contigs).
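To make "exploding by taxon" concrete, here is a minimal Python sketch of the idea. It is not the phyluce implementation; in practice phyluce_assembly_explode_get_fastas_file does this step. The sketch assumes that headers in the combined FASTA look like ">uce-123_genus_species |uce-123". That header layout and the function name group_by_taxon are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Assumed header layout (not confirmed in this thread):
#   >uce-123_genus_species |uce-123
HEADER = re.compile(r"^>(?P<locus>uce-\d+)_(?P<taxon>\S+)")


def group_by_taxon(fasta_text):
    """Return {taxon: [(locus, sequence), ...]} parsed from FASTA text."""
    groups = defaultdict(list)
    current = None  # [taxon, locus, sequence_parts] for the record being read
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                groups[current[0]].append((current[1], "".join(current[2])))
            match = HEADER.match(line)
            if match is None:
                raise ValueError(f"header lacks a UCE locus name: {line}")
            current = [match["taxon"], match["locus"], []]
        elif current is not None:
            current[2].append(line)
    # Store the final record.
    if current is not None:
        groups[current[0]].append((current[1], "".join(current[2])))
    return dict(groups)


if __name__ == "__main__":
    demo = (
        ">uce-1_taxon_a |uce-1\nACGT\nAC\n"
        ">uce-2_taxon_a |uce-2\nGG\n"
        ">uce-1_taxon_b |uce-1\nTTTT\n"
    )
    for taxon, records in group_by_taxon(demo).items():
        print(taxon, [locus for locus, _ in records])
```

Each taxon ends up with its own set of contigs, and every contig keeps its UCE locus name. That is the property that lets sequences phased against those contigs be traced back to a locus later.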

brantfaircloth commented 1 month ago

And, yeah, phyluce_assembly_match_contigs_to_probes may not like what comes out of the workflow. Technically, you should not need to run this step again, because you've already identified the UCE contigs (see above).
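For context on the traceback above: that AttributeError is what Python raises when a regular-expression search finds no match and the code then calls .groups() on the resulting None. The sketch below reproduces the pattern in isolation. The regular expression and the example headers are hypothetical stand-ins, not the ones phyluce uses.

```python
import re

# Hypothetical pattern standing in for the header formats a script accepts.
CONTIG_PATTERN = re.compile(r"^(NODE_\d+)")


def get_contig_name_unchecked(header):
    """Mirror the failing call: no guard against a missing match."""
    match = CONTIG_PATTERN.search(header)
    return match.groups()[0]  # AttributeError when match is None


def get_contig_name_checked(header):
    """Same lookup, but with an explicit error for unrecognized headers."""
    match = CONTIG_PATTERN.search(header)
    if match is None:
        raise ValueError(f"unrecognized contig header: {header!r}")
    return match.groups()[0]


if __name__ == "__main__":
    print(get_contig_name_checked("NODE_12_length_500_cov_10.0"))
    try:
        get_contig_name_unchecked("some_other_header")
    except AttributeError as error:
        print("unchecked lookup failed:", error)
```

So the error most likely means the sequence names in the .0.fasta and .1.fasta files do not follow a naming scheme the script recognizes. It does not point to a problem with the sequences themselves.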

brantfaircloth commented 1 month ago

Just as an aside, I'm not really sure that the phasing workflow offers a whole lot more in the way of informative data than the "regular" workflow.