faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

phyluce_assembly_match_contigs_to_probes - high volume of UCE loci removed for matching multiple contigs #210

Closed idanaughton closed 3 years ago

idanaughton commented 3 years ago

Hello,

For a couple of my samples, a large number (~1000) of UCE loci are removed for matching multiple contigs during the phyluce_assembly_match_contigs_to_probes step, leaving a low percentage of unique contigs. I'm unsure what the cause might be. Is there any way to troubleshoot this? I am using velvet to assemble the contigs.

Thank you so much!

brantfaircloth commented 3 years ago

Depends on the organism and bait set, but it’s possible that these samples are (cross-)contaminated. It’s also possible that the issue is not that bad. First, I would switch to using spades to assemble. Second, I would probably run match_contigs_to_barcodes against the assembled contigs with an mtDNA barcode locus: you should only hit one organism/contig matching that locus.
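
As an illustrative aside, the kind of filtering discussed in this thread can be sketched in a few lines of Python. This is not phyluce's code; it assumes a hypothetical list of (contig, locus) match pairs and simply shows how loci hit by more than one contig, and contigs hitting more than one locus, can be tallied and set aside. The function name `summarize_matches` and the example names are made up for illustration.

```python
from collections import defaultdict


def summarize_matches(matches):
    """Split contig/locus matches into unique and ambiguous groups.

    `matches` is a hypothetical iterable of (contig_name, locus_name) pairs,
    e.g. parsed from an alignment of contigs against probes.
    """
    contigs_per_locus = defaultdict(set)
    loci_per_contig = defaultdict(set)
    for contig, locus in matches:
        contigs_per_locus[locus].add(contig)
        loci_per_contig[contig].add(locus)

    # Loci matched by more than one contig (possible duplication or contamination).
    multi_contig_loci = {
        locus for locus, contigs in contigs_per_locus.items() if len(contigs) > 1
    }
    # Contigs that match more than one locus.
    multi_locus_contigs = {
        contig for contig, loci in loci_per_contig.items() if len(loci) > 1
    }
    # Loci with exactly one contig, where that contig matches no other locus.
    unique_loci = {
        locus
        for locus, contigs in contigs_per_locus.items()
        if len(contigs) == 1 and next(iter(contigs)) not in multi_locus_contigs
    }
    return unique_loci, multi_contig_loci, multi_locus_contigs


# Made-up example data, for illustration only.
example = [
    ("contig1", "uce-1"),
    ("contig2", "uce-2"),
    ("contig3", "uce-2"),
    ("contig4", "uce-3"),
    ("contig4", "uce-4"),
]
unique_loci, multi_contig_loci, multi_locus_contigs = summarize_matches(example)
```

A sample where `multi_contig_loci` is large relative to `unique_loci` would show the pattern described in the original question.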

idanaughton commented 3 years ago

Great, thank you so much! And good news - reassembling these samples using spades fixed the problem. The spades assembly actually resulted in more unique contigs for all of the samples that I've reassembled so far (even for the samples that didn't have quite as alarming a number of UCE loci removed for matching multiple contigs). I'm working with ant UCE data, using the hymenoptera-v2-ANT-SPECIFIC probes.

Thanks again!