I'm working with a set of song sparrow genomes, ultimately attempting to use UCE loci to create a tree using IQ-TREE. I made my way through your Tutorial III: Harvesting UCE Loci From Genomes, and understand that further processing with the methods in Tutorial I: UCE Phylogenomics should occur before I feed the resulting .fasta files into IQ-TREE.
However, I'm having a bit of trouble conceptually with the section below:
_Using the extracted sequences in downstream analyses
The easiest way for you to use the extracted sequences is to symlink them into an appropriate contigs folder that resulted from a PHYLUCE assembly process and then proceed with the Extracting UCE loci procedure.
For more information on the structure of this folder, look at the Assemble the data section of Tutorial I: UCE Phylogenomics for more information._
In the linked Assemble the Data section of Tutorial I, it looks like the symlinks in the contigs directory point to the "contigs.fasta" files produced by the assembly process. Are we supposed to run an assembly on the UCE loci we extracted from the whole-genome files and symlink the resulting contigs.fasta files, or just plunk the extracted .fasta files directly into the contigs directory? I copy/pasted a relevant portion of the example directory structure you posted in the tutorial below to illustrate.
Thanks in advance for your time.
├── raw-fastq
└── spades-assemblies
    ├── alligator_mississippiensis_trinity
    │   ├── contigs.fasta
    │   ├── scaffolds.fasta
    │   └── spades.log
    ├── contigs
    │   ├── alligator_mississippiensis.contigs.fasta -> ../alligator_mississippiensis_trinity/contigs.fasta
    │   ├── anolis_carolinensis.contigs.fasta -> ../anolis_carolinensis_trinity/contigs.fasta
    │   ├── gallus_gallus.contigs.fasta -> ../gallus_gallus_trinity/contigs.fasta
    │   └── mus_musculus.contigs.fasta -> ../mus_musculus_trinity/contigs.fasta
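
To make the "plunk them directly" option concrete, here is a rough sketch of what I imagine it would look like: symlinking each extracted .fasta file into the contigs directory under a `<taxon>.contigs.fasta` name, with no assembly step in between. This is only my guess at the intended workflow, not something taken from the tutorial. The function name `link_into_contigs` and the idea that the extracted files sit together in one directory as `<taxon>.fasta` are my own assumptions.

```python
import os
from pathlib import Path


def link_into_contigs(extracted_dir, contigs_dir):
    """Symlink every *.fasta in extracted_dir into contigs_dir.

    Hypothetical helper: each link is named <stem>.contigs.fasta to mimic
    the naming shown in the tutorial's example tree. Returns the new links.
    """
    extracted = Path(extracted_dir).resolve()
    contigs = Path(contigs_dir).resolve()
    contigs.mkdir(parents=True, exist_ok=True)

    created = []
    for fasta in sorted(extracted.glob("*.fasta")):
        link = contigs / f"{fasta.stem}.contigs.fasta"
        if link.is_symlink() or link.exists():
            # Leave anything already in the contigs directory untouched.
            continue
        # Use a relative target, like the "../<taxon>_trinity/contigs.fasta"
        # links in the tutorial's example tree.
        target = os.path.relpath(fasta, start=contigs)
        link.symlink_to(target)
        created.append(link)
    return created
```

I would call it as something like `link_into_contigs("genome-fasta", "spades-assemblies/contigs")`, where both paths are placeholders for wherever the extracted sequences and the contigs directory actually live.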