Open ctparson opened 3 years ago
I have a similar problem. Using PacBio HiFi reads, I filtered the reads for the genome I'm working with and assembled them with Flye. After that, I mapped the reads back to the assembled genome, following the suggested pipeline, and ran Strainberry iteratively with up to 5 strains. I have >400 scaffolds in the assembly.scaffolds.fa file. Is there a way to tell which strain each scaffold belongs to just from the scaffold's name? Or to know which contigs were generated from which phased set of reads?
I have a similar problem, too. Using nanopore reads, I combined the reads of three known species and assembled them with Flye. After that, I mapped the reads back to the assembled genome, following the suggested pipeline, and ran Strainberry iteratively with the preset values. I have >500 scaffolds in the assembly.scaffolds.fa file. Is there a way to tell which strain each scaffold belongs to just from the scaffold's name? Or to know which contigs were generated from which phased set of reads? Or where the SNPs of the three strains are located in the gene (e.g., in VCF format)?
I know my input metagenome contained only two strains of a given species, and it had been heavily filtered so that only reads from that species went into the metagenome assembly. However, when I run the analysis with Strainberry, the resulting assembly.scaffolds.fa has just over 1000 scaffolds. Are there any thoughts or suggestions on how to correct this?
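For anyone comparing runs, here is a minimal sketch for summarizing how fragmented an assembly.scaffolds.fa file is (scaffold count, total length, N50). It does not assign scaffolds to strains and assumes nothing about Strainberry's scaffold naming. The helper names `read_fasta_lengths` and `n50` are hypothetical, written just for this illustration, and the file is assumed to be plain, uncompressed FASTA.

```python
import sys
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each sequence name (first word of the header) to its length."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence lines may be wrapped, so accumulate across lines.
            lengths[name] += len(line)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python scaffold_stats.py assembly.scaffolds.fa
    with open(sys.argv[1]) as handle:
        scaffold_lengths = read_fasta_lengths(handle)
    values = list(scaffold_lengths.values())
    print(f"scaffolds: {len(values)}")
    print(f"total length: {sum(values)}")
    print(f"N50: {n50(values)}")
```

Running the same summary on the Flye assembly and on the Strainberry output gives a quick before/after view of how much the separation step fragmented the scaffolds.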