Closed by nick-youngblut 1 year ago
I now see where reverse reads are mentioned in the Maast + GT-Pro protocol:
In practice, users would genotype forward and reverse reads, if both are available. This can be done by supplying both forward and reverse reads as either two individual input files or a single concatenated file. The command does not need to be modified otherwise.
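For anyone who wants the "single concatenated file" route, here is a minimal Python sketch. The helper name `concatenate_fastq` and the output file name are my own placeholders, not part of GT-Pro or Maast. It copies bytes as-is, so it assumes all inputs are in the same state (all plain text or all gzip) and that each file ends with a newline.

```python
import shutil
from pathlib import Path


def concatenate_fastq(inputs, output):
    """Append each input file to `output` byte for byte, in the order given.

    Hypothetical helper for illustration; nothing here is GT-Pro specific.
    Records are not parsed or validated, so a file missing its trailing
    newline would run into the first record of the next file.
    """
    with open(output, "wb") as out_handle:
        for path in inputs:
            with open(path, "rb") as in_handle:
                # Stream in chunks so large FASTQ files are never fully in memory.
                shutil.copyfileobj(in_handle, out_handle)
    return Path(output)
```

Usage would look something like `concatenate_fastq(["read1.fastq", "read2.fastq"], "reads_both.fastq")`, with the resulting file passed to the genotyping command in place of the two separate files.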
Yeah, just to confirm the details, since GT-Pro uses direct kmer matching, the pairing of reads doesn't contain any additional information beyond the deeper sequencing it provides. There is a bit of subtlety with statistical non-independence of the pair, and GT-Pro ignores this. But that effect is probably relatively small and I think it's pretty reasonable to treat it as "just more sequence" to count kmers in.
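To make the "just more sequence" point concrete, here is a toy Python sketch. It is not GT-Pro's implementation: the reads are invented, `k` is tiny, and it ignores reverse complements and the SNP k-mer database. It only shows that plain k-mer counts are additive, so counting forward and reverse reads separately and summing gives the same result as counting the pooled reads.

```python
from collections import Counter
from typing import Iterable


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every length-k substring across a collection of reads."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy reads, invented for illustration only.
forward_reads = ["ACGTACGT", "TTGACCA"]
reverse_reads = ["ACGTTGCA", "GGTCAAT"]
k = 4

# Count each mate file on its own and add the tallies...
separate = count_kmers(forward_reads, k) + count_kmers(reverse_reads, k)
# ...or pool the reads first, as a concatenated input file would.
pooled = count_kmers(forward_reads + reverse_reads, k)

print(separate == pooled)  # True: pairing adds no information to the counts
```

The non-independence mentioned above does not show up in the counts themselves. It only matters if you treat each k-mer hit as an independent observation, since both mates come from the same DNA fragment.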
I do not see anything in the docs (the `GT_Pro optimize` help docs, the repo README, or Shi et al., 2023) on handling paired-end reads.
How should one handle paired-end reads with `GT_Pro optimize` (`GT_Pro optimize read1.fastq read2.fastq`)?
@bsmith89 did you use paired-end reads for genotyping with GT-Pro in order to generate the input for StrainFacts? I don't see anything in the StrainFacts README about handling paired-end reads. I also can't find info in Smith et al., 2022 about paired-end reads (e.g., whether paired-end reads were used in the study).