Open rmzelle opened 6 years ago
Is this a common feature? I mean, do you think many other people will use it if I implement it?
With the exception of NGSUtils, I couldn't find any other tools or scripts to split reads within a FASTQ file into smaller reads (with or without overlap). So it's probably not very commonly needed, but this might change as long-read sequencing becomes more popular.
In my case, I'm trying to accurately determine gene copy number in a genome via relative Nanopore read coverage, but my target gene has multiple repeats on a scaffold that is about the same size as the median read length of my Nanopore reads. I expect to get more accurate results if I can chop my reads up into shorter fragments before aligning them to my reference genome, but so far I haven't found an existing tool to do that.
@rmzelle This request doesn't sound like something that should be a feature of a trimming tool. It seems to be a specialized request that may require programming something from the ground up.
Sure. Feel free to close this ticket if this is considered out of scope and/or too esoteric.
I'm looking for a tool to split reads in a fastq.gz file into shorter fragments, e.g. to split up Oxford Nanopore reads into non-overlapping 500 bp chunks. I've tried http://ngsutils.org/modules/fastqutils/tile/ but it doesn't seem to work well with large read files (I get a "Too many open files" error as it writes too many temp files), and it doesn't look like it can output to STDOUT.
Is this a feature you'd be willing to add to fastp?
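To make the request concrete, here is a minimal Python sketch of the splitting described above. It is not part of fastp or NGSUtils; the function names (`read_fastq`, `split_read`, `split_fastq`), the `_0`, `_1`, ... suffix on fragment names, and the file name in the usage comment are all assumptions made for illustration. It assumes plain four-line FASTQ records, streams one read at a time, and writes to STDOUT without temp files.

```python
import gzip
import sys


def read_fastq(lines):
    """Yield (name, sequence, quality) from four-line FASTQ records.

    Assumes no line-wrapped sequences; a truncated final record is dropped.
    """
    it = iter(lines)
    for header, seq, _plus, qual in zip(it, it, it, it):
        # Keep only the read ID (text after '@' up to the first whitespace).
        yield header[1:].split()[0], seq.rstrip("\n"), qual.rstrip("\n")


def split_read(name, seq, qual, size=500, overlap=0):
    """Yield (name, seq, qual) fragments of at most `size` bases.

    Consecutive fragments share `overlap` bases; the last one may be shorter.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    start = 0
    index = 0
    while start < len(seq):
        end = start + size
        yield f"{name}_{index}", seq[start:end], qual[start:end]
        if end >= len(seq):
            break  # this fragment already reached the end of the read
        start += step
        index += 1


def split_fastq(path, out=sys.stdout, size=500, overlap=0):
    """Stream a gzipped FASTQ file and write the fragments to `out`."""
    with gzip.open(path, "rt") as handle:
        for name, seq, qual in read_fastq(handle):
            for frag_name, frag_seq, frag_qual in split_read(
                name, seq, qual, size, overlap
            ):
                out.write(f"@{frag_name}\n{frag_seq}\n+\n{frag_qual}\n")


# Hypothetical usage (file name is a placeholder):
# split_fastq("reads.fastq.gz", size=500)
```

Because only one read is held in memory at a time and no intermediate files are written, this approach should avoid the "Too many open files" problem, and the output can be piped straight into an aligner. The short trailing fragment of each read is kept here; whether to discard fragments below some minimum length is a choice left open.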