Closed xinwang-bio closed 7 years ago
By default, proovread assumes a more or less even short-read coverage across the pacbio reads, e.g. 50X or 100X, which you specify via the --coverage parameter. Because the coverage is assumed to be even and known, proovread subsamples the read sets during the different iterations for better speed, e.g. it runs three iterations with 33X each...
This behaviour does not make sense for iso-seq data, because the illumina coverage of different reads can differ a lot. --no-sampling tells proovread to make no assumptions about the coverage and not to subsample during iterations. That way you also get the best correction performance for low-coverage transcripts. Hope that helps.
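To make the effect of subsampling concrete, here is a small toy calculation. It is only a sketch of the arithmetic, not proovread's actual sampling code: the function name `effective_coverage`, the transcript names and the coverage values are all made up for illustration, and the only numbers taken from the explanation above are an assumed coverage of 100X sampled down to 33X per iteration.

```python
# Toy model (an assumption for illustration, not proovread internals):
# if the short reads are subsampled as though coverage were uniformly
# `assumed_cov`, every transcript keeps the same fraction of its reads.

def effective_coverage(true_cov, assumed_cov, per_iteration_cov):
    """Return the per-transcript coverage left after uniform subsampling."""
    # Fraction of short reads kept in one iteration, capped at 1 (keep all).
    fraction = min(1.0, per_iteration_cov / assumed_cov)
    return {name: cov * fraction for name, cov in true_cov.items()}


# Hypothetical short-read coverage per transcript in an iso-seq data set.
transcripts = {"high_expr": 900.0, "medium_expr": 90.0, "low_expr": 6.0}

# With sampling: assume 100X overall and use 33X per iteration.
sampled = effective_coverage(transcripts, assumed_cov=100.0, per_iteration_cov=33.0)

# Without sampling: every iteration sees all short reads.
unsampled = dict(transcripts)

for name in transcripts:
    print(f"{name}: sampled {sampled[name]:.1f}X, no sampling {unsampled[name]:.1f}X")
```

In this toy setting the highly expressed transcript still has plenty of coverage after sampling, but the weakly expressed one drops to roughly a third of an already thin coverage, which is the situation --no-sampling avoids.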
Thank you very much. Have a good weekend.
I used proovread to correct my iso-seq data. I found that many researchers ask how to use proovread on iso-seq data, and you recommended the --no-sampling setting. However, I am not very clear about what this setting means. What is the difference between using it and not using it?
Thank you