caravagnalab / rRACES

R wrapper for the RACES package
GNU General Public License v3.0

Simulated read error model #75

Closed by albertocasagrande 8 months ago

albertocasagrande commented 9 months ago

Provide an error model for some sequencing technologies (e.g., NovaSeq).

giorgiagandolfi commented 9 months ago

According to Stoler et al., "Sequencing error profiles of Illumina sequencing instruments", the NovaSeq 6000 error rate is about 0.109%, calculated over a total of 239 samples. I also checked the error rate on 27 in-house WGS samples (normal and tumor). The error rate was calculated with the samtools stats command, which reports various information about the quality of the reads and of the alignment. Taking only the error rate measure (calculated as # mismatches / bases mapped (cigar)), I computed the median of the error rates across all samples and got a value of 5.07e-03, i.e. about 0.507%.
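A minimal sketch of how the in-house figure could be reproduced, assuming one `samtools stats` report per sample is available as text. The helper names `parse_error_rate` and `median_error_rate` are hypothetical, and the example reports are made up for illustration; they are not the 27 samples mentioned above.

```python
import re
import statistics

# `samtools stats` writes its summary numbers as tab-separated "SN" lines, e.g.
# "SN\terror rate:\t5.07e-03\t# mismatches / bases mapped (cigar)"
ERROR_RATE_LINE = re.compile(r"^SN\terror rate:\t(\S+)")


def parse_error_rate(stats_text):
    """Return the error rate found in one `samtools stats` report, or None."""
    for line in stats_text.splitlines():
        match = ERROR_RATE_LINE.match(line)
        if match:
            return float(match.group(1))
    return None


def median_error_rate(reports):
    """Median of the error rates over all reports that contain one."""
    rates = [parse_error_rate(text) for text in reports]
    rates = [rate for rate in rates if rate is not None]
    return statistics.median(rates)


def percent_to_fraction(percent):
    """Convert a percentage (e.g. 0.109) into a fraction (about 1.09e-03)."""
    return percent / 100.0


# Made-up reports, for illustration only.
example_reports = [
    "SN\tbases mapped (cigar):\t1000000\n"
    "SN\terror rate:\t4.00e-03\t# mismatches / bases mapped (cigar)\n",
    "SN\terror rate:\t5.00e-03\t# mismatches / bases mapped (cigar)\n",
    "SN\terror rate:\t6.00e-03\t# mismatches / bases mapped (cigar)\n",
]

if __name__ == "__main__":
    print(median_error_rate(example_reports))
    print(percent_to_fraction(0.109))
```

Converting the published percentage to a fraction puts both figures on the same scale, which makes it easier to compare them before picking one.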

caravagn commented 9 months ago

Take the value that you think is more reasonable and set it as the default, @albertocasagrande, then let the user change it if they wish.
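A rough sketch of the "default that the user can override" idea, written in Python for illustration only. It is not the rRACES or RACES API: `DEFAULT_ERROR_RATE` and `add_read_errors` are hypothetical names, the default shown is an arbitrary choice between the two figures discussed in this thread, and a uniform per-base substitution model is an assumption.

```python
import random

# Assumption: the published NovaSeq 6000 figure (0.109%) expressed as a
# fraction. The in-house median (5.07e-03) would be an equally valid pick.
DEFAULT_ERROR_RATE = 1.09e-3

BASES = "ACGT"


def add_read_errors(read, error_rate=DEFAULT_ERROR_RATE, rng=None):
    """Return a copy of `read` with random substitution errors.

    Each A/C/G/T base is replaced by a different base with probability
    `error_rate`; any other character (e.g. N) is left untouched.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    noisy = []
    for base in read:
        if base in BASES and rng.random() < error_rate:
            # Pick one of the three other bases uniformly at random.
            noisy.append(rng.choice([b for b in BASES if b != base]))
        else:
            noisy.append(base)
    return "".join(noisy)


if __name__ == "__main__":
    read = "ACGTACGTACGT"
    # Uses the default rate.
    print(add_read_errors(read, rng=random.Random(0)))
    # The user overrides the default with their own estimate.
    print(add_read_errors(read, error_rate=5.07e-3, rng=random.Random(0)))
```

Keeping the rate as a keyword argument with a default means existing calls keep working, while anyone with their own estimate can pass it explicitly.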

albertocasagrande commented 8 months ago

Closed by commit 58bc414ebafbb8e2869acb59a276d4a685a1edef.