Closed XubCherif closed 2 years ago
Hi XubCherif, During our project we worked with the 1000G phase 1 reference genome for build 37. However, in principle all builds should have the same size, so all of them should work. We have noticed that some users ran into errors in the GC correction step because of differences in the number of 50,000 bp bins; I'm not sure which reference they used. I would use a reference without any ALT sequences. Your other suggestions are good: you will already remove some noise in these preprocessing steps, and the chi-squared variation reduction algorithm can then correct for any remaining variation. Note that with the expected input, ultra-low coverage WGS, you would not expect many duplicate reads.
More information can be found in the NIPTeR papers, especially in Additional file 1 of the application note: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2557-8 Algorithm information: https://www.nature.com/articles/s41598-017-02031-5
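For anyone hitting the bin-count error mentioned above, here is a minimal Python sketch (not part of NIPTeR) for checking whether two references would yield the same number of 50,000 bp bins per chromosome. The function names and the toy chromosome lengths are hypothetical, and counting a trailing partial bin as a full bin is an assumption; NIPTeR's own binning may differ.

```python
BIN_SIZE = 50_000  # bin width mentioned in the thread


def bins_per_chromosome(lengths, bin_size=BIN_SIZE):
    """Map each chromosome name to its number of fixed-width bins."""
    # Assumption: a trailing partial bin counts as one bin (ceiling division).
    return {name: -(-length // bin_size) for name, length in lengths.items()}


def bin_count_mismatches(ref_a, ref_b, bin_size=BIN_SIZE):
    """Return {chromosome: (bins_in_a, bins_in_b)} wherever the counts differ.

    A chromosome missing from one reference is reported with None on that side.
    """
    bins_a = bins_per_chromosome(ref_a, bin_size)
    bins_b = bins_per_chromosome(ref_b, bin_size)
    names = sorted(set(bins_a) | set(bins_b))
    return {
        name: (bins_a.get(name), bins_b.get(name))
        for name in names
        if bins_a.get(name) != bins_b.get(name)
    }


# Toy lengths for illustration only, not real chromosome sizes.
toy_ref_a = {"chr1": 120_000, "chr2": 100_000}
toy_ref_b = {"chr1": 150_001, "chr2": 100_000}
mismatches = bin_count_mismatches(toy_ref_a, toy_ref_b)
print(mismatches)
```

In practice the lengths would come from the reference's `.fai` index or from the `@SQ` lines of the BAM header; an empty result suggests the two references are compatible at the bin level.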
Hi @ljohansson, Thanks, noted. PS: very nice papers and package.
Hi, I didn't find any recommendation on how the BAM file should be preprocessed:
I'm asking because these steps can influence the results and are mandatory with other tools/packages (is that the case here?). Any link to NIPTeR best practices would be appreciated.
Many Thanks.