Open pkozyulina opened 6 years ago
Dear Polina,
I think there is a discrepancy between the reference genome you are using and the gc_percentages matrix provided by NIPTeR. Most likely the number of columns in the binned data matrix of the sample does not equal the number of columns in the gc_percentages matrix provided in the sysdata.rda file. This matrix holds the GC percentages of all 50,000 bp bins on the genome. Unfortunately, there is currently no easy option to update the gc_percentages_hg37 or 38 matrices to fit the reference genome you used.
One option would be to recreate the relevant matrix to fit your genome version, with the chromosomes in the rows and the 50,000 bp bins in the columns, each cell showing the GC percentage of that bin. Bins in which no GC percentage can be calculated should be set to -1. The total number of columns in that matrix should then match the number of columns of your sample file. The matrix should then replace the old gc_percentages_hg37 or 38 matrix in sysdata.rda.
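As a rough illustration of the matrix layout described above, here is a minimal Python sketch. It is not part of NIPTeR: the function names `gc_percentage` and `gc_matrix` are hypothetical, the sequences are assumed to be already loaded as plain strings, and the choices to report values on a 0 to 100 scale and to ignore N bases in the denominator are assumptions that should be checked against the existing gc_percentages matrices. Writing the result back into sysdata.rda would still have to be done in R.

```python
from typing import Dict, List

BIN_SIZE = 50_000  # bin width in bp, as described in the reply above


def gc_percentage(seq: str) -> float:
    """Return the GC percentage of seq, or -1 if it has no A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        # Empty bin or only ambiguous bases: no GC percentage can be calculated.
        return -1.0
    return 100.0 * gc / total


def gc_matrix(chromosomes: Dict[str, str], n_bins: int,
              bin_size: int = BIN_SIZE) -> List[List[float]]:
    """Build one row per chromosome and one column per bin.

    n_bins should equal the number of columns of the binned sample matrix;
    bins past the end of a chromosome are filled with -1.
    """
    rows = []
    for seq in chromosomes.values():  # row order follows insertion order
        row = [gc_percentage(seq[i * bin_size:(i + 1) * bin_size])
               for i in range(n_bins)]
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Toy sequences with a tiny bin size, purely for demonstration.
    toy = {"chr1": "GGCCATATNNNNGA", "chr2": "ACGT"}
    for row in gc_matrix(toy, n_bins=4, bin_size=4):
        print(row)
```

Padding short chromosomes with -1 keeps every row the same length, so the column count can be compared directly with that of the sample matrix.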
Note that only the GC correction is affected by these matrices. If you skip this step, the workflow should be able to finish for all reference genomes, though I realize that including GC correction gives the best results.
Regards, Lennart
Hi! I was wondering if anybody could give me some insight into what might be causing the following error:
The file I am loading seems to be perfectly ordinary, without any obvious quality control problems. The rest of the samples from the same sequencing run were successfully analyzed without any errors. What could have gone wrong with this one and is there a way to fix the issue?
Thank you! Regards, Polina