cancerit / ascatNgs

Somatic copy number analysis using paired-end whole-genome sequencing (WGS)
http://cancerit.github.io/ascatNgs/
GNU Affero General Public License v3.0

GC content file #27

Closed: alkodsi closed this issue 8 years ago

alkodsi commented 8 years ago

I am trying to create a custom set of probe positions to use with my exome sequences, instead of the provided SNP6 probes. For the GC_content file, I am using bedtools nuc to compute GC content in bins of 200 bp, 400 bp, 1 Mb, 10 Mb, etc. around each probe position. For example, the 10 Mb bin spans 5 Mb on each side of the probe; if the probe is less than 5 Mb from the beginning or end of the chromosome, I take 5 Mb on one side and all of the remaining distance to the beginning/end on the other. Is that right? Also, what does the column named "Probe" mean in the file?
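The windowing described in this question can be sketched in Python as follows. This is only an illustration of the scheme as stated (half the bin on each side, truncated at the chromosome ends); it does not confirm how the ASCAT reference GC files were actually generated. The function name `clamped_window`, the chromosome name and length, and the choice of BED-style 0-based half-open coordinates are assumptions made for the example.

```python
# Bin sizes mentioned in the question (in base pairs).
BIN_SIZES = [200, 400, 1_000_000, 10_000_000]


def clamped_window(pos, chrom_len, size):
    """Return a 0-based, half-open (start, end) window around a 1-based position.

    The window extends size // 2 bases to each side of the probe and is
    truncated at the start or end of the chromosome, so windows near a
    chromosome end are shorter than `size`.
    """
    half = size // 2
    centre = pos - 1  # convert the 1-based position to a 0-based index
    start = max(0, centre - half)
    end = min(chrom_len, centre + half)
    return start, end


def bed_lines(chrom, pos, chrom_len, sizes=BIN_SIZES):
    """Build one tab-separated BED line per bin size for a single probe."""
    lines = []
    for size in sizes:
        start, end = clamped_window(pos, chrom_len, size)
        lines.append(f"{chrom}\t{start}\t{end}\t{chrom}:{pos}:{size}")
    return lines


if __name__ == "__main__":
    # Hypothetical probe 1 Mb into a hypothetical 50 Mb chromosome.
    for line in bed_lines("chrX", 1_000_000, 50_000_000):
        print(line)
```

A file made of such lines could then be passed to bedtools nuc together with the reference FASTA to obtain per-window GC content.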

keiranmraine commented 8 years ago

Hi,

I have a response from the developers of the underlying R code:

The ‘probe’ column is a leftover from the Affymetrix ‘era’ for those files: it was the GC content of the actual array probe (25 base pairs, with the actual SNP right in the middle). It probably would be clearer if we called this ‘25bp’.
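As a rough illustration of that description, the GC content of a 25 bp window with the SNP in the middle could be computed like this in Python. The helper names `probe_window` and `gc_fraction`, the decision to ignore non-ACGT bases, and reporting the value as a fraction between 0 and 1 are all assumptions for the sketch, not details confirmed by the developers.

```python
def probe_window(seq, snp_index, probe_len=25):
    """Return the probe-sized slice of `seq` centred on a 0-based SNP index.

    With probe_len=25 this takes 12 bases on each side of the SNP.
    The slice is shorter if the SNP lies close to either end of `seq`.
    """
    flank = probe_len // 2
    start = max(0, snp_index - flank)
    return seq[start:snp_index + flank + 1]


def gc_fraction(seq):
    """Fraction of G/C among the A/C/G/T bases; NaN if there are none."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return float("nan")
    return sum(base in "GC" for base in called) / len(called)


if __name__ == "__main__":
    # Toy sequence: 40 bases of AT repeats followed by 40 bases of GC repeats.
    toy = "AT" * 20 + "GC" * 20
    print(gc_fraction(probe_window(toy, 40)))
```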