can you tell me how to normalize from read counts?

marcelTBI / GenomeScreen

Scripts and data needed to run GenomeScreen

Hi, I am not entirely sure, but I think we used this one https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz

I think you are right. To be sure, you need to do this:

1. Bin the reads (you already have this).
2. GC-correct the binned reads (no script is provided in the CNV_data/GenomeScreen repositories, as it is a fairly standard procedure).
3. Train the PCA normalization, either on your samples or on the samples in CNV_data (python create_pca.py).
4. Apply the PCA normalization to your samples (python add_pca.py).
5. Train the means (optional), either on your samples or on the samples in CNV_data (python train_means.py), or use the pretrained ones provided in the CNV_data/GenomeScreen repos.
6. Run GenomeScreen.
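
Since no GC correction script is provided, here is a minimal sketch of one common approach, median scaling per GC stratum. It is not taken from the CNV_data or GenomeScreen repositories and may differ from what the authors used. The function name gc_correct, the 1%-wide GC strata, and the assumption that you already have a per-bin GC fraction (for example, computed from the reference FASTA) are all illustrative choices.

```python
import numpy as np


def gc_correct(counts, gc):
    """Scale bin counts so that every GC stratum has the same median.

    counts: per-bin read counts for one sample.
    gc:     per-bin GC fraction in [0, 1] (assumed to be precomputed).
    Bins with zero reads or a missing GC value are left unchanged.
    """
    counts = np.asarray(counts, dtype=float)
    gc = np.asarray(gc, dtype=float)
    corrected = counts.copy()

    # Use only bins that have reads and a known GC fraction.
    valid = (counts > 0) & np.isfinite(gc)
    if not valid.any():
        return corrected

    global_median = np.median(counts[valid])

    # Assign each valid bin to a 1%-wide GC stratum (-1 marks invalid bins).
    strata = np.full(counts.shape, -1, dtype=int)
    strata[valid] = np.round(gc[valid] * 100).astype(int)

    for s in np.unique(strata[valid]):
        in_stratum = strata == s
        stratum_median = np.median(counts[in_stratum])
        # Rescale so this stratum's median matches the global median.
        corrected[in_stratum] = counts[in_stratum] * global_median / stratum_median

    return corrected
```

The corrected counts from a function like this would then be the input to the PCA steps. A LOESS fit of counts against GC content is another widely used alternative to the stratum medians.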

Steps 3-5 are described in more detail in the https://github.com/marcelTBI/CNV_data repository. Hope it helps. This tool is no longer actively developed, so unfortunately I will not be able to help you more.

can you tell me how to normalize from read counts? #6