bernardo-de-almeida / DeepSTARR

Deep learning model built to quantitatively predict the activities of developmental and housekeeping enhancers from DNA sequence in Drosophila melanogaster S2 cells
MIT License

Converting predictions to bigWig #5

Open adamklie opened 9 months ago

adamklie commented 9 months ago

Hey Bernardo!

Random question. When you predict across the dm3 genome, how do you save the outputs and then convert them to bigWig tracks that nicely match the STARR-seq bigWigs in magnitude? Maybe convert the predictions to bigBed and then use something like bigBedToBigWig.sh?

Adam

bernardo-de-almeida commented 4 months ago

Hi Adam, sorry for the late reply. I save the outputs for each 249 bp sequence tiled across the genome and average them per nucleotide (across all overlapping tiles) to obtain genome-wide coverage. I used GenomicRanges in R to compute this coverage and rtracklayer to save it as bigWigs. The STARR-seq bigWigs are in linear scale, but the model predicts values in log scale, so you have to convert the predictions back to linear scale to match the STARR-seq tracks.
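
For anyone who wants to see the averaging and rescaling steps spelled out, here is a minimal Python sketch. It is not the R workflow described above (GenomicRanges plus rtracklayer), just an illustration of the same idea under some assumptions: tiles are given as 0-based, half-open `(start, end, score)` tuples on one chromosome, the log scale is base 2 (the reply only says "log scale"), and the averaging is done before converting back to linear scale. The function names `average_tiles`, `to_linear`, and `to_bedgraph_runs` are made up for this example.

```python
def average_tiles(tiles, chrom_length):
    """Average tile scores per nucleotide across all overlapping tiles.

    tiles: iterable of (start, end, score), 0-based half-open (assumed format).
    Returns a list with one mean per base, or None where no tile covers the base.
    """
    total = [0.0] * chrom_length
    count = [0] * chrom_length
    for start, end, score in tiles:
        # Clip each tile to the chromosome bounds before accumulating.
        for pos in range(max(start, 0), min(end, chrom_length)):
            total[pos] += score
            count[pos] += 1
    return [t / c if c else None for t, c in zip(total, count)]


def to_linear(values, base=2.0):
    """Undo the log transform; base 2 is an assumption, not stated in the thread."""
    return [None if v is None else base ** v for v in values]


def to_bedgraph_runs(chrom, values):
    """Collapse per-base values into (chrom, start, end, value) runs."""
    runs = []
    run_start, run_value = 0, None
    # A trailing None acts as a sentinel that closes the last open run.
    for pos, value in enumerate(list(values) + [None]):
        if pos == len(values) or value != run_value:
            if run_value is not None:
                runs.append((chrom, run_start, pos, run_value))
            run_start, run_value = pos, value
    return runs


# Toy example: 4 bp tiles with a 2 bp step instead of 249 bp tiles.
toy_tiles = [(0, 4, 1.0), (2, 6, 3.0)]
per_base = average_tiles(toy_tiles, chrom_length=8)
runs = to_bedgraph_runs("chr2L", to_linear(per_base))
for chrom, start, end, value in runs:
    print(f"{chrom}\t{start}\t{end}\t{value}")
```

The printed lines are in bedGraph layout, so they could be written to a file and converted with UCSC's `bedGraphToBigWig` together with a chrom.sizes file for dm3. Whether to average in log space and then exponentiate, or exponentiate each tile first, is a choice the thread does not settle; the sketch does the former.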