Open aliciagallego opened 8 months ago
Hi,
unfortunately, metagene2 does not support this kind of normalization intrinsically. This is why we added functions to access the dataframes though, so you could do it on your own the way you are doing now.
I'm not familiar with the biological problem you are trying to model here, so I cannot vouch that metagene2's resampling approach is appropriate (especially if you have a single BAM file), but you can find metagene2's code for its resampling strategy in the permutation.R file: https://github.com/ArnaudDroitLab/metagene2/blob/master/R/permutation.R
Hope this helps!
I -think- that since you're only dividing everything by a constant factor, you could also divide the old confidence intervals by the same factor and you'd get the same results, but that's just my gut instinct.
Thank you Eric, that worked perfectly well!
I -think- that since you're only dividing everything by a constant factor, you could also divide the old confidence intervals by the same factor and you'd get the same results, but that's just my gut instinct.
Hello,
I would like to do metaplots based on only one BAM file and several sets of regions. However, for each set region, I want to normalize each bin by the total coverage across all bins in that region set. This is because I am interested in measuring the slope of the metaplot removing the effect of the expression levels. I have tried to do it manually (code explained below) but I have a couple of questions.
Manual approximation: I have modified the data frame (df) obtained with the function
add_metadata
. So, I first calculated the sum of the coverages across all bins in each set of regions (A and B). Second, I divided the coverage of each bin by the total sum calculated in the previous step:mg <- metagene2$new(regions = regionsAB, bam_files = bam_file)
mg$group_coverages(normalization="RPM")
df <- mg_ali$add_metadata()
total_A<-sum(df[which(df$region_name=="listA"),1])
total_B<-sum(df[which(df$region_name=="listB"),1])
for(i in 1:nrow(df)){ if(df[i,"region_name"]=="listA"){ df[i,1]<-df[i,1]/total_A} else if (df[i,"region_name"]=="listB"){ df[i,1]<-df[i,1]/total_B}}
metaplot <- plot_metagene(df)
Results: To illustrate what I want to do, I show you the original figure just normalizing by using
$group_coverages(normalization="RPM")
function (up) and the one that I generated after applying manual normalization of the bins (bottom).Questions: