sr320 / ceabigr

Workshop on genomic data integration with an emphasis on epigenetic data (FHL 2022)

Run intersectbed with 10methylation and TE track #32

Closed sr320 closed 2 years ago

yaaminiv commented 2 years ago

@sr320 @jarcasariego Is anyone doing this? If not, I can start.

sr320 commented 2 years ago

I am

Done and at https://gannet.fish.washington.edu/seashell/bu-github/ceabigr/output/

../output/${NAME}mTE.out

laurahspencer commented 2 years ago

Here is R code to summarize methylation (mean, median) per transposon per individual using a for loop:


library(dplyr)  # provides %>%, group_by(), summarize(), mutate(), rename(), bind_rows()

# directory holding the intersectBed output files
out_dir <- "../../../../8TB_HDD_01/sr320/github/ceabigr/output"

# create a vector of filenames with full path
filenames <- list.files(path = out_dir, pattern = "mTE\\.out$", full.names = TRUE)

b <- data.frame()   # create empty dataframe to be populated with mean & median summary stats for each feature within a sample

for (i in seq_along(filenames)) {
  print(filenames[i])  # print out file location and name
  testMte <- read.csv(file = filenames[i], sep = "\t", header = FALSE) # read in each sample data
  # summarize methylation data per feature, mean & median
  group12 <- testMte %>%
    group_by(V13, V5, V8, V9) %>%
    summarize(mean_meth = mean(V4, na.rm = TRUE),
              median_meth = median(V4, na.rm = TRUE),
              .groups = "drop") %>%
    # add new column with sample name (file name without the directory)
    mutate(sample = basename(filenames[i]))
  b <- bind_rows(b, group12)
}

# V13 is taken to be the feature name, as in the original column labels;
# the other grouping columns (V5, V8, V9) keep their default names
b <- rename(b, feature_name = V13)

write.table(b, file = file.path(out_dir, "transposon_summary_allsamples.txt"), quote = FALSE, row.names = FALSE, sep = "\t")
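
For anyone without access to the output directory, here is a minimal sketch of the same group-and-summarize step on a toy data frame. The values and the feature IDs (`TE_a`, `TE_b`) are invented for illustration, and it assumes, as in the code above, that `V4` is the methylation value and `V13` identifies the transposon. It is only meant to show how the per-feature mean and median are computed, with `NA` values dropped.

```r
library(dplyr)

# Toy stand-in for one *mTE.out file; all values here are hypothetical
toy <- data.frame(
  V4  = c(10, 30, 50, NA, 80),                      # methylation value per locus
  V13 = c("TE_a", "TE_a", "TE_a", "TE_b", "TE_b"),  # transposon the locus overlaps
  stringsAsFactors = FALSE
)

# One row per transposon, with mean and median methylation (NAs ignored)
toy_summary <- toy %>%
  group_by(V13) %>%
  summarize(mean_meth = mean(V4, na.rm = TRUE),
            median_meth = median(V4, na.rm = TRUE),
            .groups = "drop")

print(toy_summary)
```

With this toy input, `TE_a` gets a mean and median of 30, and `TE_b` gets 80 for both because its `NA` row is dropped before summarizing.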