MarineOmics / marineomics.github.io

Website for MarineOmics RCN-ECS. Hosts pages for panelist series and recommended practices for non-model genomics.
Creative Commons Attribution 4.0 International

Revise DNA Methylation tutorial #18

Closed sr320 closed 1 month ago

sr320 commented 12 months ago

Sam Bogan Review of “DNA Methylation Analyses” by Steven Roberts, Sam White, and Yaamini Venkataraman link: https://d.pr/f/332Q5v Original text is in blue. Suggested text is in red.

General Feedback

Introduction

Sequence Quality

Read Alignment

Methylation Quantification

sr320 commented 12 months ago

Review from Jason Johns: https://d.pr/f/zABmGv

yaaminiv commented 9 months ago

More things to add

yaaminiv commented 9 months ago

@sr320 this part of the SNP ID tutorial doesn't make sense to me:


If, for example, your downstream analysis uses a _5x.tab file as described above, the following is one way in R to remove these SNPs.

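# dplyr provides %>%, mutate(), anti_join() and select()
library(dplyr)

# Read in CT SNP file (comment.char = "#" skips any VCF header lines, which start with "#")
ct <- read.csv("../output/CT-SNP.vcf", header = FALSE, sep = "\t", comment.char = "#") %>%
  mutate(loci = paste0(V1, "_", V2))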

# 1. List all files with _5x.tab suffix (the dot is escaped so it matches a literal ".")
files <- list.files(path = "../data/", pattern = "_5x\\.tab$", full.names = TRUE)

# 2. Iterate over each file
for(file in files) {

  # Extract base filename without the directory for naming purposes
  base_name <- basename(file)

  # Read the file
  data <- read.csv(file, header = FALSE, sep = "\t")

  # Modify the data
  modified_data <- data %>%
    mutate(loci = paste0(V1, "_", V2)) %>%
    anti_join(ct, by = "loci") %>%
    select(-loci)

  # Write the modified data to an output file
  output_file <- paste0("../output/f", base_name)
  write.table(modified_data, file = output_file, sep = "\t", row.names = FALSE, quote = FALSE, col.names = FALSE)
}
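
What the loop above does to each file can be shown on a small made-up table. This is only a sketch: it uses base R's `%in%` instead of the `anti_join()` call in the snippet, and the loci, values, and the names `meth`, `ct_loci`, `meth_loci`, and `filtered` are invented for illustration. It assumes, as the snippet does, that column V1 is the chromosome and V2 the position.

```r
# Toy illustration only: all values below are invented, not taken from the tutorial.

# Hypothetical C->T SNP positions (V1 = chromosome, V2 = position).
ct <- data.frame(V1 = c("chr1", "chr2"), V2 = c(100, 250))

# Hypothetical methylation calls in the same V1/V2 layout, plus one value column.
meth <- data.frame(
  V1 = c("chr1", "chr1", "chr2", "chr2"),
  V2 = c(100, 150, 250, 300),
  V3 = c(80, 10, 55, 0)
)

# Build the same "chromosome_position" key for both tables.
ct_loci <- paste0(ct$V1, "_", ct$V2)
meth_loci <- paste0(meth$V1, "_", meth$V2)

# Keep only the rows whose locus is NOT in the SNP list.
filtered <- meth[!(meth_loci %in% ct_loci), ]
print(filtered)
```

The two rows at chr1_100 and chr2_250 are dropped because they coincide with a listed SNP; every other row passes through unchanged, which is the effect the `anti_join(ct, by = "loci")` step has on each _5x.tab file.
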
sr320 commented 1 month ago

completed and published