Closed: sr320 closed this 1 month ago
Review from Jason Johns: https://d.pr/f/zABmGv
More things to add
@sr320 this part of the SNP ID tutorial doesn't make sense to me:
If for example in downstream analysis you would might be using a _5x.tab file described above the following is a means in R to remove these SNPs.
# Load dplyr for %>%, mutate(), anti_join(), and select()
library(dplyr)

# Read in CT SNP file and build a locus key from the first two columns
# (chromosome and position in a VCF)
ct <- read.csv("../output/CT-SNP.vcf", header = FALSE, sep = "\t") %>%
  mutate(loci = paste0(V1, "_", V2))

# 1. List all files with _5x.tab suffix (dot escaped so it matches literally)
files <- list.files(path = "../data/", pattern = "_5x\\.tab$", full.names = TRUE)

# 2. Iterate over each file
for (file in files) {
  # Extract base filename without the directory for naming purposes
  base_name <- basename(file)

  # Read the file
  data <- read.csv(file, header = FALSE, sep = "\t")

  # Modify the data: build the same locus key, drop every row whose locus
  # appears in the CT SNP file, then remove the helper column
  modified_data <- data %>%
    mutate(loci = paste0(V1, "_", V2)) %>%
    anti_join(ct, by = "loci") %>%
    select(-loci)

  # Write the modified data to an output file (original name prefixed with "f")
  output_file <- paste0("../output/f", base_name)
  write.table(modified_data, file = output_file, sep = "\t",
              row.names = FALSE, quote = FALSE, col.names = FALSE)
}
completed and published
Sam Bogan: Review of “DNA Methylation Analyses” by Steven Roberts, Sam White, and Yaamini Venkataraman. Link: https://d.pr/f/332Q5v. Original text is in blue; suggested text is in red.
General Feedback
Introduction
Sequence Quality
Read Alignment
Methylation Quantification