Post Processing Bug - Githubissues

TheJacksonLaboratory / splicing-pipelines-nf

Repository for the Anczukow-Lab splicing pipeline


Post Processing Bug #309

Closed: angarb closed 1 year ago

angarb commented 2 years ago

Problem

There is a bug in the generation of the BamList: if b1 or b2 contains more than 10 samples, the rows of the BamList end up out of order.

Solution

angarb commented 2 years ago

This part of the post-processing script is causing the problem:

b1_list <- t(read.table(b1_file, sep=","))
b2_list <- t(read.table(b2_file, sep=","))

bam_list_df <- merge(b1_list, b2_list, by="row.names", suffixes = c("_b1", "_b2"))
bam_list_df <- bam_list_df[,-1]
colnames(bam_list_df) <- lapply(colnames(bam_list_df), function(x){
  new_val <- strsplit(x, "_")[[1]][2]
  return(new_val)
})

angarb commented 2 years ago

By default, merge() sorts the result on the row names, and those are compared as strings, so it currently reorders the row.names to look like this:
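A minimal reproduction of the reordering, as a sketch rather than the pipeline's actual data: the sample names (`b1_s1.bam` and so on) and the use of `read.table(text = ...)` in place of real `b1_file`/`b2_file` paths are assumptions made so the snippet runs on its own.

```r
# Hypothetical inputs: one comma-separated line per group, 12 BAMs each.
b1_text <- paste(sprintf("b1_s%d.bam", 1:12), collapse = ",")
b2_text <- paste(sprintf("b2_s%d.bam", 1:12), collapse = ",")

# Same shape as in the script: transpose the one-row table so that
# each sample becomes a row named V1, V2, ..., V12.
b1_list <- t(read.table(text = b1_text, sep = ","))
b2_list <- t(read.table(text = b2_text, sep = ","))

# With the default sort = TRUE, the result is ordered on the row names,
# which are character strings.
merged <- merge(b1_list, b2_list, by = "row.names", suffixes = c("_b1", "_b2"))

print(as.character(merged$Row.names))
# String ordering puts V10, V11 and V12 between V1 and V2.
```

Because the keys are compared as text, "V10" sorts before "V2", which is why the problem only shows up once the sample count reaches two digits.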

angarb commented 2 years ago

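This seems to fix the issue (sort = FALSE stops the reordering, and all = TRUE keeps unmatched rows when b1 and b2 have different numbers of samples):

bam_list_df <- merge(b1_list, b2_list, by="row.names", all = TRUE, sort = FALSE, suffixes = c("_b1", "_b2"))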
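A small check of the proposed fix, again on made-up sample names (an assumption, not pipeline data) and with unequal group sizes. The explicit reordering at the end is not part of the fix above; it is a defensive extra, added here because the R documentation describes the row order under `sort = FALSE` as unspecified.

```r
# Hypothetical inputs: 12 BAMs in b1, 11 in b2.
b1_text <- paste(sprintf("b1_s%d.bam", 1:12), collapse = ",")
b2_text <- paste(sprintf("b2_s%d.bam", 1:11), collapse = ",")

b1_list <- t(read.table(text = b1_text, sep = ","))
b2_list <- t(read.table(text = b2_text, sep = ","))

# The proposed fix: keep all rows and skip the sort on the string keys.
fixed <- merge(b1_list, b2_list, by = "row.names", all = TRUE, sort = FALSE,
               suffixes = c("_b1", "_b2"))

# Defensive step (an addition, not in the original fix): order the rows
# by the numeric part of the "V<n>" row names.
idx <- as.integer(sub("^V", "", as.character(fixed$Row.names)))
fixed <- fixed[order(idx), ]

print(fixed)
```

The unmatched twelfth b1 sample is kept with `NA` in the b2 column, so downstream code that reads the BamList may need to handle missing entries.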