ncborcherding / scRepertoire

A toolkit for single-cell immune profiling
https://www.borch.dev/uploads/screpertoire/
MIT License

convert MiXCR format to scRepertoire format: Error in `df[[i]][, c(1, 2, 20, 25, 30, 35, 48, 49, 4)]` undefined columns selected #364

Closed yueli8 closed 4 months ago

yueli8 commented 4 months ago

Hello,

I have data in MiXCR format and want to convert it to scRepertoire format.

Thank you in advance for your help!

Best,

Yue

library(stringr)   # str_extract(), str_sub()
library(magrittr)  # %>%

read_mixcr_n_trans <- function(file, ...) {
  df <- read.delim(file)
  # Rebuild the cell ID so it is consistent with the RNA data
  well <- gsub(df$tagValueCELL, pattern = "[AGCT]*-", replacement = "")
  hp <- gsub(df$tagValueCELL, pattern = "-[AGCT]*-.*", replacement = "")
  rt <- str_extract(df$tagValueCELL, pattern = "-[AGCT]{10}") %>%
    str_sub(., start = 2, end = nchar(.))
  # Overwrite "tagValueCELL" (the column name used in normal MiXCR output) with the rebuilt ID
  df$tagValueCELL <- paste(well, hp, rt, sep = "_")
  return(df)
}
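
To make the effect of the three regular expressions above concrete, here is a minimal sketch that applies them to a single made-up tag value. The tag layout (two nucleotide barcodes followed by a well label, separated by dashes) is an assumption inferred from the patterns, not something confirmed in this thread, and the variable `tag` is hypothetical. Base R is used in place of stringr so the snippet runs without extra packages.

```r
# Hypothetical tag value; the real layout of tagValueCELL may differ.
tag <- "AAGGTTCC-ACGTACGTAC-B07"

# Drop every "<nucleotides>-" run, leaving the trailing well label.
well <- gsub("[AGCT]*-", "", tag)

# Drop everything from the first "-<nucleotides>-" onward, leaving the first barcode.
hp <- gsub("-[AGCT]*-.*", "", tag)

# Take the first dash followed by 10 nucleotides, then strip the leading dash.
# Note: regmatches() silently drops elements that do not match.
rt <- sub("^-", "", regmatches(tag, regexpr("-[AGCT]{10}", tag)))

cell_id <- paste(well, hp, rt, sep = "_")
print(cell_id)
```

If the rebuilt IDs do not match the barcodes in the RNA data, the patterns would need adjusting to the actual tag layout.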

# Step1: Load MIXCR output -----------------------------------------
fn36 <- "36.clones.tsv"
fn37 <- "37.clones.tsv"

filelist <- c(fn36,fn37)
samples <- c("A","B")

contig_list<- lapply(filelist, function(x) read_mixcr_n_trans(x))
colnames(contig_list[[1]])
[1] "cloneId"                "cellGroup"              "tagValueCELL"           "tagQualityCELL"        
 [5] "readCount"              "readFraction"           "uniqueMoleculeCount"    "uniqueMoleculeFraction"
 [9] "targetSequences"        "targetQualities"        "allVHitsWithScore"      "allDHitsWithScore"     
[13] "allJHitsWithScore"      "allCHitsWithScore"      "allVAlignments"         "allDAlignments"        
[17] "allJAlignments"         "allCAlignments"         "nSeqFR1"                "minQualFR1"            
[21] "nSeqCDR1"               "minQualCDR1"            "nSeqFR2"                "minQualFR2"            
[25] "nSeqCDR2"               "minQualCDR2"            "nSeqFR3"                "minQualFR3"            
[29] "nSeqCDR3"               "minQualCDR3"            "nSeqFR4"                "minQualFR4"            
[33] "aaSeqFR1"               "aaSeqCDR1"              "aaSeqFR2"               "aaSeqCDR2"             
[37] "aaSeqFR3"               "aaSeqCDR3"              "aaSeqFR4"               "refPoints"             
[41] "topChains"        
contig.list <- loadContigs(contig_list,  format = "MiXCR")

Error in `[.data.frame`(df[[i]], , c(1, 2, 20, 25, 30, 35, 48, 49, 4)) : 
  undefined columns selected

ncborcherding commented 4 months ago

Hey Yue,

Please use the updated scRepertoire from the GitHub repo: loadContigs() no longer indexes columns by position, but uses the actual column headers.

devtools::install_github("ncborcherding/scRepertoire")
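
The difference between the two approaches can be illustrated with a toy data frame. This is only a sketch under stated assumptions: it does not reproduce the internals of loadContigs(); `toy` is a hypothetical stand-in with 41 columns (the column count printed earlier in this thread), and only three of its columns carry real MiXCR names.

```r
# Toy stand-in for a MiXCR clones table with 41 columns.
# All columns except the three renamed below keep placeholder names (V1, V2, ...).
toy <- as.data.frame(matrix(NA, nrow = 2, ncol = 41))
names(toy)[c(1, 3, 38)] <- c("cloneId", "tagValueCELL", "aaSeqCDR3")

# Positional selection with the indices from the error message.
idx <- c(1, 2, 20, 25, 30, 35, 48, 49, 4)
missing_idx <- idx[idx > ncol(toy)]  # indices beyond the last column

# Capture the error message instead of stopping the script.
pos_result <- tryCatch(toy[, idx], error = function(e) conditionMessage(e))

# Name-based selection does not depend on column count or order.
by_name <- toy[, c("cloneId", "tagValueCELL", "aaSeqCDR3")]

print(missing_idx)
print(pos_result)
print(names(by_name))
```

Selecting by header keeps working when an export adds, drops, or reorders columns, which is why a fixed list of positions breaks on a table with a different layout.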

Please let me know if you have any other issues.

Nick

yueli8 commented 3 months ago

@ncborcherding

Hello Nick,

Thank you so much for your quick response!

It works after updating scRepertoire with the above command.

Thank you again, it is really appreciated!

Best,

Yue