ncborcherding / scRepertoire

A toolkit for single-cell immune profiling
https://www.borch.dev/uploads/screpertoire/
MIT License

Error in combineTCR #73

Closed dkcoxie closed 3 years ago

dkcoxie commented 3 years ago

I looked for examples of creating the input object from the 10x Cell Ranger output file, loaded the contig_list example data, and used that to compare against my input data list/nested data.frame formats. But I am repeatedly getting an error that I find confusing when trying to run the combineTCR function.

SessionInfo summary: R version 4.0.3, Bioc 1.30.10, scRepertoire 1.0.0 (installed from Bioconductor)

My data: VDJ_data is a large list generated by reading in 8 different 'filtered_contig_annotations.csv' inputs (they are in separate directories, so they are read in via a loop). The read.csv line is: VDJ_data[[j]] = read.csv(dirname, colClasses = VDJClasses), where j is a numeric index for the list assignment (1-8), dirname is the full path to the csv file for each sample, and VDJClasses is a list of class assignments pulled from the example contig_list dataset (since logical columns were encoded as lower case/strings in the original files). My barcodes are formatted as 'AAAGCAAGTCGACTAT-1', so I did not strip the barcodes for any of my input data. I did have empty fields, so I replaced them with 'None' to match the example, but the same error was triggered with and without filling in the blank fields.

Compared column names between the example data and my data: colnames(contig_list[[1]]) == colnames(VDJ_data[[1]]) # Returns all TRUE

Also compared classes between the example data and my data: sapply(VDJ_data[[1]], class) == sapply(contig_list[[1]], class) # Returns TRUE for each column
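The loading step described above can be sketched roughly as follows. This is only an illustration in base R, not code from the thread: `read_contigs` is a hypothetical helper, the column names in `logical_cols` are assumed from the Cell Ranger output shown later in this issue, and the temporary CSV stands in for a real filtered_contig_annotations.csv. It coerces string-encoded logicals ("true", "True", "None") to real logical values instead of relying on colClasses.

```r
# Hypothetical helper (not part of scRepertoire): read one contig file and
# coerce string-encoded logical columns ("true"/"True"/"None") to logical.
read_contigs <- function(path,
                         logical_cols = c("is_cell", "high_confidence",
                                          "full_length", "productive")) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  for (col in intersect(logical_cols, colnames(df))) {
    # Anything other than a case-insensitive "TRUE" (e.g. "None") becomes FALSE
    df[[col]] <- toupper(as.character(df[[col]])) == "TRUE"
  }
  df
}

# Toy stand-in for one sample's file; real paths would point at each sample's CSV
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "barcode,is_cell,productive,chain",
  "AAAGCAAGTCGACTAT-1,true,true,TRA",
  "AAAGCAAGTCGACTAT-1,true,None,TRB"
), tmp)

# A named character vector of paths gives a named list, one data frame per sample
paths <- c(YoungA1 = tmp)
VDJ_data <- lapply(paths, read_contigs)
str(VDJ_data$YoungA1)
```

Reading via lapply() over a named vector keeps the sample names attached to the list elements, which makes it easier to see which input is at fault if one of them later turns out to be empty.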

When I try to combine the example data (following the code in the vignette), 'combineTCR' works. However, when I try to execute the same command with my data I get an error:

combinedVDJ <- combineTCR(VDJ_data, samples = c("YoungA1", "YoungA2", "YoungA3", "YoungA4", "AgedA1", "AgedA2", "AgedA3", "AgedA4"), ID = c("Young", "Young", "Young", "Young", "Aged", "Aged", "Aged", "Aged"), cells = "T-AB")

Error is:

Error in `$<-.data.frame`(`*tmp*`, "sample", value = "YoungA1") : replacement has 1 row, data has 0

However, I see my data in VDJ_data, i.e. head(VDJ_data[[1]], n = 1) returns:

[Screenshot: output of head(VDJ_data[[1]], n = 1)]

Happy to provide additional replication information but hoping there is something very obvious that I'm missing. Thanks for your help!

ncborcherding commented 3 years ago

Hey Dana,

Thank you for the great run down!!

My first guess is this is a result of using Cell Ranger >= 5.0 - 10x Genomics changed their outputs slightly in the newer versions and it has resulted in this error.

My suggestion is to try the github repo to get the most-up-to-date version with:

devtools::install_github("ncborcherding/scRepertoire")

If the problem persists, please let me know and I would be happy to troubleshoot with you to figure things out.

Thanks again, Nick

dkcoxie commented 3 years ago

Awesome, thanks for the prompt reply. Will install the current version of scRepertoire instead of the Bioconductor version & report back.

dkcoxie commented 3 years ago

The github version worked great - matched my colClasses to the example data when reading in the csv and then had no error with combineTCR().

mbartl13 commented 3 years ago

Hi, I am getting a similar error from combineTCR using Cell Ranger v4 output, and using the dev version did not solve it:

> S1 <- read.csv("13F_TCR_20210607/outs/filtered_contig_annotations.csv")
> S2 <- read.csv("43F_TCR_20210607/outs/filtered_contig_annotations.csv")
> contig_list <- list(S1, S2)
> for (i in seq_along(contig_list)) {
+   contig_list[[i]] <- stripBarcode(contig_list[[i]], 
+                                    column = 1, connector = "_", num_connects = 1)
+ }

> combined <- combineTCR(contig_list, 
+                        samples = c("13F", "43F"), 
+                        ID = c("S1", "S2"), cells ="T-AB")
Error in `$<-.data.frame`(`*tmp*`, "sample", value = "13F") : 
  replacement has 1 row, data has 0

> session_info()
- Session info ---------------------------------------------------------------
 setting  value
 version  R version 4.1.0 (2021-05-18)
 os       Windows 10 x64
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United States.1252
 ctype    English_United States.1252
 tz       America/New_York
 date     2021-08-05

[1] C:/Users/Maggie/Documents/R/win-library/4.1
[2] C:/Program Files/R/R-4.1.0/library

> head(contig_list[[1]])
             barcode is_cell                   contig_id high_confidence length chain v_gene d_gene j_gene
1 AAACCTGAGAAACGCC-1    TRUE AAACCTGAGAAACGCC-1_contig_1            TRUE    668  NONE   NONE   NONE   NONE
2 AAACCTGAGAAACGCC-1    TRUE AAACCTGAGAAACGCC-1_contig_2            TRUE    516  NONE   NONE   NONE   NONE
3 AAACCTGAGATGCCAG-1    TRUE AAACCTGAGATGCCAG-1_contig_1            TRUE    640  NONE   NONE   NONE   NONE
4 AAACCTGAGATGCCAG-1    TRUE AAACCTGAGATGCCAG-1_contig_2            TRUE    482  NONE   NONE   NONE   NONE
5 AAACCTGAGATGCCAG-1    TRUE AAACCTGAGATGCCAG-1_contig_3            TRUE    421  NONE   NONE   NONE   NONE
6 AAACCTGAGATGCGAC-1    TRUE AAACCTGAGATGCGAC-1_contig_1            TRUE    564  NONE   NONE   NONE   NONE
  c_gene full_length productive              cdr3                                             cdr3_nt reads
1   NONE       FALSE      FALSE      CIVIQAGTALIF                TGCATCGTGATCCAGGCTGGAACTGCACTGATCTTT   328
2   NONE       FALSE      FALSE   CASSEWTGGNQPQYF       TGCGCCAGCAGTGAGTGGACAGGGGGGAATCAGCCCCAGTATTTT   299
3   NONE       FALSE      FALSE   CAVRAGGAGYGKLTF       TGTGCTGTGAGGGCTGGTGGTGCTGGCTATGGAAAGCTGACATTT   140
4   NONE       FALSE      FALSE    CASSSTGAGEKLFF          TGCGCCAGCAGCTCGACAGGGGCTGGTGAAAAACTGTTTTTT   970
5   NONE       FALSE      FALSE   CAVSDRTTSYNKLMF       TGTGCTGTGAGTGACAGAACAACCTCCTACAACAAGCTGATGTTT   315
6   NONE       FALSE      FALSE CASSEADWGYRTDPQYF TGTGCCAGCAGTGAAGCGGACTGGGGTTATCGCACAGATCCGCAGTATTTT  1261
  umis raw_clonotype_id raw_consensus_id
1    7             NONE             NONE
2    5             NONE             NONE
3    4             NONE             NONE
4    5             NONE             NONE
5    3             NONE             NONE
6   14             NONE             NONE

ncborcherding commented 3 years ago

Hey mbartl13,

The most common reason for this error is that the filtering steps in combineTCR() leave 0 contigs. There are 2 major filters: 1) keeping productive chains only and 2) removing chains designated as "Multi".
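To see why an empty table produces this particular message, the error can be reproduced in base R alone, without scRepertoire. The sketch below assumes only that a length-1 value (such as a sample name) is assigned as a column to a data frame with 0 rows, which is what the traceback `$<-.data.frame`(`*tmp*`, "sample", ...) suggests; the data frame `empty` is a made-up stand-in.

```r
# A data frame with columns but no rows, standing in for a fully filtered contig table
empty <- data.frame(barcode = character(0), stringsAsFactors = FALSE)

# Assigning a single value as a new column fails on a 0-row data frame;
# tryCatch() captures the error message instead of stopping the script
result <- tryCatch({
  empty$sample <- "YoungA1"
  "no error"
}, error = function(e) conditionMessage(e))

print(result)
```

So the message does not mean the input list is missing; it points to a data frame that ended up with no rows before the sample label was attached.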

Based on what you're showing in head(contig_list[[1]]), I'm concerned that your contigs are possibly not "productive", as I do not see a chain or v/d/j genes listed. But please let me know if that is not the case. If just one of the list elements has this issue, it will result in the same error.
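One way to check this before calling combineTCR() is to count, for each list element, how many contigs would pass the two filters described above. The following is a rough base R sketch, not the package's internal code: `count_surviving` is a hypothetical helper, the `productive` and `chain` column names are taken from the head() output in this thread, and the toy `contig_list` stands in for real data.

```r
# Hypothetical helper: count contigs that are productive and not labelled "Multi".
# `productive` may be logical or a string such as "true"/"True"/"None".
count_surviving <- function(df) {
  productive <- toupper(as.character(df$productive)) == "TRUE"
  not_multi <- df$chain != "Multi"
  sum(productive & not_multi, na.rm = TRUE)
}

# Toy stand-in for a list of filtered_contig_annotations data frames
contig_list <- list(
  S1 = data.frame(chain = c("NONE", "NONE"),
                  productive = c(FALSE, FALSE),
                  stringsAsFactors = FALSE),
  S2 = data.frame(chain = c("TRA", "TRB", "Multi"),
                  productive = c("True", "True", "True"),
                  stringsAsFactors = FALSE)
)

surviving <- vapply(contig_list, count_surviving, integer(1))
print(surviving)

# Names of the list elements that would be left with 0 contigs
print(names(surviving)[surviving == 0])
```

Any element reported with 0 surviving contigs is a likely source of the "replacement has 1 row, data has 0" error, even if the other samples look fine.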

Please let me know and I will be happy to troubleshoot with you.

Nick

mbartl13 commented 3 years ago

Hi Nick, thanks. It looks like they were missed during the cellranger run. I'll repeat and try again.

Thank you! Maggie

