immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
https://immunarch.com
Apache License 2.0

Permanent error in loading 10x data #365

Open beginner984 opened 1 year ago

beginner984 commented 1 year ago

Hi,

Please help me load my 10x data. Every attempt with repLoad fails:

> immdata_10x <- repLoad(file_path)

== Step 1/3: loading repertoire files... ==

Processing "/data/Continuum/Angel/Results/PN0340_0001/outs/per_sample_outs/PN0340_0001/vdj_t" ...
  -- [1/9] Parsing "/data/Continuum/Angel/Results/PN0340_0001/outs/per_sample_outs/PN0340_0001/vdj_t/airr_rearrangement.tsv" -- airr
Error in `select()`:                                                                                                                                      
! Can't subset columns that don't exist.
x Column `cdr1` doesn't exist.
Run `rlang::last_trace()` to see where the error occurred.
Warning message:
`select_()` was deprecated in dplyr 0.7.0.
ℹ Please use `select()` instead.
ℹ The deprecated feature was likely used in the dplyr package.
  Please report the issue at <https://github.com/tidyverse/dplyr/issues>.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated. 
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8    LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] immunarch_0.9.0    patchwork_1.1.2    data.table_1.14.8  dtplyr_1.1.0       dplyr_1.0.10       ggplot2_3.4.2     
[7] SeuratObject_4.0.3

Or, with the other version (immunarch 1.0.0):

> immdata_10x <- repLoad(file_path)

== Step 1/3: loading repertoire files... ==

Processing "/data/Continuum/Angel/Results/PN0340_0001/outs/per_sample_outs/PN0340_0001/vdj_t" ...
  -- [1/9] Parsing "/data/Continuum/Angel/Results/PN0340_0001/outs/per_sample_outs/PN0340_0001/vdj_t/airr_rearrangement.tsv" -- airr
  -- [2/9] Parsing "/data/Continuum/Angel/Results/PN0340_0001/outs/per_sample_outs/PN0340_0001/vdj_t/cell_barcodes.json" --
  -- [3/9] Parsing "/data/Continuum/Angel/Results/PN0340_0001/outs/per_sample_outs/PN0340_0001/vdj_t/clonotypes.csv" -- unsupported format, skipping
  -- [4/9] Parsing "/data/Continuum/Angel/Results/PN0340_0001/outs/per_sample_outs/PN0340_0001/vdj_t/concat_ref.bam.bai" -- Error in tolower(l) : invalid multibyte string 1
In addition: Warning message:
In readLines(f, 1) : line 1 appears to contain an embedded nul
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8    LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] immunarch_1.0.0    patchwork_1.1.2    data.table_1.14.8  dtplyr_1.1.0       dplyr_1.0.10       ggplot2_3.4.2     
[7] SeuratObject_4.0.3

Or, loading filtered_contig_annotations.csv directly:

> tcr_data <- repLoad("filtered_contig_annotations.csv", .mode = "single", .coding = TRUE)

== Step 1/3: loading repertoire files... ==

Processing "<initial>" ...
  -- [1/1] Parsing "filtered_contig_annotations.csv" -- 10x (filt.contigs)
Error in step_subset(parent, vars = vars, groups = groups, arrange = arrange,  :
  is.null(j) || is_expression(j) is not TRUE
> tcr_data <- repLoad("filtered_contig_annotations.csv", .mode = "paired", .coding = TRUE)

== Step 1/3: loading repertoire files... ==

Processing "<initial>" ...
  -- [1/1] Parsing "filtered_contig_annotations.csv" -- 10x (filt.contigs)
Error in step_subset(parent, vars = vars, groups = groups, arrange = arrange,  :
  is.null(j) || is_expression(j) is not TRUE

Desperate to load the data :(

vadimnazarov commented 1 year ago

Hi, could you provide a couple of example datasets to test it?

vadimnazarov commented 1 year ago

Could you please check whether the latest Cell Ranger version works? It is available here: https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome
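While preparing example datasets, it may help to narrow down two of the errors reported above. The sketch below uses only base R and is not part of immunarch. The helpers `missing_columns` and `text_tables` are hypothetical names. The first reports which expected columns are absent from a tab-separated header (the first log complains about `cdr1`). The second lists only text-table files in a folder, so that binary files such as concat_ref.bam.bai are never opened. The demo files are made up and only mimic the file names from the logs.

```r
# Hypothetical helper: report which expected columns are absent from a
# tab-separated file. Only the header and at most one data row are read.
missing_columns <- function(path, expected) {
  header <- names(read.delim(path, nrows = 1, check.names = FALSE))
  setdiff(expected, header)
}

# Hypothetical helper: keep only files with a text-table extension, so that
# binary files (e.g. "*.bam.bai") are filtered out before any parsing.
# The extension list is an assumption; adjust it to your data.
text_tables <- function(dir) {
  list.files(dir, pattern = "\\.(tsv|csv|txt)$", full.names = TRUE)
}

# Self-contained demo in a fresh temporary folder (NOT real Cell Ranger output).
demo_dir <- tempfile("vdj_t_demo")
dir.create(demo_dir)

# A tiny AIRR-like table that deliberately lacks the `cdr1` column.
airr <- file.path(demo_dir, "airr_rearrangement.tsv")
writeLines(c("cell_id\tsequence_id\tjunction_aa",
             "c1\ts1\tCASSF"), airr)

# A small binary file containing a nul byte, standing in for a BAM index.
writeBin(as.raw(c(0x42, 0x41, 0x49, 0x01, 0x00)),
         file.path(demo_dir, "concat_ref.bam.bai"))

absent <- missing_columns(airr, expected = c("junction_aa", "cdr1"))
tables <- text_tables(demo_dir)

print(absent)
print(basename(tables))
```

If `absent` is non-empty for the real airr_rearrangement.tsv, the file lacks columns that the parser selects. The paths returned by `text_tables` could then be passed to repLoad one file at a time, as in the filtered_contig_annotations.csv calls above, instead of pointing repLoad at the whole folder.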