alexyermanos / Platypus

R package for the analysis of single-cell immune repertoires
GNU General Public License v3.0
Loading VDJ failed invalid subscript type 'list' #40

Open SamS0218 opened 1 year ago

SamS0218 commented 1 year ago

Hi,

Thank you for a good tool! I am using Platypus v3.4.1 on a Windows PC to analyze BCR repertoires. I followed the PlatypusV3 vignette and was able to extract and integrate the vignette's repertoire data with VDJ_GEX_matrix, but I cannot integrate my own repertoire data with VDJ_GEX_matrix.

This is the command I used.

VDJ.out.directory.list <- list()
VDJ.out.directory.list[[1]] <- c("E:/Single cell data/Platypus/H1_B/")
VDJ.out.directory.list[[2]] <- c("E:/Single cell data/Platypus/H2_B/")
VDJ.out.directory.list[[3]] <- c("E:/Single cell data/Platypus/H3_B/")
VDJ.out.directory.list[[4]] <- c("E:/Single cell data/Platypus/H4_B/")
VDJ.out.directory.list[[5]] <- c("E:/Single cell data/Platypus/H5_B/")
VDJ.out.directory.list[[6]] <- c("E:/Single cell data/Platypus/S1_B/")
VDJ.out.directory.list[[7]] <- c("E:/Single cell data/Platypus/S2_B/")
VDJ.out.directory.list[[8]] <- c("E:/Single cell data/Platypus/S3_B/")
VDJ.out.directory.list[[9]] <- c("E:/Single cell data/Platypus/S4_B/")
VDJ.out.directory.list[[10]] <- c("E:/Single cell data/Platypus/S5_B/")
GEX.out.directory.list <- list()
GEX.out.directory.list[[1]] <- c("E:/Single cell data/Platypus/H1_B/")
GEX.out.directory.list[[2]] <- c("E:/Single cell data/Platypus/H2_B/")
GEX.out.directory.list[[3]] <- c("E:/Single cell data/Platypus/H3_B/")
GEX.out.directory.list[[4]] <- c("E:/Single cell data/Platypus/H4_B/")
GEX.out.directory.list[[5]] <- c("E:/Single cell data/Platypus/H5_B/")
GEX.out.directory.list[[6]] <- c("E:/Single cell data/Platypus/S1_B/")
GEX.out.directory.list[[7]] <- c("E:/Single cell data/Platypus/S2_B/")
GEX.out.directory.list[[8]] <- c("E:/Single cell data/Platypus/S3_B/")
GEX.out.directory.list[[9]] <- c("E:/Single cell data/Platypus/S4_B/")
GEX.out.directory.list[[10]] <- c("E:/Single cell data/Platypus/S5_B/")

vgm <- VDJ_GEX_matrix(VDJ.out.directory.list = VDJ.out.directory.list,
                      GEX.out.directory.list = GEX.out.directory.list,
                      GEX.integrate = T,
                      VDJ.combine = T,
                      integrate.GEX.to.VDJ = T,
                      integrate.VDJ.to.GEX = T, 
                      exclude.GEX.not.in.VDJ = F,
                      filter.overlapping.barcodes.GEX = T,
                      filter.overlapping.barcodes.VDJ = T,
                      get.VDJ.stats = T,
                      parallel.processing = "parlapply", 
                      trim.and.align = T,  
                      group.id = c(1,1,1,1,1,2,2,2,2,2))

I received this error message.

Loading VDJ failed invalid subscript type 'list'

I checked my data files (filtered_contig_annotations.csv, clonotypes.csv, concat_ref.fasta, all_contig_annotations.csv and metrics_summary.csv), but they do not differ from the vignette data.

My data directories lack the "vdj_reference", "cell_barcodes.json", and "vdj.contig_info_pb" files.

Do you know why I cannot load my VDJ data? If you have any ideas, please let me know.

Thank you.
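
The manual file check described above can be scripted. The following is a minimal sketch in base R, assuming the usual Cell Ranger VDJ file names listed in the post; `expected_files` and `check_vdj_files` are illustrative names and not part of Platypus.

```r
# Sketch: report which expected VDJ files exist in each sample directory.
# `expected_files` and `check_vdj_files` are hypothetical helper names.
expected_files <- c("filtered_contig_annotations.csv",
                    "clonotypes.csv",
                    "concat_ref.fasta",
                    "all_contig_annotations.csv",
                    "metrics_summary.csv")

check_vdj_files <- function(dirs, files = expected_files) {
  # Accept either a list or a character vector of directories
  dirs <- sub("/+$", "", unlist(dirs))
  # One row per directory, one logical column per expected file
  present <- matrix(FALSE, nrow = length(dirs), ncol = length(files),
                    dimnames = list(dirs, files))
  for (i in seq_along(dirs)) {
    present[i, ] <- file.exists(file.path(dirs[i], files))
  }
  present
}
```

Calling `check_vdj_files(VDJ.out.directory.list)` returns a logical matrix, so a directory with a missing file shows up as a row containing `FALSE`.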

tudorcotet commented 1 year ago

Hi,

We are looking into this error and hope to fix it in the coming days! It might be due to VDJ_GEX_matrix not finding your sequence references when trim.and.align is set to T. Could you set it to F and report the output?

SamS0218 commented 1 year ago

Hi,

Thank you for your reply! As you said, when I set "trim.and.align = F", I can load the VDJ data. I would like to get full sequences for antibody expression and the number of somatic hypermutations, so I would be grateful if you could fix it.

Thanks.

tudorcotet commented 1 year ago

Hi,

To get the SHM and productive sequences, we actually recommend using the VDJ_call_MIXCR function instead. This will output the annotated regions per chain, which can later be assembled into the full sequences for expression (VDJ_FR1, VDJCDR1, etc. for heavy chains, VJ... for light chains), as well as SHM (VDJ_SHM, VJ_SHM). The trim.and.align option requires the references/germlines to be present in your files, whereas MIXCR does not.

We are currently looking into methods for inferring germlines when these are not in your files and plan to add this in the next update!

Let me know if this works for you!
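
Assembling annotated regions into a full sequence, as suggested above, amounts to pasting the region columns together in order. The sketch below uses base R only; `assemble_regions` and the toy column names (`FR1`, `CDR1`, `FR2`) are hypothetical, so check `names()` of the VDJ_call_MIXCR output for the actual region columns before adapting it.

```r
# Sketch: concatenate per-region sequences into one sequence per row.
# `assemble_regions` and the toy column names are hypothetical; the real
# region columns depend on the Platypus and MiXCR versions in use.
assemble_regions <- function(df, region_cols) {
  missing_cols <- setdiff(region_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  parts <- df[, region_cols, drop = FALSE]
  # Rows with any missing region yield NA rather than a partial sequence
  complete <- stats::complete.cases(parts)
  out <- rep(NA_character_, nrow(df))
  out[complete] <- do.call(
    paste0, unname(as.list(parts[complete, , drop = FALSE]))
  )
  out
}

# Toy input: the second row lacks CDR1, so it is not assembled
toy <- data.frame(FR1  = c("CAGGTG", "GAGGTG"),
                  CDR1 = c("GGATTC", NA),
                  FR2  = c("ATGCAC", "ATGAGC"),
                  stringsAsFactors = FALSE)
assembled <- assemble_regions(toy, c("FR1", "CDR1", "FR2"))
```

Returning `NA` for incomplete rows is a design choice in this sketch: a sequence missing one region should not be mistaken for a full-length one when ordering constructs for expression.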

SamS0218 commented 1 year ago

Hi,

Thank you for your reply! I have rewritten this message. I understand now. I tried using the VDJ_call_MIXCR function, but I got an error message.

These are my commands.

VDJ.out.directory.list <- list()
VDJ.out.directory.list[[1]] <- c("E:/Single cell data/Platypus/H1_B/")
VDJ.out.directory.list[[2]] <- c("E:/Single cell data/Platypus/H2_B/")
VDJ.out.directory.list[[3]] <- c("E:/Single cell data/Platypus/H3_B/")
VDJ.out.directory.list[[4]] <- c("E:/Single cell data/Platypus/H4_B/")
VDJ.out.directory.list[[5]] <- c("E:/Single cell data/Platypus/H5_B/")
VDJ.out.directory.list[[6]] <- c("E:/Single cell data/Platypus/S1_B/")
VDJ.out.directory.list[[7]] <- c("E:/Single cell data/Platypus/S2_B/")
VDJ.out.directory.list[[8]] <- c("E:/Single cell data/Platypus/S3_B/")
VDJ.out.directory.list[[9]] <- c("E:/Single cell data/Platypus/S4_B/")
VDJ.out.directory.list[[10]] <- c("E:/Single cell data/Platypus/S5_B/")
GEX.out.directory.list <- list()
GEX.out.directory.list[[1]] <- c("E:/Single cell data/Platypus/H1_B/")
GEX.out.directory.list[[2]] <- c("E:/Single cell data/Platypus/H2_B/")
GEX.out.directory.list[[3]] <- c("E:/Single cell data/Platypus/H3_B/")
GEX.out.directory.list[[4]] <- c("E:/Single cell data/Platypus/H4_B/")
GEX.out.directory.list[[5]] <- c("E:/Single cell data/Platypus/H5_B/")
GEX.out.directory.list[[6]] <- c("E:/Single cell data/Platypus/S1_B/")
GEX.out.directory.list[[7]] <- c("E:/Single cell data/Platypus/S2_B/")
GEX.out.directory.list[[8]] <- c("E:/Single cell data/Platypus/S3_B/")
GEX.out.directory.list[[9]] <- c("E:/Single cell data/Platypus/S4_B/")
GEX.out.directory.list[[10]] <- c("E:/Single cell data/Platypus/S5_B/")

vgm <- VDJ_GEX_matrix(VDJ.out.directory.list = VDJ.out.directory.list,
                      GEX.out.directory.list = GEX.out.directory.list,
                      GEX.integrate = T,
                      VDJ.combine = T,
                      integrate.GEX.to.VDJ = T,
                      integrate.VDJ.to.GEX = T, 
                      exclude.GEX.not.in.VDJ = F,
                      filter.overlapping.barcodes.GEX = T,
                      filter.overlapping.barcodes.VDJ = T,
                      exclude.on.cell.state.markers = c("CD3E"),
                      get.VDJ.stats = T,
                      parallel.processing = "none", 
                      trim.and.align = F,  
                      group.id = c(1,1,1,1,1,2,2,2,2,2))

vgm[[2]] <- GEX_phenotype(vgm[[2]], default = T)
Seurat::DimPlot(vgm[[2]],reduction = "umap", group.by = "cell.state")

gene_expression_cluster <- GEX_cluster_genes(vgm[[2]],min.pct = 0.25) 

VDJ_mixcr_out <- VDJ_call_MIXCR(VDJ = vgm[[1]], mixcr.directory = "E:/Single cell data/Platypus/mixcr.jar" ,species = "hsa", platypus.version = "v3", operating.system = "Windows", simplify = F)

I received this error message.

Missing required option: '--preset <name>'
Require tsv file type, got tempmixcrlc.out.vdjca

E:\Single cell data\Platypus>Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'tempmixcrhc.out.txt': No such file or directory

What is '--preset'? I checked my data files, and "barcodes.tsv.gz" and "features.tsv.gz" exist in my directory. Do I need other tsv files? Could you tell me how to solve this error?

Thanks.