alexyermanos / Platypus

R package for the analysis of single-cell immune repertoires
GNU General Public License v3.0

Error in VDJ_clonotype #21

Closed meaksu closed 1 year ago

meaksu commented 1 year ago

Hi, I'm trying to follow along with the vignette, and when I run this line I'm getting an error.

vgm[[1]] <- VDJ_clonotype(platypus.version = "v3", VDJ = vgm[[1]], clone.strategy = "cdr3.aa", global.clonotype = F, VDJ.VJ.1chain = F, hierarchical = "double.and.single.chains")

Clonotyping strategy: cdr3.aa
Error in 1:nrow(VDJ) : argument of length 0

Do you know what is causing this?

vickreiner commented 1 year ago

Hi!

We have recently deprecated the old VDJ_clonotype function and replaced it with a more capable one. As far as the parameters go, it seems like you are using this newer version already.

Did you pull this from GitHub, or are you running the latest Platypus CRAN release?

And just as a check, can you print table(vgm[[1]]Nr_of_VDJ_chains) and table(vgm[[1]]Nr_of_VJ_chains)?

Thanks!

meaksu commented 1 year ago

Thanks for your quick reply.

I'm not sure whether I downloaded the package via GitHub or CRAN, but the version is Platypus_3.3.6.

Both of those statements give me errors:

> table(vgm[[1]]Nr_of_VDJ_chains)
Error: unexpected symbol in "table(vgm[[1]]Nr_of_VDJ_chains"
> table(vgm[[1]]Nr_of_VJ_chains)
Error: unexpected symbol in "table(vgm[[1]]Nr_of_VJ_chains"

Putting a dollar sign gives me the following:

> table(vgm[[1]]$Nr_of_VDJ_chains)
< table of extent 0 >
> table(vgm[[1]]$Nr_of_VJ_chains)
< table of extent 0 >

However, I have 64 samples, so I can access those columns to print normally by using:

> table(vgm[["VDJ"]][[1]]$Nr_of_VDJ_chains)
  0   1   2 
 27 495   7 

> table(vgm[["VDJ"]][[1]]$Nr_of_VJ_chains)
  1   2 
513  16 

Here the [[1]] can be replaced by any number from 1 to 64. Could there be a problem with the structure of my vgm object?

Thanks
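The two symptoms above (the "argument of length 0" error and the "table of extent 0" output) are what base R produces when a plain list of data frames is treated as a single data frame. The sketch below reproduces both with a toy list; `vdj_list` and its values are made up for illustration and are not real Platypus output.

```r
# Toy stand-in for a VDJ slot that holds one data frame per sample
# (hypothetical values, not real Platypus output).
vdj_list <- list(
  data.frame(Nr_of_VDJ_chains = c(1, 1, 2), Nr_of_VJ_chains = c(1, 1, 1)),
  data.frame(Nr_of_VDJ_chains = c(0, 1), Nr_of_VJ_chains = c(1, 2))
)

is.data.frame(vdj_list)              # FALSE: a plain list, not a data frame
is.null(vdj_list$Nr_of_VDJ_chains)   # TRUE: `$` finds no element with that name
is.null(nrow(vdj_list))              # TRUE: a plain list has no dimensions

# table(NULL) is an empty table, printed as "< table of extent 0 >"
length(table(vdj_list$Nr_of_VDJ_chains))   # 0

# 1:nrow(x) becomes 1:NULL, which fails; capture the message instead of stopping
err <- tryCatch(1:nrow(vdj_list), error = function(e) conditionMessage(e))
err                                  # "argument of length 0"

# Indexing one level deeper reaches an actual data frame
nrow(vdj_list[[1]])                  # 3
```

A quick `is.data.frame(vgm[[1]])` or `str(vgm[[1]], max.level = 1)` on the real object gives the same kind of check.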

vickreiner commented 1 year ago

Hi!

Sorry, I missed the $ sign while typing this.

There is certainly something not quite right with the vgm object.

Could you run print(names(vgm))?

Did you modify the vgm list object after running the VDJ_GEX_matrix function?

Thanks!

meaksu commented 1 year ago

Hi,

Running that line gives me the following:

> print(names(vgm))
[1] "VDJ"            "GEX"            "VDJ.GEX.stats"  "Running params" "sessionInfo"  

I didn't modify the object at all after the function, this is the first function I'm trying to run.

However, I may have gotten the code to run by using vgm[["VDJ"]][[1]]:

> vgm[["VDJ"]][[1]] <- VDJ_clonotype(platypus.version = "v3", VDJ = vgm[["VDJ"]][[1]], clone.strategy = "cdr3.aa", global.clonotype = F, VDJ.VJ.1chain = F, hierarchical = "double.and.single.chains")
Clonotyping strategy: cdr3.aa
Filtered out 7 cells containing more than one VDJ AND VJ chain or two VDJ chains, as these likely correspond to doublets
Found 0 exact matching clones with 3 chains and a frequency of at least 5. These will be used as high confidence clonotypes.
Processing sample 1 of 1
Attempting to merge in 27 aberrant cells
Backing up 10x default clonotyping in columns clonotype_id_10x and clonotype_frequency_10x before updating clonotype_id and clonotype_frequency columns 

Would it be possible that I just have to loop through the samples using vgm[["VDJ"]][[i]] using i from 1 to 64, instead of trying to do vgm[[1]]?
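A minimal sketch of the per-sample loop described in this question, using toy data. `clonotype_one_sample()` and the `cdr3` column are hypothetical stand-ins so the example runs without Platypus; with the real object, the body of the loop would be the `VDJ_clonotype(...)` call shown above applied to `vgm[["VDJ"]][[i]]`.

```r
# Hypothetical stand-in for the per-sample call. It only numbers the distinct
# values of a made-up `cdr3` column and is NOT the Platypus clonotyping logic.
clonotype_one_sample <- function(df) {
  df$clonotype_id <- paste0("clonotype", as.integer(factor(df$cdr3)))
  df
}

# Toy list with one data frame per sample
vdj_list <- list(
  data.frame(barcode = c("a", "b", "c"), cdr3 = c("CARW", "CARW", "CTRG"),
             stringsAsFactors = FALSE),
  data.frame(barcode = c("d", "e"), cdr3 = c("CASS", "CAVR"),
             stringsAsFactors = FALSE)
)

# seq_along() stays safe for an empty list, unlike 1:length(x)
for (i in seq_along(vdj_list)) {
  vdj_list[[i]] <- clonotype_one_sample(vdj_list[[i]])
}

vdj_list[[1]]$clonotype_id   # "clonotype1" "clonotype1" "clonotype2"
```

Because each sample is processed on its own, clonotype IDs produced this way are only comparable within a sample.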

alexyermanos commented 1 year ago

Also, can you please post the code you used to generate the initial VGM with the 64 samples? There is something wrong with the structure.

In response to this "I'm not sure whether I downloaded the package via Github or Cran but the version is Platypus_3.3.6."

Can you try using this function here (i.e. not using the normal package call) - https://github.com/alexyermanos/Platypus/blob/master/R/VDJ_clonotype.R but as Victor mentioned, you'll probably need to use it on each sub-vgm given the weird structure. Normally VGM[[1]] should be all samples in one dataframe.

vickreiner commented 1 year ago

Hi!

Thanks for the additional details. It seems to me you have run the VGM with VDJ.combine = FALSE. This will return a list of dataframes in the VDJ slot. Currently, Platypus functions are written to take a single dataframe and then (if needed) iterate automatically over the unique entries in the sample_id column (e.g. VDJ_clonotype(... global.clonotype = FALSE) will iterate automatically).

I therefore suggest running the VGM again with VDJ.combine = TRUE, which should resolve the issue. Sorry that it took us longer to understand than it should have.
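To illustrate the "single dataframe plus sample_id" layout in base R, here is a rough sketch with toy data. It is not how VDJ_GEX_matrix combines samples internally; the sample names and column values are invented, and `rbind()` only works here because every toy data frame has the same columns.

```r
# Toy per-sample data frames (hypothetical values)
vdj_list <- list(
  s1 = data.frame(barcode = c("a", "b"), Nr_of_VDJ_chains = c(1, 2)),
  s2 = data.frame(barcode = "c", Nr_of_VDJ_chains = 1)
)

# Tag each data frame with its sample name, then stack them into one
tagged <- Map(function(df, id) {
  df$sample_id <- id
  df
}, vdj_list, names(vdj_list))
vdj_combined <- do.call(rbind, tagged)
rownames(vdj_combined) <- NULL

# A function that receives the combined data frame can then iterate
# over the unique sample_id values on its own
cells_per_sample <- vapply(
  unique(vdj_combined$sample_id),
  function(id) sum(vdj_combined$sample_id == id),
  integer(1)
)
cells_per_sample   # s1 = 2, s2 = 1
```

With this layout, `nrow()` and `$` behave as a function written for a single data frame expects.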

meaksu commented 1 year ago

Thanks, this solved it. I actually did have VDJ.combine set to TRUE but I think since the samples were half TCR and half BCR, the combination didn't work. Splitting the samples into separate BCR and TCR objects now allows the samples to be combined.
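One way to organise such a split before building two separate objects is to group sample identifiers by receptor type. The sketch below uses an invented sample sheet; the `sample_id` values and the `receptor` column are assumptions for illustration, and each resulting group would then go into its own VDJ_GEX_matrix run.

```r
# Hypothetical sample sheet: which samples are BCR and which are TCR
samples <- data.frame(
  sample_id = c("s1", "s2", "s3", "s4"),
  receptor  = c("BCR", "TCR", "BCR", "TCR"),
  stringsAsFactors = FALSE
)

# split() groups the sample IDs by receptor type into a named list
by_receptor <- split(samples$sample_id, samples$receptor)

by_receptor$BCR   # "s1" "s3"
by_receptor$TCR   # "s2" "s4"
```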

vickreiner commented 1 year ago

That's great, and indeed the function is only meant to take either T or B cell input. The fact that your run did not crash now leaves us to decide whether to make this a feature or fix it ;)