ncborcherding / scRepertoire

A toolkit for single-cell immune profiling
https://www.borch.dev/uploads/screpertoire/
MIT License
302 stars 49 forks source link

Trouble with expression2List #116

Closed annecar closed 2 years ago

annecar commented 2 years ago

Hi Nick, Thanks for this great package! Very useful.

I am running into an issue right now, wanting to group my data differently using expression2List. I have several metadata columns, and strangely, expression2List seems to work with some columns but not with all of them. I keep getting the error "subscript out of bounds", for example:

combined2 <- expression2List(COVPat_TCR_freq, group = "sample")

clonalOverlap(combined2, cloneCall="aa", method="overlap")
Error in df[[i]] : subscript out of bounds

When I look at combined2, I see the following error message: "Error in as.character.factor(x) : malformed factor"

combined2
$p1_1d
Error in as.character.factor(x) : malformed factor

Other columns contain just numbers, or also text. They give the same error message, "Error in as.character.factor(x) : malformed factor", but the command works and yields an overlap plot:

combined2 <- expression2List(COVPat_TCR_freq, group = "new.cluster.ids")
clonalOverlap(combined2, cloneCall="aa", method="overlap")
Warning message: Removed 6 rows containing missing values (geom_text).

combined2
$Proliferating
Error in as.character.factor(x) : malformed factor

Would you have a clue what is wrong with the metadata columns that do not yield a plot?

Thanks a lot annecar

ncborcherding commented 2 years ago

Hey Annecar,

Thanks for reaching out and happy to help. Could you do me a favor and give me the output of sessionInfo()?

It looks like something is malformed in the generation of the object from expression2List() - so I doubt any of the visualization functions will work until we fix that.

Alternatively, you might try the developmental version of scRepertoire, which no longer requires expression2List(), but can perform most of the analyses on the Seurat object itself. It can be installed with:

devtools::install_github("ncborcherding/scRepertoire@dev")

Looking forward to the sessionInfo() output and some troubleshooting to fix things.

Nick

annecar commented 2 years ago

Dear Nick, Thank you very very much for your quick reply!!

Here is the session info:

sessionInfo(package = NULL)
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] scales_1.1.1 circlize_0.4.13 tidyr_1.1.3 scRepertoire_1.3.2 ggplot2_3.3.3
[6] patchwork_1.1.1 SeuratObject_4.0.1 Seurat_4.0.2 dplyr_1.0.7

loaded via a namespace (and not attached): [1] utf8_1.2.1 reticulate_1.20 tidyselect_1.1.1 [4] htmlwidgets_1.5.3 grid_4.0.4 BiocParallel_1.24.1 [7] Rtsne_0.15 devtools_2.4.2 munsell_0.5.0 [10] codetools_0.2-18 ica_1.0-2 statmod_1.4.36 [13] scran_1.18.5 future_1.21.0 miniUI_0.1.1.1 [16] withr_2.4.2 colorspace_2.0-1 Biobase_2.50.0 [19] ggalluvial_0.12.3 stats4_4.0.4 SingleCellExperiment_1.12.0 [22] ROCR_1.0-11 tensor_1.5 listenv_0.8.0 [25] labeling_0.4.2 MatrixGenerics_1.2.1 GenomeInfoDbData_1.2.4 [28] polyclip_1.10-0 farver_2.1.0 rprojroot_2.0.2 [31] parallelly_1.25.0 vctrs_0.3.8 generics_0.1.0 [34] xfun_0.23 R6_2.5.0 doParallel_1.0.16 [37] GenomeInfoDb_1.26.2 rsvd_1.0.5 VGAM_1.1-5 [40] locfit_1.5-9.4 bitops_1.0-7 spatstat.utils_2.1-0 [43] cachem_1.0.5 DelayedArray_0.16.2 assertthat_0.2.1 [46] promises_1.2.0.1 gtable_0.3.0 beachmat_2.6.4 [49] globals_0.14.0 processx_3.5.2 goftest_1.2-2 [52] rlang_0.4.11 GlobalOptions_0.1.2 splines_4.0.4 [55] lazyeval_0.2.2 spatstat.geom_2.1-0 BiocManager_1.30.15 [58] reshape2_1.4.4 abind_1.4-5 httpuv_1.6.1 [61] tools_4.0.4 usethis_2.0.1 cubature_2.0.4.2 [64] ellipsis_0.3.2 spatstat.core_2.1-2 RColorBrewer_1.1-2 [67] BiocGenerics_0.36.0 sessioninfo_1.1.1 ggridges_0.5.3 [70] Rcpp_1.0.6 plyr_1.8.6 sparseMatrixStats_1.2.1 [73] zlibbioc_1.36.0 purrr_0.3.4 RCurl_1.98-1.3 [76] ps_1.6.0 prettyunits_1.1.1 rpart_4.1-15 [79] deldir_0.2-10 pbapply_1.4-3 cowplot_1.1.1 [82] S4Vectors_0.28.1 zoo_1.8-9 SummarizedExperiment_1.20.0 [85] ggrepel_0.9.1 cluster_2.1.2 fs_1.5.0 [88] tinytex_0.32 magrittr_2.0.1 data.table_1.14.0 [91] scattermore_0.7 SparseM_1.81 lmtest_0.9-38 [94] RANN_2.6.1 truncdist_1.0-2 fitdistrplus_1.1-5 [97] matrixStats_0.59.0 pkgload_1.2.1 gsl_2.1-6 [100] mime_0.10 xtable_1.8-4 shape_1.4.6 [103] IRanges_2.24.1 gridExtra_2.3 testthat_3.0.2 [106] compiler_4.0.4 tibble_3.1.2 KernSmooth_2.23-20 [109] crayon_1.4.1 htmltools_0.5.1.1 mgcv_1.8-36 [112] later_1.2.0 powerTCR_1.10.3 DBI_1.1.1 [115] MASS_7.3-54 Matrix_1.3-4 permute_0.9-5 [118] 
cli_2.5.0 parallel_4.0.4 evd_2.3-3 [121] igraph_1.2.6 GenomicRanges_1.42.0 pkgconfig_2.0.3 [124] plotly_4.9.3 scuttle_1.0.4 spatstat.sparse_2.0-0 [127] foreach_1.5.1 dqrng_0.3.0 stringdist_0.9.7 [130] XVector_0.30.0 stringr_1.4.0 callr_3.7.0 [133] digest_0.6.27 sctransform_0.3.2 RcppAnnoy_0.0.18 [136] vegan_2.5-7 spatstat.data_2.1-0 leiden_0.3.8 [139] uwot_0.1.10 edgeR_3.32.1 DelayedMatrixStats_1.12.3 [142] evmix_2.12 shiny_1.6.0 lifecycle_1.0.0 [145] nlme_3.1-152 jsonlite_1.7.2 BiocNeighbors_1.8.2 [148] desc_1.3.0 viridisLite_0.4.0 limma_3.46.0 [151] fansi_0.5.0 pillar_1.6.1 lattice_0.20-44 [154] fastmap_1.1.0 httr_1.4.2 pkgbuild_1.2.0 [157] survival_3.2-11 glue_1.4.2 remotes_2.4.0 [160] png_0.1-7 iterators_1.0.13 bluster_1.0.0 [163] stringi_1.6.2 BiocSingular_1.6.0 memoise_2.0.0 [166] irlba_2.3.3 future.apply_1.7.0

I will also install the dev version. Thanks for looking into it!! Best, Anne

Anne Eugster, PhD Scientist AG Bonifacio

Technische Universität Dresden Center for Molecular and Cellular Bioengineering (CMCB) Center for Regenerative Therapies (CRTD) Fetscherstr. 105 01307 Dresden

Tel: +49 (0)351-458-821 41 Fax: +49 (0)351-458-821 09 Email: @.**@.> Webpage: http://www.tu-dresden.de/cmcb/crtd/


annecar commented 2 years ago

Hi again! Now I am running into trouble installing the dev version. I get the following error:

devtools::install_github("ncborcherding/scRepertoire@dev")
Downloading GitHub repo ncborcherding/scRepertoire@dev
✓ checking for file ‘/private/var/folders/jt/w6858pkd7n752p_f0wqv92sc0000gq/T/RtmpoydW74/remotesbf3674908dc8/ncborcherding-scRepertoire-851c70c/DESCRIPTION’ ...
─ preparing ‘scRepertoire’:
✓ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘scRepertoire_1.3.4.tar.gz’

Is there something you can do? Thanks a lot, annecar

ncborcherding commented 2 years ago

Hey Anne,

Dev version now is fixed and installable. Apologies.

I will keep this thread open though as I want to know if this fixes the expression2List() issue.

Thanks, Nick

annecar commented 2 years ago

Hi Nick,

Thanks for fixing the dev version; I managed to install it. Also, using various identities, running either abundanceContig() or clonalOverlap() works without running expression2List(), which is what you predicted.

Nevertheless, I still run into some issues. Using some identities (or columns from the metadata, if you like), I can run the two functions perfectly; with others I run into the following error:

Idents(object = COVPat_TCR_freq) <- "celltype_Patient"

abundance <- abundanceContig(COVPat_TCR_freq, cloneCall = "aa", exportTable = T)
Error in df[[i]] : subscript out of bounds

Do you see what the problem is this time?

Thanks for your time and help.

All the best Anne


ncborcherding commented 2 years ago

Hey Anne,

I just encountered something similar and it might be the underlying problem - do the columns that do not work have NAs in them? I am wondering if the NA values are throwing the organization system off.

I will have time this weekend to test the idea more, but wanted to see if you saw that in your data?
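A quick way to check for NAs, sketched on a made-up data frame standing in for the Seurat meta data (the values here are invented for illustration; only the column names celltype_Patient and Patient come from this thread):

```r
# Toy stand-in for a Seurat meta.data data frame (values are invented).
meta <- data.frame(
  celltype_Patient = c("Proliferating1", NA, "T Central Memory2"),
  Patient = c("1", "1", "2"),
  stringsAsFactors = FALSE
)

# Number of NA entries in every column.
na_per_column <- colSums(is.na(meta))
print(na_per_column)

# Frequency table of one column, with NA shown as its own category if present.
print(table(meta$celltype_Patient, useNA = "ifany"))
```

Any column with a non-zero count in na_per_column would be a candidate for the behaviour described above.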

Nick

annecar commented 2 years ago

Hey Nick, no, definitely no NAs. The column is patient info coupled to cell type, so an ID containing letters and numbers. Thanks for looking into it. All the best, Anne


annecar commented 2 years ago

Hi Nick, I am not sure my email reached you, so in any case here it is once more: no, I do not have NAs in the columns. The columns contain metadata present in each cell (i.e. attributed cell type and a sample ID). Thanks for looking into this! Best, Anne


ncborcherding commented 2 years ago

Hey Anne,

I want to apologize - paternity leave has had me away from work. Today I modified some of the pre-viz checks to hopefully better accommodate varied data. If you get a chance to reinstall the dev version, maybe give it a new go?

If not, if you wouldn't mind emailing me a copy of the expression2List() .rds, I would appreciate getting to the bottom of this.

Thanks, Nick

annecar commented 2 years ago

Hey Nick, I have updated the dev version again. Still the same trouble.

I also tried running expression2List(), but this seems buggy as well. Here is what I did:

COVPat_Patient <- expression2List(COVPat_TCR_freq, split.by = "celltype_Patient")

head(COVPat_Patient)
$Proliferating1
Error in as.character.factor(x) : malformed factor

Any clue? Thanks, Anne


annecar commented 2 years ago

Hey Nick, I have one small request: is there an easy way to export a table with all enriched clonotypes (i.e. found in more than 1 cell), their AA sequences and their occurrence in the various samples? Thanks, Anne


ncborcherding commented 2 years ago

Hey Anne,

Sorry to hear it still isn't working, would you mind emailing me a sample of your Seurat Object or the meta data?

In terms of the enrich filter, you could do something like:

summary <- table(seuratObject$CTaa, seuratObject$sample.id)
summary <- summary[rowSums(summary) >= 2, ] # keep only clonotypes found in at least 2 cells
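For illustration, here is the same idea run end to end on a toy data frame, including the export step. This is only a sketch: the clonotype strings and sample names are made up, the data frame stands in for the Seurat meta data, and the output file name is hypothetical.

```r
# Toy per-cell table (invented values) with the CTaa and sample.id columns.
cells <- data.frame(
  CTaa = c("CASSL_CAVRD", "CASSL_CAVRD", "CASSP_CAMRE",
           "CASSQ_CALSE", "CASSQ_CALSE", "CASSQ_CALSE"),
  sample.id = c("P1", "P2", "P1", "P2", "P2", "P2")
)

# Clonotype-by-sample count table.
counts <- table(cells$CTaa, cells$sample.id)

# Keep clonotypes seen in at least 2 cells; drop = FALSE keeps the
# two-dimensional shape even if only one clonotype passes the filter.
expanded <- counts[rowSums(counts) >= 2, , drop = FALSE]

# Convert to a wide data frame (one column per sample) and write it out.
out <- as.data.frame.matrix(expanded)
out$CTaa <- rownames(out)
write.csv(out, file.path(tempdir(), "expanded_clonotypes.csv"), row.names = FALSE)
print(out)
```

The drop = FALSE argument is a small addition to the snippet above: without it, a filter that leaves a single clonotype returns a plain vector instead of a table.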

Hope that helps, Nick

annecar commented 2 years ago

can I share the object on Google Drive? It is too large to send.


ncborcherding commented 2 years ago

Yeah absolutely.

annecar commented 2 years ago

did you get it?


ncborcherding commented 2 years ago

Not yet, no.

ncborcherding commented 2 years ago

Hey Anne,

Something is up with your data - I wasn't able to use readRDS(), getting an error message:

Error in readRDS("~/Desktop/p1_1d.rds") : unknown input format

but I was able to load it using:

load("p1_1d.rds")

I can see the object in my environment, can get RNA assay data, but I cannot view the meta data. Both of these:

data[[]]
data@meta.data

result in the error you mentioned above:

Error in as.character.factor(x) : malformed factor

So what is happening here is that scRepertoire, and specifically anything trying to use the meta data from your Seurat object (like expression2List()), is not working because there is an error in referencing the meta data. The error itself is likely due to one of the meta data vectors being a malformed factor (for example, not all the levels were assigned properly).
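To make "malformed factor" concrete, here is a minimal sketch in base R. It is an assumption about the kind of corruption involved, not a reconstruction of the actual columns: the factor is built by hand so that one integer code points past the available levels, which is one situation in which as.character() on a factor fails.

```r
# A well-formed factor: every integer code has a matching level.
ok <- factor(c("mild", "severe", "mild"))
print(as.character(ok))

# Hand-built factor (illustrative only): code 3 has no matching level,
# because only two levels are defined.
bad <- structure(c(1L, 3L, 2L),
                 levels = c("mild", "severe"),
                 class = "factor")

# Coercing it to character fails; capture the error message instead of stopping.
msg <- tryCatch(as.character(bad), error = function(e) conditionMessage(e))
print(msg)
```

Printing a data frame converts factor columns to character, which would explain why simply viewing the meta data can already trigger the error.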

Do you get the same errors on your end if you try to save the meta data?

Nick

annecar commented 2 years ago

Yes, I do, exactly: I cannot view the metadata. Have you seen this with objects before?


ncborcherding commented 2 years ago

Hey Anne,

No, I have never actually seen this before!

Looking at your data, though, I wrote the following code that fixes the issue. There were issues with the variables "mild_sample", "severe_sample" and "critical_sample". I do not know exactly what they show because I literally cannot access them, but I can remove them and then re-add the meta data minus those 3 columns:

meta <- slot(p1_1d, "meta.data")
x <- sapply(meta, is.factor)
x <- colnames(meta)[x]

for (i in seq_along(x)) {
  print(meta[,x[i]][1:100])
  levels(meta[,x[i]])
}

#Errors out on loop 19 or "mild_sample"
meta <- meta[,-which(colnames(meta) == "mild_sample")]

x <- sapply(meta, is.factor)
x <- colnames(meta)[x]

for (i in seq_along(x)) {
  print(meta[,x[i]][1:100])
  levels(meta[,x[i]])
}

meta <- meta[,-which(colnames(meta) == "severe_sample")]

x <- sapply(meta, is.factor)
x <- colnames(meta)[x]

for (i in seq_along(x)) {
  print(meta[,x[i]][1:100])
  levels(meta[,x[i]])
}

meta <- meta[,-which(colnames(meta) == "critical_sample")]

x <- sapply(meta, is.factor)
x <- colnames(meta)[x]

for (i in seq_along(x)) {
  print(meta[,x[i]][1:100])
  levels(meta[,x[i]])
}

slot(p1_1d, "meta.data") <- meta
saveRDS(p1_1d, file = "p1_1d.rds")
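The repeated check-and-drop steps above could also be done in one pass. The following is a sketch on a toy data frame, not the code that was run on the real object: the column names, the values and the helper is_malformed() are all hypothetical, and the bad column is built by hand for the demonstration.

```r
# Toy stand-in for the meta data (column names and values are invented).
meta <- data.frame(
  cluster = factor(c("T Central Memory", "Proliferating", "Proliferating")),
  nCount  = c(1200, 950, 1810)
)

# Hand-built malformed factor column: code 3 has no matching level.
meta$bad_sample <- structure(c(1L, 3L, 2L),
                             levels = c("mild", "severe"),
                             class = "factor")

# Hypothetical helper: TRUE when a factor column cannot be coerced to character.
is_malformed <- function(col) {
  is.factor(col) &&
    inherits(tryCatch(as.character(col), error = function(e) e), "error")
}

# Names of all malformed factor columns, found in a single pass.
bad_cols <- names(meta)[vapply(meta, is_malformed, logical(1))]
print(bad_cols)

# Keep every column except the malformed ones.
meta_clean <- meta[, setdiff(names(meta), bad_cols), drop = FALSE]
print(meta_clean)
```

With a real Seurat object, meta would come from slot(p1_1d, "meta.data") and meta_clean would be written back the same way as in the code above.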

Also thanks for getting me your data - I went ahead and deleted your link in the comment so no one else has access.

Hopefully, this works more smoothly, I was able to call expression2List() after the above code with no issue.

Nick

annecar commented 2 years ago

Thanks so much, Nick, for taking your time. I will have a closer look tomorrow, now is too late… I will get back to you. Thanks again so much. anne


annecar commented 2 years ago

Hi Nick, thanks a lot again, it all worked perfectly. I can now access the meta data and run expression2List. Nevertheless, using some meta data columns with some functions (such as abundance or clonal diversity), I encounter the error "subscript out of bounds":

clonalOverlap(COVPat_TCR_freq, cloneCall="aa", method="overlap")
Error in df[[i]] : subscript out of bounds

Is that because there are just too many meta data types (it would split the data into 48 types)?

Further, I am wondering if you also plan to build in Hill numbers?

Thank you again and have a good weekend. Best, Anne


ncborcherding commented 2 years ago

Hey Anne,

Apologies for being so late - my first child was born on Nov 5th and it has been a whirlwind since. Just getting back to it.

What are you using to call expression2List()? Can you show me? I am wondering if the variable you are using to split the contig list is somehow affecting this.

I was able to get clonalOverlap() to work using:

x <- expression2List(p1_1d, split.by = "new.cluster.ids")
clonalOverlap(x, cloneCall="aa", method="overlap")

In terms of the Hill number - do you have a good reference to read up on it? I am less familiar with it, but it would be an easy addition to the clonalDiversity() function I think.
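For context, the Hill number of order q is (sum of p_i^q)^(1/(1-q)), where p_i is the relative frequency of clonotype i; at q = 1 it is taken as the limit, the exponential of the Shannon entropy. Below is a minimal base R sketch of that formula. It is not scRepertoire code: the function name hill_number() and the clone counts are made up for illustration.

```r
# Hypothetical helper: Hill number of order q from a vector of clone counts.
hill_number <- function(counts, q) {
  counts <- counts[counts > 0]        # ignore clonotypes with zero cells
  p <- counts / sum(counts)           # relative clonotype frequencies
  if (isTRUE(all.equal(q, 1))) {
    exp(-sum(p * log(p)))             # limit q -> 1: exponential of Shannon entropy
  } else {
    sum(p^q)^(1 / (1 - q))
  }
}

# Invented clone sizes for one sample.
clone_counts <- c(10, 5, 3, 1, 1)

# Diversity profile over a few orders: q = 0 is richness,
# q = 2 is the inverse Simpson index.
profile <- sapply(c(0, 1, 2), function(q) hill_number(clone_counts, q))
print(profile)
```

Higher orders of q put more weight on the large clones, so plotting the profile against q shows how dominated a repertoire is by expanded clonotypes.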

Thanks, Nick

annecar commented 2 years ago

Hi Nick

Good news, congratulations on becoming a dad! I wish you all the best and a nice start with your family, enjoy! Thank you very much for taking care of my problems despite the important events in your life!

Regarding the variable i am using, it is defined as follows:

@.**@.>$celltype_Patient <- @.**@.$new.cluster.ids>, @.**@.>$Patient)

then i selected the variable :

Idents(object = COVPat_TCR_freq) <- "celltype_Patient"
quantContig(COVPat_TCR_freq, cloneCall="aa", scale = T)

then the error appears:

Error in df[[i]] : subscript out of bounds

Maybe you have an idea…

Regarding the Hill numbers: please see a publication by Victor Greiff attached. You will find all you need in there.

Thanks a lot for looking into all these things!

All the best for now Anne


ncborcherding commented 2 years ago

Hey Anne,

Very strange - I am able to call the functions using your code above:

[Image: Screen Shot 2021-12-01 at 4 59 56 AM]

I know I am only getting p1_1d of the total data - are there any issues with viewing the levels of "celltype_Patient" within Seurat, similar to the problems above? Or any NAs (NAs should be handled, but I guess they might be an issue)?
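One way to run this kind of check is sketched below in base R. It uses a toy data frame as a stand-in for the Seurat metadata (with a real object, something like `meta <- COVPat_TCR_freq[[]]`); the example values and the unused level are made up. It only counts NAs and empty factor levels in the grouping column, and it is an assumption that either of these relates to the error in this thread.

```r
# Toy stand-in for the Seurat metadata table; values are made up.
meta <- data.frame(
  celltype_Patient = factor(
    c("Proliferating1", "T Effector-like1", NA, "Proliferating1"),
    levels = c("Proliferating1", "T Effector-like1", "Unused9")
  ),
  CTaa = c("CASSL_CAVR", NA, "CASSP_CAVN", "CASSL_CAVR"),
  stringsAsFactors = FALSE
)

group_col <- meta$celltype_Patient

n_missing    <- sum(is.na(group_col))               # cells without a group label
per_level    <- table(group_col, useNA = "no")      # cells per factor level
empty_levels <- names(per_level)[per_level == 0]    # levels that contain no cells

# Rebuilding the factor from its character values drops the empty levels.
clean_col <- factor(as.character(group_col))

n_missing
empty_levels
levels(clean_col)
```

If the grouping column does contain NAs or empty levels, writing the cleaned column back to the metadata before grouping is a cheap thing to try.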

Thanks, Nick

annecar commented 2 years ago

Hi again, no, I can view the levels without any problem.

levels(COVPat_TCR_freq)
 [1] "T Effector Memory1"  "T Effector-like1"    "T Central Memory1"   "Proliferating1"      "T Effector-like9"
 [6] "T Central Memory9"   "T Effector Memory9"  "Proliferating9"      "T Effector-like10"   "T Effector-like11"
[11] "T Central Memory10"  "T Central Memory11"  "T Effector Memory11" "Proliferating10"     "T Effector Memory10"
[16] "Proliferating11"     "T Effector-like2"    "T Effector Memory2"  "T Central Memory2"   "Proliferating2"
[21] "T Effector-like3"    "T Effector Memory3"  "T Central Memory3"   "Proliferating3"      "Proliferating4"
[26] "T Central Memory4"   "T Effector Memory4"  "T Effector-like4"    "T Central Memory5"   "T Effector Memory5"
[31] "T Effector-like5"    "Proliferating5"      "T Effector-like6"    "T Central Memory6"   "T Effector Memory6"
[36] "Proliferating6"      "T Effector-like7"    "T Central Memory7"   "T Effector Memory7"  "Proliferating7"
[41] "T Central Memory8"   "T Effector-like8"    "T Effector Memory8"  "Proliferating8"

Should be OK?!

ncborcherding commented 2 years ago

Hey Anne,

It looks OK to me - quick question, what is COVPat_TCR_freq? Is it a Seurat or a list object? I will try to recreate the problem from my end with the sample data.

annecar commented 2 years ago

Hi Nick,

it is a Seurat Object:

COVPat_TCR_freq
An object of class Seurat
17802 features across 56334 samples within 1 assay
Active assay: RNA (17802 features, 0 variable features)
 2 dimensional reductions calculated: pca, umap

Does that make sense?

annecar commented 2 years ago

Hi Nick, a quick question: your Bioconductor page http://bioconductor.org/packages/release/bioc/vignettes/scRepertoire/inst/doc/vignette.html shows scRepertoire version 1.4, but it cannot be installed yet. I tried both @.***")

and devtools::install_github("ncborcherding/scRepertoire")

Is it just a matter of time for 1.4 to be available? Thanks!! Anne

annecar commented 2 years ago

Hi Nick! Happy New Year first of all! I hope you had a good start.

I would like to export my clonotype data, grouped and with counts, with aa AND gene info. I thought of using abundanceContig, but this only allows me to export using cloneCall = "aa" or cloneCall = "gene + nucleotide". What I need is gene and aa. Is there an easy answer to this?

Sorry again to bother you, and thank you very much for your help!! Anne

ncborcherding commented 2 years ago

Hey Anne,

Sorry this thread got away from me - hope the New Year is good for you as well.

  1. The Bioconductor versioning is weird: although I have not committed a new version, the version switched to 1.4. The current GitHub version (1.3.5 - both master and dev) actually has way more updates.

  2. In terms of your question on summarizing - this is how I would do it:

library(dplyr)
bound <- bind_rows(combined)

summary <- bound %>%
    group_by(sample, ID, CTaa, CTstrict) %>%
    count()

Let me know if you have any other questions, Nick

annecar commented 2 years ago

Good morning Nick, Thanks a lot for your answer.

-OK, all clear about the version.

-As for the export: your suggestion works fine with the combined object, but not with the Seurat object, obviously, as it is not a data frame. This means that I can export some of the metadata, but not the cell type info gained in the Seurat analysis. Any idea there?

-Also, and maybe people have asked you this already, I am wondering whether you are also working on something like TCRdist (https://tcrdist3.readthedocs.io/en/latest/welcome.html), which allows grouping closely related clonotypes in a similarity network? The reason I want to export my data is to run this… It would be a nice addition to have something similar in scRepertoire, allowing users to group TCRs and to predict specificities by database searches.

Thanks again for all your help.

Anne

ncborcherding commented 2 years ago

Hey Anne,

As for the export: your suggestion works fine with the combined object, but not with the Seurat object, obviously, as it is not a data frame. This means that I can export some of the metadata, but not the cell type info gained in the Seurat analysis. Any idea there?

My bad - you can use the same approach using the Seurat meta data:

library(dplyr)
meta <- Seurat.Obj[[]]

# replace "meta variables...." with the metadata columns to group by
summary <- meta %>%
    group_by(meta variables...., CTaa, CTstrict) %>%
    count()
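To make the export step concrete, here is a small sketch that runs on a toy data frame standing in for `Seurat.Obj[[]]`. The `celltype` column, the clonotype strings, the `n_cells` column name and the output file name are all hypothetical placeholders; only the `CTaa` and `CTstrict` column names come from the thread.

```r
library(dplyr)

# Toy stand-in for `Seurat.Obj[[]]`; all values below are placeholders.
meta <- data.frame(
  celltype = c("Proliferating", "Proliferating",
               "T Central Memory", "T Central Memory"),
  CTaa     = c("CAVR_CASSL", "CAVR_CASSL", "CAVN_CASSP", NA),
  CTstrict = c("geneA_ntA", "geneA_ntA", "geneB_ntB", NA),
  stringsAsFactors = FALSE
)

# Count cells per cell type and clonotype, keeping both aa and gene info.
summary_tbl <- meta %>%
  filter(!is.na(CTaa)) %>%                          # drop cells without a TCR
  count(celltype, CTaa, CTstrict, name = "n_cells")

# Write the table to disk for use in external tools.
out_file <- file.path(tempdir(), "clonotype_summary.csv")
write.csv(summary_tbl, out_file, row.names = FALSE)
```

`count()` is shorthand for `group_by()` followed by a tally, so the result has one row per group with the number of cells in it.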

Also, and maybe people have asked you this already, I am wondering whether you are also working on something like TCRdist (https://tcrdist3.readthedocs.io/en/latest/welcome.html), which allows grouping closely related clonotypes in a similarity network?

There is an edit distance-based clustering function, clusterTCR(), that is similar to tcrdist in terms of approach.
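As a rough illustration of what edit-distance clustering means, here is a base R sketch. It is not how clusterTCR() is implemented: the toy CDR3 sequences, the single-linkage choice and the threshold of one edit are all assumptions made for the example.

```r
# Toy CDR3 amino acid sequences (made up for illustration).
seqs <- c("CASSLGQGYEQYF", "CASSLGQGYEQFF",
          "CASSPGTGELFF",  "CASSPGTGELYF",
          "CAWSVDRGNTIYF")

# Pairwise Levenshtein (edit) distances between all sequences.
d <- utils::adist(seqs)
dimnames(d) <- list(seqs, seqs)

# Single-linkage clustering: sequences connected through a chain of
# neighbours that are each within 1 edit end up in the same cluster.
hc <- hclust(as.dist(d), method = "single")
clusters <- cutree(hc, h = 1)

split(names(clusters), clusters)
```

Cutting a single-linkage tree at height h is equivalent to taking the connected components of a graph that links every pair of sequences at distance h or less, which is the similarity-network view mentioned in the question.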

Another option is to check out Trex (https://github.com/ncborcherding/Trex/tree/dev), the new package I am about to submit a preprint on. It takes a multilayered approach to vectorizing TCR sequences, and the result can then be used to cluster TCRs. You can use edit distance as one of the layers; there is also an autoencoder, similar to TESSA (https://github.com/jcao89757/TESSA).

In the interest of other users - I am going to close this issue. I was never able to replicate the data structure error, outside of the initial filtering. Please feel free to continue comments here or via email @ ncborch@gmail.com.

annecar commented 2 years ago

Hi Nick, perfect, that did it!!

Your suggestions are very good. I will first explore the function you implemented in scRepertoire and then also look at your Trex. I might get back to you in case I run into problems, but I will leave you in peace for the moment! Thanks again for everything! Anne
