phbradley / conga

Clonotype Neighbor Graph Analysis
MIT License

running Conga on integrated Seurat Gex #61

Open SamWell16 opened 1 year ago

SamWell16 commented 1 year ago

I am trying to run Conga on my integrated Seurat object. I have already downsampled and integrated four CITE-seq datasets into one Seurat object, and I have a separate TCR CSV file for each sample. I tried to use your merge script (python ~/conga/scripts/merge_samples.py), but I received the error below: "Reducing to the 0 barcodes (out of 3971) with paired TCR sequence data", which is followed by a few more errors.

When I run Conga on the individual samples, it works with no problem. It seems that when the TCR CSV files are merged to create the TSV file, the clones no longer match the GEX data. I would really appreciate your help fixing this. I am happy to provide more detailed information if needed.

Here is the full error:

anndata 0.9.1
scanpy 1.9.3

PIL 9.5.0
conga NA
cycler 0.10.0
cython_runtime NA
dateutil 2.8.2
google NA
h5py 3.6.0
igraph 0.10.4
importlib_resources NA
joblib 1.2.0
kiwisolver 1.4.4
leidenalg 0.9.1
llvmlite 0.40.0
matplotlib 3.7.1
mpl_toolkits NA
natsort 8.3.1
numba 0.57.0
numpy 1.23.2
packaging 23.1
pandas 2.0.1
patsy 0.5.3
pyparsing 3.0.9
pytz 2023.3
scipy 1.10.1
session_info 1.0.0
setuptools 67.7.2
six 1.16.0
sklearn 1.2.2
statsmodels 0.14.0
texttable 1.6.7
threadpoolctl 3.1.0
typing_extensions NA
wcwidth 0.2.5
zipp NA

Python 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:01:13) [Clang 14.0.6 ]
macOS-13.3.1-arm64-arm-64bit

Session information updated at 2023-05-19 00:40
reading: merged_BAL_gex.h5ad of type h5ad
/Users/sajadpro/conga/conga/preprocess.py:233: DeprecationWarning: Use is_view instead of isview, isview will be removed in the future.
  if adata.isview: # ran into trouble with AnnData views vs copies
total barcodes: 3971 (3971, 17795)
reading: merged_BAL_clones.tsv
reading: merged_BAL_clones_AB.dist_50_kpcs
Reducing to the 0 barcodes (out of 3971) with paired TCR sequence data
/Users/sajadpro/Library/r-miniconda-arm64/envs/r-reticulate/lib/python3.8/site-packages/anndata/_core/anndata.py:117: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)
Traceback (most recent call last):
  File "/Users/sajadpro/conga/scripts/run_conga.py", line 373, in <module>
    adata = conga.preprocess.read_dataset(
  File "/Users/sajadpro/conga/conga/preprocess.py", line 405, in read_dataset
    store_tcrs_in_adata( adata, tcrs )
  File "/Users/sajadpro/conga/conga/preprocess.py", line 180, in store_tcrs_in_adata
    adata.obs['cdr3a_nucseq'] = adata.obs.cdr3a_nucseq.str.lower()
  File "/Users/sajadpro/Library/r-miniconda-arm64/envs/r-reticulate/lib/python3.8/site-packages/pandas/core/generic.py", line 5989, in __getattr__
    return object.__getattribute__(self, name)
  File "/Users/sajadpro/Library/r-miniconda-arm64/envs/r-reticulate/lib/python3.8/site-packages/pandas/core/accessor.py", line 224, in __get__
    accessor_obj = self._accessor(obj)
  File "/Users/sajadpro/Library/r-miniconda-arm64/envs/r-reticulate/lib/python3.8/site-packages/pandas/core/strings/accessor.py", line 181, in __init__
    self._inferred_dtype = self._validate(data)
  File "/Users/sajadpro/Library/r-miniconda-arm64/envs/r-reticulate/lib/python3.8/site-packages/pandas/core/strings/accessor.py", line 235, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!
(conga_new_env) Sajad-Pro:tcrdist_cpp sajadpro$

sschattgen commented 1 year ago

Hi, you're correct that the issue is due to a misalignment between the barcodes in the TCR and GEX data. The GEX barcodes were likely altered during integration (e.g. a prefix or an additional suffix was added). Take a look at the conga.tcrdist.make_10x_clone_file.make_10x_clone_file_batch function, which was intended for adjusting barcodes in cases like yours.

This issue covers most of the details: https://github.com/phbradley/conga/issues/28#issuecomment-943512846
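Before adjusting anything, it may help to confirm that a barcode mismatch really is the cause. The following is a minimal diagnostic sketch in plain Python, not part of conga: the helper names (barcode_overlap, core_barcode) and the example barcodes are hypothetical, and it assumes 10x-style barcodes with a 16-nucleotide core, plus a "sample1_"-style prefix of the kind that integration can add. You would substitute the barcodes from your own GEX object and TCR files.

```python
import re


def barcode_overlap(gex_barcodes, tcr_barcodes):
    """Return (n_shared, n_gex_only, n_tcr_only) for two barcode collections."""
    gex, tcr = set(gex_barcodes), set(tcr_barcodes)
    return len(gex & tcr), len(gex - tcr), len(tcr - gex)


def core_barcode(barcode):
    """Return the first run of 16 nucleotides, dropping any prefix or suffix.

    Assumes a 10x-style 16-nt cell barcode; returns the input unchanged
    if no such run is found.
    """
    match = re.search(r"[ACGT]{16}", barcode)
    return match.group(0) if match else barcode


# Hypothetical example data: the GEX barcodes gained a sample prefix
# during integration, while the TCR barcodes are unmodified.
gex_barcodes = [
    "sample1_AAACCTGAGAAACCAT-1",
    "sample1_AAACCTGAGAAACCGC-1",
    "sample2_AAACCTGAGAAACCAT-1",
]
tcr_barcodes = ["AAACCTGAGAAACCAT-1", "AAACCTGAGAAACCGC-1"]

# Raw comparison: nothing matches, as in the "0 barcodes" message.
raw = barcode_overlap(gex_barcodes, tcr_barcodes)
print("raw overlap:", raw)

# Comparison on the 16-nt core only: if this overlap is large, the
# barcodes differ only by an added prefix or suffix.
core = barcode_overlap(
    [core_barcode(b) for b in gex_barcodes],
    [core_barcode(b) for b in tcr_barcodes],
)
print("core overlap:", core)
```

If the raw overlap is zero but the core overlap is high, the two files differ only in barcode decoration. Note that the core comparison is for diagnosis only: the same core barcode can occur in more than one sample (as with sample1 and sample2 above), so the actual fix needs to apply the same per-sample prefix or suffix to both the GEX and TCR barcodes rather than stripping it.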

Stefan