load_vdj TypeError - Githubissues

bio-liucheng commented 4 years ago

Hi developer:

I got this error when I loaded TCR info into adata.

adata = pyvdj.load_vdj(samples, adata, obs_col='vdj_obs', cellranger=3)
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\L\Downloads\pyVDJ-0.1.1\pyvdj\load_vdj.py", line 66, in load_vdj
    cat_df['productive_all'].replace(to_replace=product_dict, inplace=True)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\series.py", line 4178, in replace
    method=method,
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\generic.py", line 6646, in replace
    to_replace, value, inplace=inplace, limit=limit, regex=regex
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\series.py", line 4178, in replace
    method=method,
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\generic.py", line 6699, in replace
    regex=regex,
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\internals\managers.py", line 613, in replace_list
    masks = [comp(s, regex) for i, s in enumerate(src_list)]
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\internals\managers.py", line 613, in
    masks = [comp(s, regex) for i, s in enumerate(src_list)]
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\internals\managers.py", line 611, in comp
    return _compare_or_regex_search(values, s, regex)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\internals\managers.py", line 1936, in _compare_or_regex_search
    f"Cannot compare types {repr(type_names[0])} and {repr(type_names[1])}"
TypeError: Cannot compare types 'ndarray(dtype=object)' and 'str'

How can I fix it? I have tried pyVDJ-0.1.1 and pyVDJ-0.1.2, but I get the same error with both.

By the way, in Cell Ranger v3 some cells contain more than two chains. Even cells with paired productive TRA/TRB chains may have one more productive or non-productive chain. Do these cells work normally, and are they treated as productive cells?

Thank you.

Cheng

veghp commented 4 years ago

Either something is wrong with the input data, or pandas behaviour changed (again). pandas v1.0.0 came out recently and they may have changed this function. Perhaps you can try installing pandas v0.25.1 in your environment (and use pyVDJ 0.1.2) and see if it works. https://github.com/veghp/pyVDJ#dependencies

As for the second question, there are two related annotations: all_productive and any_productive. These are not used by pyVDJ for filtering cells. Also there is no data on which chains are part of the same TCR. So these annotations are more useful as indicators: some cell types may have only nonproductive chains.
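To make the two indicator annotations concrete, here is a minimal sketch of how per-cell "any productive" and "all productive" flags could be derived from a contig table. It is an illustration under stated assumptions, not pyVDJ's actual implementation: the toy data is made up, the column names `barcode`, `chain` and `productive` are assumed to match Cell Ranger's contig annotation file, and the `productive` values are assumed to arrive as the strings 'True', 'False' or 'None'.

```python
import pandas as pd

# Made-up contig table; column names are assumed to follow
# Cell Ranger's contig annotation file.
contigs = pd.DataFrame({
    "barcode": ["AAAC-1", "AAAC-1", "AAAC-1", "CCGT-1", "CCGT-1", "GGTA-1"],
    "chain": ["TRA", "TRB", "TRB", "TRA", "TRB", "TRB"],
    "productive": ["True", "True", "False", "True", "True", "None"],
})

# Assumption: 'productive' is stored as strings. Anything other than
# 'True' (including 'None') is counted as non-productive here.
is_productive = contigs["productive"].astype(str).eq("True")

# Collapse contigs to one row per cell barcode.
grouped = is_productive.groupby(contigs["barcode"])
per_cell = pd.DataFrame({
    "any_productive": grouped.any(),  # at least one productive chain
    "all_productive": grouped.all(),  # every chain is productive
})
print(per_cell)
```

In this toy table, a cell with a productive TRA/TRB pair plus one extra non-productive chain ends up with any_productive set and all_productive unset, which is the situation asked about above.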

nicoac commented 4 years ago

I'm having what I think is a similar issue. I've used pandas v0.25.1 and pyVDJ 0.1.2 and it doesn't make a difference as far as I can tell.

In [53]: adata = pyvdj.load_vdj(samples, adata, obs_col='vdj_obs', cellranger=3)
Traceback (most recent call last):

  File "<ipython-input-53-27eb31acede1>", line 1, in <module>
    adata = pyvdj.load_vdj(samples, adata, obs_col='vdj_obs', cellranger=3)

  File "C:\Anaconda\lib\site-packages\pyvdj\load_vdj.py", line 87, in load_vdj
    if adata == None:

  File "C:\Anaconda\lib\site-packages\anndata\_core\anndata.py", line 544, in __eq__
    "Equality comparisons are not supported for AnnData objects, "

NotImplementedError: Equality comparisons are not supported for AnnData objects, instead compare the desired attributes.

But interestingly the new adata object gets a vdj_obs in the obs section:

adata
Out[54]: 
AnnData object with n_obs × n_vars = 50019 × 1945 
    obs: 'sample_id', 'n_genes', 'percent_mito', 'n_counts', 'leiden', 'vdj_obs'
    var: 'gene_ids', 'feature_types', 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'leiden', 'neighbors', 'pca', 'umap', 'leiden_colors', 'rank_genes_groups'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

Any advice?

veghp commented 4 years ago

Unfortunately I've never come across this error. The error is raised by the anndata package, perhaps that changed too. The vdj_obs is not added by load_vdj(), but by you: see 'Prepare metadata column' section in the tutorial.
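For context, the traceback points at the line `if adata == None:` in load_vdj.py, where the `==` comparison calls the object's `__eq__` method, and anndata's `__eq__` raises. The sketch below shows why an identity check (`is None`) avoids this. `NoEquality` is a hypothetical stand-in class, not AnnData itself, and this is not necessarily how the problem was fixed in pyVDJ.

```python
class NoEquality:
    """Hypothetical stand-in for an object whose __eq__ always raises."""

    def __eq__(self, other):
        raise NotImplementedError("Equality comparisons are not supported")


def is_missing_eq(obj):
    # '==' dispatches to obj.__eq__(None), so it can raise.
    return obj == None  # noqa: E711 (deliberately the problematic form)


def is_missing_identity(obj):
    # 'is' compares object identity and never calls __eq__.
    return obj is None


obj = NoEquality()

try:
    is_missing_eq(obj)
    eq_raised = False
except NotImplementedError:
    eq_raised = True

print("'== None' raised:", eq_raised)
print("'is None' result:", is_missing_identity(obj))
```

Under these assumptions, replacing an `== None` test with `is None` is the usual way to check for a missing argument when the object does not support equality comparison.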

veghp commented 4 years ago

I may try to update pyVDJ sometime, but in the meantime you may want to look at scirpy, another package for analyzing TCR data with scanpy: https://icbi-lab.github.io/scirpy/

Zifeng-L commented 4 years ago

I may try to update pyVDJ, but in the meantime you may want to look at scirpy, another package for analyzing TCR data with scanpy: https://icbi-lab.github.io/scirpy/

I had the same issue! Can you provide the example data so that maybe we can find the difference?

rhml7 commented 4 years ago

Hi there. I had the same issue as above after running pyvdj.load_vdj(samples, adata, obs_col='vdj_obs', cellranger=3). The vdj_obs column is present in the adata object, with entries of the form barcode-1_sampleid.

Error shown below:

NotImplementedError: Equality comparisons are not supported for AnnData objects, instead compare the desired attributes.

veghp commented 4 years ago

Sorry for the late reply, I did not get notifications for these comments. Thank you @juhaa for fixing this.

Please install the latest version from GitHub: pip install git+https://github.com/veghp/pyVDJ.git

veghp / pyVDJ

load_vdj TypeError #4