Closed PattF closed 5 years ago
Are you using a conda environment? And have you installed via PyPI? For some reason, it tried to get the version from git..
Yep, using anaconda in windows, ran pip install scvelo
and then that error came up when trying to import at the start of my notebook. Am using scanpy as well and am not encountering this error. Just uninstalled and reinstalled and same error again. Thoughts?
hmm.. could not reproduce that on any windows machine that's lying around here. Would you try it out on a new clean python >= 3.6 environment and see whether you get the same issue?
@stefanpeidli maybe you could take a look as well?
Unfortunately still getting the same issue. Are there any other possible workarounds?
We have updated that part of the code. Could you please install the new version of scvelo from source (it is not yet on PyPI) and check whether it works now?
Great, thanks guys. I tried installing from source but now get the following error, thoughts?
pip install git+https://github.com/theislab/scvelo
Collecting git+https://github.com/theislab/scvelo
Cloning https://github.com/theislab/scvelo to c:\users\patty\appdata\local\temp\pip-req-build-m90fcbqy
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing wheel metadata ... error
Complete output from command c:\users\patty\anaconda3\python.exe c:\users\patty\anaconda3\lib\site-packages\pip\_vendor\pep517\_in_process.py prepare_metadata_for_build_wheel C:\Users\Patty\AppData\Local\Temp\tmpevsyi6yd:
The file description seems not to be valid rst for PyPI; it will be interpreted as plain text
<string>:: (WARNING/2) No MathJax URL specified, using local fallback (see config.html)
Traceback (most recent call last):
File "c:\users\patty\anaconda3\lib\site-packages\pip\_vendor\pep517\_in_process.py", line 207, in <module>
main()
File "c:\users\patty\anaconda3\lib\site-packages\pip\_vendor\pep517\_in_process.py", line 197, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "c:\users\patty\anaconda3\lib\site-packages\pip\_vendor\pep517\_in_process.py", line 69, in prepare_metadata_for_build_wheel
return hook(metadata_directory, config_settings)
File "C:\Users\Patty\AppData\Local\Temp\pip-build-env-30tqm2b8\overlay\Lib\site-packages\flit\buildapi.py", line 27, in prepare_metadata_for_build_wheel
metadata = make_metadata(module, ini_info)
File "C:\Users\Patty\AppData\Local\Temp\pip-build-env-30tqm2b8\overlay\Lib\site-packages\flit\common.py", line 302, in make_metadata
md_dict.update(get_info_from_module(module))
File "C:\Users\Patty\AppData\Local\Temp\pip-build-env-30tqm2b8\overlay\Lib\site-packages\flit\common.py", line 120, in get_info_from_module
version = check_version(version)
File "C:\Users\Patty\AppData\Local\Temp\pip-build-env-30tqm2b8\overlay\Lib\site-packages\flit\common.py", line 146, in check_version
version = normalise_version(version)
File "C:\Users\Patty\AppData\Local\Temp\pip-build-env-30tqm2b8\overlay\Lib\site-packages\flit\validate.py", line 325, in normalise_version
.format(orig_version))
flit.common.InvalidVersion: Version number "Version(release='0.1.16', dev='33', labels=['3f5da63'])" does not match PEP 440 rules
----------------------------------------
Command "c:\users\patty\anaconda3\python.exe c:\users\patty\anaconda3\lib\site-packages\pip\_vendor\pep517\_in_process.py prepare_metadata_for_build_wheel C:\Users\Patty\AppData\Local\Temp\tmpevsyi6yd" failed with error code 1 in C:\Users\Patty\AppData\Local\Temp\pip-req-build-m90fcbqy
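The InvalidVersion message above suggests that flit was handed the repr of a version object rather than a plain version string. As a rough illustration (this is not flit's own validation code), the sketch below checks strings against the canonical-form regular expression given in the appendix of PEP 440, extended here with an optional local-version segment. The string `0.1.16.dev33+3f5da63` is only an assumed spelling of the same release, dev and label information, not something taken from the scvelo repository.

```python
import re

# Canonical public version pattern as given in the PEP 440 appendix, plus an
# optional local version segment such as "+3f5da63" (added for this sketch).
PEP440_CANONICAL = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?"
    r"(\+[a-z0-9]+(\.[a-z0-9]+)*)?$"
)


def is_canonical_pep440(version: str) -> bool:
    """Return True if the string is a canonical PEP 440 version."""
    return PEP440_CANONICAL.match(version) is not None


# The string quoted in the error message: a repr, not a version number.
reported = "Version(release='0.1.16', dev='33', labels=['3f5da63'])"
# Hypothetical PEP 440 spelling of the same information.
candidate = "0.1.16.dev33+3f5da63"

print(is_canonical_pep440(reported))
print(is_canonical_pep440(candidate))
```

If the check rejects the version string a package reports, build backends that validate versions are likely to reject it as well.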
Will check on that. For now, you could just
git clone https://github.com/theislab/scvelo.git
cd scvelo
python setup.py install
Oki, tried that and then unfortunately got the following error:
AttributeError Traceback (most recent call last)
We'll have a look into that. In the meantime, have you thought about moving to Linux? We haven't primarily targeted Windows in our implementation, so compatibility may be lacking there.
Thanks for that, and completely understandable. I sure have, and will try making the switch over as soon as I finish off my thesis and have more time :)
Just looked again into your issue. I could not find the python version you're using. Looks like you are running on root.
Create a new environment, activate it, install pytables with conda and scvelo with pip, and try again:
conda create -n py36 python=3.6
conda activate py36
conda install pytables
pip install scvelo
Hi Volker,
Sorry for the late reply, I'm away travelling at the moment. Thanks for following up, I'll attempt this over the weekend and let you know how it goes. Thanks!
Hey Volker, Creating the new environment seemed to work fine when installing pytables and scvelo, but when trying to run scvelo in my notebook I'm still running into the same issue. Is there anything different I need to do in my notebooks when running the new environment? Thanks! ps, sorry about the late response, was travelling in the US and only got back a few days ago.
Now I got back from holidays myself and can get back to your issue.
If it works fine in your console, but it does not run on your notebook, you might be running your notebook within the wrong environment (perhaps in root)?
You can check your environment and python version with
import sys, platform
print(sys.executable)
print(platform.python_version())
Hey Volker, Thanks for getting back to me, sorry for the delayed response, I checked the environment within the notebook as you suggested and got the following:
C:\Users\Patty\Anaconda3\python.exe
3.6.0
How do I ensure I'm not running in root? Or, how do I change from root to the new environment? Thanks!
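A small helper can make the root-versus-environment check explicit. The sketch below assumes the usual conda layout, in which a named environment lives under an `envs\<name>` folder while the root interpreter does not; the function name `conda_env_from_executable` is made up for this illustration. In a notebook it would be called on `sys.executable`.

```python
from pathlib import PureWindowsPath
from typing import Optional


def conda_env_from_executable(executable: str) -> Optional[str]:
    """Return the env name if the path contains envs/<name>/..., else None.

    Assumes the standard conda directory layout; None is read as
    "probably the root (base) interpreter".
    """
    parts = PureWindowsPath(executable).parts
    lowered = [part.lower() for part in parts]
    if "envs" in lowered:
        idx = lowered.index("envs")
        # The component after "envs" must be a folder, not the executable.
        if idx + 1 < len(parts) - 1:
            return parts[idx + 1]
    return None


# Path reported from inside the notebook in this thread.
root_path = r"C:\Users\Patty\Anaconda3\python.exe"
# Path one would expect once the notebook runs inside the py36 environment.
env_path = r"C:\Users\Patty\Anaconda3\envs\py36\python.exe"

print(conda_env_from_executable(root_path))
print(conda_env_from_executable(env_path))
```

If the helper returns None inside the notebook, the kernel is probably still the root interpreter. One common remedy, assuming ipykernel is installed in the new environment, is to register it as a kernel with `python -m ipykernel install --user --name py36` and then select that kernel in Jupyter.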
Alright, so I think I've managed to run the notebook in the new environment as I'm currently getting the following output for environment and python version:
C:\Users\Patty\Anaconda3\envs\py36\python.exe
3.6.8
Scvelo seems to be finally importing fine now, great! But unfortunately am now running into the following error when trying to merge my preprocessed scanpy AnnData file with my loom file. Thoughts?
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-1689fa8fbbd0> in <module>
1 ldata = scv.read(path2,
2 cache=True)
----> 3 adata = scv.utils.merge(adata, ldata)
~\Anaconda3\envs\py36\lib\site-packages\scvelo\read_load.py in merge(adata, ldata, copy)
134 same_vars = (len(_adata.var_names) == len(_ldata.var_names) and np.all(_adata.var_names == _ldata.var_names))
135 if len(common_vars) > 0 and not same_vars:
--> 136 _adata._inplace_subset_var(common_vars)
137 _ldata._inplace_subset_var(common_vars)
138
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _inplace_subset_var(self, index)
1431 Same as ``adata = adata[:, index]``, but inplace.
1432 """
-> 1433 adata_subset = self[:, index].copy()
1434 self._init_as_actual(adata_subset, dtype=self._X.dtype)
1435
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in __getitem__(self, index)
1297 def __getitem__(self, index: Index) -> 'AnnData':
1298 """Returns a sliced view of the object."""
-> 1299 return self._getitem_view(index)
1300
1301 def _getitem_view(self, index: Index) -> 'AnnData':
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _getitem_view(self, index)
1300
1301 def _getitem_view(self, index: Index) -> 'AnnData':
-> 1302 oidx, vidx = self._normalize_indices(index)
1303 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
1304
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_indices(self, index)
1277 obs, var = super()._unpack_index(index)
1278 obs = _normalize_index(obs, self.obs_names)
-> 1279 var = _normalize_index(var, self.var_names)
1280 return obs, var
1281
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_index(index, names)
264 # incredibly faster one
265 positions = pd.Series(index=names, data=range(len(names)))
--> 266 positions = positions[index]
267 if positions.isnull().values.any():
268 raise KeyError(
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in __getitem__(self, key)
808 key = check_bool_indexer(self.index, key)
809
--> 810 return self._get_with(key)
811
812 def _get_with(self, key):
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in _get_with(self, key)
851 return self.loc[key]
852
--> 853 return self.reindex(key)
854 except Exception:
855 # [slice(0, 5, None)] will break if you convert to ndarray,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in reindex(self, index, **kwargs)
3323 @Appender(generic._shared_docs['reindex'] % _shared_doc_kwargs)
3324 def reindex(self, index=None, **kwargs):
-> 3325 return super(Series, self).reindex(index=index, **kwargs)
3326
3327 def drop(self, labels=None, axis=0, index=None, columns=None,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs)
3687 # perform the reindex on the axes
3688 return self._reindex_axes(axes, level, limit, tolerance, method,
-> 3689 fill_value, copy).__finalize__(self)
3690
3691 def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
3705 obj = obj._reindex_with_indexers({axis: [new_index, indexer]},
3706 fill_value=fill_value,
-> 3707 copy=copy, allow_dups=False)
3708
3709 return obj
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
3808 fill_value=fill_value,
3809 allow_dups=allow_dups,
-> 3810 copy=copy)
3811
3812 if copy and new_data is self._data:
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
4412 # some axes don't allow reindexing with dups
4413 if not allow_dups:
-> 4414 self.axes[axis]._can_reindex(indexer)
4415
4416 if axis >= self.ndim:
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\indexes\base.py in _can_reindex(self, indexer)
3574 # trying to reindex on an axis with duplicates
3575 if not self.is_unique and len(indexer):
-> 3576 raise ValueError("cannot reindex from a duplicate axis")
3577
3578 def reindex(self, target, method=None, level=None, limit=None,
ValueError: cannot reindex from a duplicate axis
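The final ValueError indicates that the axis being indexed carries repeated labels, so a lookup by name is ambiguous. A quick way to see which labels repeat is to count them. The sketch below uses only the standard library; the helper name `duplicated_labels` and the example list are made up for illustration.

```python
from collections import Counter


def duplicated_labels(labels):
    """Map every label that occurs more than once to its number of occurrences."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n > 1}


# Made-up example; on real data one would pass ldata.var_names instead.
example_var_names = ["SAMD11", "NOC2L", "SAMD11", "HES4", "NOC2L", "SAMD11"]

dups = duplicated_labels(example_var_names)
print(dups)
```

On the actual objects, `duplicated_labels(ldata.var_names)` would list the offending gene names. pandas also exposes `Index.is_unique` and `Index.duplicated()` for the same purpose.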
Good to hear. Would you please run
print(adata, ldata)
print(adata.obs_names.intersection(ldata.obs_names))
print(adata.var_names.intersection(ldata.var_names))
Sure thing, here's the output:
AnnData object with n_obs × n_vars = 14731 × 18596
obs: 'sample', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors', 'S_score', 'G2M_score', 'phase', 'louvain_r1', 'louvain_r0.5', 'Chor_marker_expr', 'Radg_marker_expr', 'Chr21_marker_expr'
var: 'gene_id', 'n_cells', 'means', 'dispersions', 'dispersions_norm', 'highly_variable'
uns: 'cluster_cell_type_matching', 'diffmap_evals', 'louvain', 'louvain_r0.5_colors', 'louvain_r0.5_sizes', 'neighbors', 'paga', 'pca', 'phase_colors', 'rank_genes_r0.5', 'sample_colors'
obsm: 'X_pca', 'X_umap', 'X_diffmap'
varm: 'PCs'
layers: 'counts' AnnData object with n_obs × n_vars = 16971 × 58288
obs: 'Clusters', 'SampleID', 'SampleRef', '_X', '_Y'
var: 'Accession', 'Chromosome', 'End', 'Start', 'Strand'
layers: 'ambiguous', 'matrix', 'spliced', 'unspliced'
Index([], dtype='object', name='index')
Index(['FAM87B', 'LINC00115', 'FAM41C', 'SAMD11', 'NOC2L', 'KLHL17', 'PLEKHN1',
'HES4', 'ISG15', 'AGRN',
...
'MT-CO2', 'MT-ATP8', 'MT-ATP6', 'MT-CO3', 'MT-ND3', 'MT-ND4L', 'MT-ND4',
'MT-ND5', 'MT-ND6', 'MT-CYB'],
dtype='object', name='index', length=16229)
Looks like your observation names do not match. We would need to examine your adata.obs_names to see how to make them fit ldata.obs_names.
We have a built-in function that cleans them up, scv.utils.clean_obs_names(adata); maybe that helps make them comparable.
Apart from that, everything looks fine.
Hey Volker, Thanks for the fast reply. I tried the clean up module, but still came out with the same issue. Let me know what I can provide in order to figure this out. Thanks!
Print these to see whether they are matchable:
print(adata.obs_names, ldata.obs_names)
scv.utils.clean_obs_names(adata)
scv.utils.clean_obs_names(ldata)
print(adata.obs_names, ldata.obs_names)
Here we go:
Index(['AAACCTGAGATCACGG', 'AAACCTGAGATCCGAG', 'AAACCTGAGGAGCGTT',
'AAACCTGAGGCACATG', 'AAACCTGAGTTTGCGT', 'AAACCTGCAAGCCGCT',
'AAACCTGCACGAAATA', 'AAACCTGCACGCATCG', 'AAACCTGCACGGCTAC',
'AAACCTGCACTAAGTC',
...
'TTTGTCAAGATCCTGT', 'TTTGTCAAGCCAACAG', 'TTTGTCACACATTCGA',
'TTTGTCACAGGTCGTC', 'TTTGTCAGTAAGTGTA', 'TTTGTCAGTAGCGCAA',
'TTTGTCAGTGTCGCTG', 'TTTGTCATCGTAGATC', 'TTTGTCATCTGGTATG',
'TTTGTCATCTTGCCGT'],
dtype='object', name='index', length=14731) Index(['EU79_d45:AAAGTAGGTGTAACGGx', 'EU79_d45:AAACGGGTCTCGAGTAx',
'EU79_d45:AAAGATGCAGTATAAGx', 'EU79_d45:AAAGCAAGTGCTTCTCx',
'EU79_d45:AAACCTGAGATCCGAGx', 'EU79_d45:AAACGGGTCAAACGGGx',
'EU79_d45:AAAGCAAAGTGGGCTAx', 'EU79_d45:AAAGTAGGTGTATGGGx',
'EU79_d45:AAACCTGCACTAAGTCx', 'EU79_d45:AAAGATGTCGGACAAGx',
...
'DS18_d140:TTTGGTTAGTACGATAx', 'DS18_d140:TTTGCGCCATACTACGx',
'DS18_d140:TTTGCGCGTATAGTAGx', 'DS18_d140:TTTGGTTGTCTTCTCGx',
'DS18_d140:TTTGTCATCGTAGATCx', 'DS18_d140:TTTGTCAGTCCAAGTTx',
'DS18_d140:TTTGGTTCACGAAGCAx', 'DS18_d140:TTTGTCAGTGTCGCTGx',
'DS18_d140:TTTGCGCGTGCTTCTCx', 'DS18_d140:TTTGTCATCGTCTGCTx'],
dtype='object', name='index', length=16971)
Index(['AAACCTGAGATCACGG', 'AAACCTGAGATCCGAG', 'AAACCTGAGGAGCGTT',
'AAACCTGAGGCACATG', 'AAACCTGAGTTTGCGT', 'AAACCTGCAAGCCGCT',
'AAACCTGCACGAAATA', 'AAACCTGCACGCATCG', 'AAACCTGCACGGCTAC',
'AAACCTGCACTAAGTC',
...
'TTTGTCAAGATCCTGT', 'TTTGTCAAGCCAACAG', 'TTTGTCACACATTCGA',
'TTTGTCACAGGTCGTC', 'TTTGTCAGTAAGTGTA', 'TTTGTCAGTAGCGCAA',
'TTTGTCAGTGTCGCTG', 'TTTGTCATCGTAGATC', 'TTTGTCATCTGGTATG',
'TTTGTCATCTTGCCGT'],
dtype='object', length=14731) Index(['AAAGTAGGTGTAACGG', 'AAACGGGTCTCGAGTA', 'AAAGATGCAGTATAAG',
'AAAGCAAGTGCTTCTC', 'AAACCTGAGATCCGAG', 'AAACGGGTCAAACGGG',
'AAAGCAAAGTGGGCTA', 'AAAGTAGGTGTATGGG', 'AAACCTGCACTAAGTC',
'AAAGATGTCGGACAAG',
...
'TTTGGTTAGTACGATA', 'TTTGCGCCATACTACG', 'TTTGCGCGTATAGTAG',
'TTTGGTTGTCTTCTCG', 'TTTGTCATCGTAGATC', 'TTTGTCAGTCCAAGTT',
'TTTGGTTCACGAAGCA', 'TTTGTCAGTGTCGCTG', 'TTTGCGCGTGCTTCTC',
'TTTGTCATCGTCTGCT'],
dtype='object', length=16971)
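To make the effect of the cleaning step concrete, here is a rough sketch of that kind of barcode normalisation. It is not scvelo's implementation of clean_obs_names. It assumes the layout visible in the output above, `<sample>:<barcode>x`, and the second name in the example is hypothetical. It also shows a possible side effect: once the sample prefix is dropped, identical barcodes from different samples can no longer be told apart.

```python
import re

# Assumed layout, inferred from the printed obs_names: "<sample>:<barcode>x".
# Both the "<sample>:" prefix and the trailing "x" are treated as optional.
_OBS_NAME = re.compile(r"^(?:(?P<sample>[^:]+):)?(?P<barcode>[ACGT]+)x?$")


def split_obs_name(name):
    """Split an observation name into (sample or None, bare barcode)."""
    match = _OBS_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected observation name: {name!r}")
    return match.group("sample"), match.group("barcode")


loom_name = "EU79_d45:AAACCTGAGATCCGAGx"        # taken from the output above
other_sample = "DS18_d140:AAACCTGAGATCCGAGx"    # hypothetical, for illustration

sample_a, barcode_a = split_obs_name(loom_name)
sample_b, barcode_b = split_obs_name(other_sample)

print(barcode_a)                  # bare barcode, comparable with adata.obs_names
print(barcode_a == barcode_b)     # same barcode once the sample prefix is gone
print((sample_a, barcode_a) == (sample_b, barcode_b))
```

If the loom file really combines several samples, keeping the sample identifier as part of the key (as the tuples do here) is one way to avoid such collisions. Whether that applies to this dataset cannot be told from the truncated output.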
Looks good after cleaning. Now check whether it finds common observation names for matching:
scv.utils.clean_obs_names(adata)
scv.utils.clean_obs_names(ldata)
print(adata.obs_names.intersection(ldata.obs_names))
Here we go:
Index(['AAACCTGAGATCACGG', 'AAACCTGAGATCCGAG', 'AAACCTGAGGAGCGTT',
'AAACCTGAGGCACATG', 'AAACCTGAGTTTGCGT', 'AAACCTGCAAGCCGCT',
'AAACCTGCACGAAATA', 'AAACCTGCACGCATCG', 'AAACCTGCACGGCTAC',
'AAACCTGCACTAAGTC',
...
'TTTGTCAAGATCCTGT', 'TTTGTCAAGCCAACAG', 'TTTGTCACACATTCGA',
'TTTGTCACAGGTCGTC', 'TTTGTCAGTAAGTGTA', 'TTTGTCAGTAGCGCAA',
'TTTGTCAGTGTCGCTG', 'TTTGTCATCGTAGATC', 'TTTGTCATCTGGTATG',
'TTTGTCATCTTGCCGT'],
dtype='object', length=14731)
Good, that works.
Now let's go line by line to see what is problematic:
scv.utils.clean_obs_names(adata)
scv.utils.clean_obs_names(ldata)
common_obs = adata.obs_names.intersection(ldata.obs_names)
common_vars = adata.var_names.intersection(ldata.var_names)
_adata = adata[common_obs].copy()
_ldata = ldata[common_obs].copy()
_adata._inplace_subset_var(common_vars)
_ldata._inplace_subset_var(common_vars)
Alright, this time getting an error:
Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-10-3fc6f6ea4b7e> in <module>
9
10 _adata._inplace_subset_var(common_vars)
---> 11 _ldata._inplace_subset_var(common_vars)
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _inplace_subset_var(self, index)
1431 Same as ``adata = adata[:, index]``, but inplace.
1432 """
-> 1433 adata_subset = self[:, index].copy()
1434 self._init_as_actual(adata_subset, dtype=self._X.dtype)
1435
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in __getitem__(self, index)
1297 def __getitem__(self, index: Index) -> 'AnnData':
1298 """Returns a sliced view of the object."""
-> 1299 return self._getitem_view(index)
1300
1301 def _getitem_view(self, index: Index) -> 'AnnData':
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _getitem_view(self, index)
1300
1301 def _getitem_view(self, index: Index) -> 'AnnData':
-> 1302 oidx, vidx = self._normalize_indices(index)
1303 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
1304
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_indices(self, index)
1277 obs, var = super()._unpack_index(index)
1278 obs = _normalize_index(obs, self.obs_names)
-> 1279 var = _normalize_index(var, self.var_names)
1280 return obs, var
1281
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_index(index, names)
264 # incredibly faster one
265 positions = pd.Series(index=names, data=range(len(names)))
--> 266 positions = positions[index]
267 if positions.isnull().values.any():
268 raise KeyError(
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in __getitem__(self, key)
808 key = check_bool_indexer(self.index, key)
809
--> 810 return self._get_with(key)
811
812 def _get_with(self, key):
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in _get_with(self, key)
851 return self.loc[key]
852
--> 853 return self.reindex(key)
854 except Exception:
855 # [slice(0, 5, None)] will break if you convert to ndarray,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in reindex(self, index, **kwargs)
3323 @Appender(generic._shared_docs['reindex'] % _shared_doc_kwargs)
3324 def reindex(self, index=None, **kwargs):
-> 3325 return super(Series, self).reindex(index=index, **kwargs)
3326
3327 def drop(self, labels=None, axis=0, index=None, columns=None,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs)
3687 # perform the reindex on the axes
3688 return self._reindex_axes(axes, level, limit, tolerance, method,
-> 3689 fill_value, copy).__finalize__(self)
3690
3691 def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
3705 obj = obj._reindex_with_indexers({axis: [new_index, indexer]},
3706 fill_value=fill_value,
-> 3707 copy=copy, allow_dups=False)
3708
3709 return obj
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
3808 fill_value=fill_value,
3809 allow_dups=allow_dups,
-> 3810 copy=copy)
3811
3812 if copy and new_data is self._data:
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
4412 # some axes don't allow reindexing with dups
4413 if not allow_dups:
-> 4414 self.axes[axis]._can_reindex(indexer)
4415
4416 if axis >= self.ndim:
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\indexes\base.py in _can_reindex(self, indexer)
3574 # trying to reindex on an axis with duplicates
3575 if not self.is_unique and len(indexer):
-> 3576 raise ValueError("cannot reindex from a duplicate axis")
3577
3578 def reindex(self, target, method=None, level=None, limit=None,
ValueError: cannot reindex from a duplicate axis
Now change it to:
import pandas as pd
scv.utils.clean_obs_names(adata)
scv.utils.clean_obs_names(ldata)
adata.obs_names_make_unique()
ldata.obs_names_make_unique()
adata.var_names_make_unique()
ldata.var_names_make_unique()
common_obs = adata.obs_names.intersection(ldata.obs_names)
common_vars = pd.unique(adata.var_names.intersection(ldata.var_names))
_adata = adata[common_obs].copy()
_ldata = ldata[common_obs].copy()
_adata._inplace_subset_var(common_vars)
_ldata._inplace_subset_var(common_vars)
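For readers unfamiliar with the `*_make_unique` calls used above, the general idea is to keep the first occurrence of a name and append a running number to later repeats. The sketch below is a plain-Python approximation of that idea under the assumption of a "-" separator; it is not anndata's code, and the function name `make_unique` is made up here.

```python
def make_unique(names, join="-"):
    """Return names with repeats renamed to name-1, name-2, ... (first one kept)."""
    taken = set()
    counters = {}
    result = []
    for name in names:
        new_name = name
        # Keep bumping the suffix until the candidate is not already in use.
        while new_name in taken:
            counters[name] = counters.get(name, 0) + 1
            new_name = f"{name}{join}{counters[name]}"
        taken.add(new_name)
        result.append(new_name)
    return result


# Made-up gene list with repeats.
unique_names = make_unique(["A", "B", "A", "A"])
print(unique_names)
```

Note that renaming happens before any intersection is computed, so a suffixed name such as "A-1" only ends up in the common set if both objects produce the same suffixed name.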
Similar error again:
Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
Variable names are not unique. To make them unique, call `.var_names_make_unique`.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-11-1d4e9e32d746> in <module>
17
18 _adata._inplace_subset_var(common_vars)
---> 19 _ldata._inplace_subset_var(common_vars)
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _inplace_subset_var(self, index)
1431 Same as ``adata = adata[:, index]``, but inplace.
1432 """
-> 1433 adata_subset = self[:, index].copy()
1434 self._init_as_actual(adata_subset, dtype=self._X.dtype)
1435
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in __getitem__(self, index)
1297 def __getitem__(self, index: Index) -> 'AnnData':
1298 """Returns a sliced view of the object."""
-> 1299 return self._getitem_view(index)
1300
1301 def _getitem_view(self, index: Index) -> 'AnnData':
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _getitem_view(self, index)
1300
1301 def _getitem_view(self, index: Index) -> 'AnnData':
-> 1302 oidx, vidx = self._normalize_indices(index)
1303 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
1304
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_indices(self, index)
1277 obs, var = super()._unpack_index(index)
1278 obs = _normalize_index(obs, self.obs_names)
-> 1279 var = _normalize_index(var, self.var_names)
1280 return obs, var
1281
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_index(index, names)
264 # incredibly faster one
265 positions = pd.Series(index=names, data=range(len(names)))
--> 266 positions = positions[index]
267 if positions.isnull().values.any():
268 raise KeyError(
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in __getitem__(self, key)
808 key = check_bool_indexer(self.index, key)
809
--> 810 return self._get_with(key)
811
812 def _get_with(self, key):
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in _get_with(self, key)
851 return self.loc[key]
852
--> 853 return self.reindex(key)
854 except Exception:
855 # [slice(0, 5, None)] will break if you convert to ndarray,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py in reindex(self, index, **kwargs)
3323 @Appender(generic._shared_docs['reindex'] % _shared_doc_kwargs)
3324 def reindex(self, index=None, **kwargs):
-> 3325 return super(Series, self).reindex(index=index, **kwargs)
3326
3327 def drop(self, labels=None, axis=0, index=None, columns=None,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs)
3687 # perform the reindex on the axes
3688 return self._reindex_axes(axes, level, limit, tolerance, method,
-> 3689 fill_value, copy).__finalize__(self)
3690
3691 def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
3705 obj = obj._reindex_with_indexers({axis: [new_index, indexer]},
3706 fill_value=fill_value,
-> 3707 copy=copy, allow_dups=False)
3708
3709 return obj
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
3808 fill_value=fill_value,
3809 allow_dups=allow_dups,
-> 3810 copy=copy)
3811
3812 if copy and new_data is self._data:
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
4412 # some axes don't allow reindexing with dups
4413 if not allow_dups:
-> 4414 self.axes[axis]._can_reindex(indexer)
4415
4416 if axis >= self.ndim:
~\AppData\Roaming\Python\Python36\site-packages\pandas\core\indexes\base.py in _can_reindex(self, indexer)
3574 # trying to reindex on an axis with duplicates
3575 if not self.is_unique and len(indexer):
-> 3576 raise ValueError("cannot reindex from a duplicate axis")
3577
3578 def reindex(self, target, method=None, level=None, limit=None,
ValueError: cannot reindex from a duplicate axis
Can you subset _ldata[:, common_vars]?
And what's your pandas version? You can check it with:
import pandas as pd; print(pd.__version__)
Maybe upgrading to the latest version helps.
Currently running pandas 0.23.4
As for the subsetting, should I just run the line you wrote out?
Yes. You can upgrade pandas (latest version: 0.24.2) and see whether you can subset.
Alright, so I updated pandas to 0.24.2 and restarted the notebook.
After attempting to subset, I got the same error as before:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-14-baca2ec35139> in <module>
----> 1 _ldata[:, common_vars]
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in __getitem__(self, index)
1297 def __getitem__(self, index: Index) -> 'AnnData':
1298 """Returns a sliced view of the object."""
-> 1299 return self._getitem_view(index)
1300
1301 def _getitem_view(self, index: Index) -> 'AnnData':
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _getitem_view(self, index)
1300
1301 def _getitem_view(self, index: Index) -> 'AnnData':
-> 1302 oidx, vidx = self._normalize_indices(index)
1303 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
1304
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_indices(self, index)
1277 obs, var = super()._unpack_index(index)
1278 obs = _normalize_index(obs, self.obs_names)
-> 1279 var = _normalize_index(var, self.var_names)
1280 return obs, var
1281
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_index(index, names)
264 # incredibly faster one
265 positions = pd.Series(index=names, data=range(len(names)))
--> 266 positions = positions[index]
267 if positions.isnull().values.any():
268 raise KeyError(
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
909 key = check_bool_indexer(self.index, key)
910
--> 911 return self._get_with(key)
912
913 def _get_with(self, key):
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in _get_with(self, key)
951 return self.loc[key]
952
--> 953 return self.reindex(key)
954 except Exception:
955 # [slice(0, 5, None)] will break if you convert to ndarray,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in reindex(self, index, **kwargs)
3736 @Appender(generic.NDFrame.reindex.__doc__)
3737 def reindex(self, index=None, **kwargs):
-> 3738 return super(Series, self).reindex(index=index, **kwargs)
3739
3740 def drop(self, labels=None, axis=0, index=None, columns=None,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs)
4354 # perform the reindex on the axes
4355 return self._reindex_axes(axes, level, limit, tolerance, method,
-> 4356 fill_value, copy).__finalize__(self)
4357
4358 def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
4372 obj = obj._reindex_with_indexers({axis: [new_index, indexer]},
4373 fill_value=fill_value,
-> 4374 copy=copy, allow_dups=False)
4375
4376 return obj
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
4488 fill_value=fill_value,
4489 allow_dups=allow_dups,
-> 4490 copy=copy)
4491
4492 if copy and new_data is self._data:
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\internals\managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
1222 # some axes don't allow reindexing with dups
1223 if not allow_dups:
-> 1224 self.axes[axis]._can_reindex(indexer)
1225
1226 if axis >= self.ndim:
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\indexes\base.py in _can_reindex(self, indexer)
3085 # trying to reindex on an axis with duplicates
3086 if not self.is_unique and len(indexer):
-> 3087 raise ValueError("cannot reindex from a duplicate axis")
3088
3089 def reindex(self, target, method=None, level=None, limit=None,
ValueError: cannot reindex from a duplicate axis
Strange. Would you also upgrade anndata to 0.6.20?
Does the following run through without giving an error?
_ldata[:, _ldata.var_names[:, 5]]
Upgraded anndata to 0.6.20 and got the following error after running that line:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-14-dfcfc175223e> in <module>
----> 1 _ldata[:, _ldata.var_names[:, 5]]
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\indexes\base.py in __getitem__(self, key)
3967
3968 key = com.values_from_object(key)
-> 3969 result = getitem(key)
3970 if not is_scalar(result):
3971 return promote(result)
IndexError: too many indices for array
Ah, I meant _ldata[:, _ldata.var_names[:5]]
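The difference between the two spellings is easy to miss. `[:5]` is a slice that takes the first five entries of a one-dimensional sequence, whereas `[:, 5]` is a two-dimensional index (a tuple of a slice and an integer) and cannot work on one-dimensional data. The sketch below shows this with a plain Python list, using gene names borrowed from the output earlier in the thread. A list raises TypeError where the pandas Index in the traceback raised IndexError, but the cause is the same.

```python
# First few gene names as printed earlier in the thread.
var_names = ["FAM87B", "LINC00115", "FAM41C", "SAMD11", "NOC2L", "KLHL17", "PLEKHN1"]

# Slice: the first five entries of a one-dimensional sequence.
first_five = var_names[:5]

# Tuple index (slice, 5): asks for "column 5 of every row", which a
# one-dimensional list does not have.
try:
    var_names[:, 5]
    outcome = "no error"
except TypeError as exc:
    outcome = type(exc).__name__

print(first_five)
print(outcome)
```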
Ha, sadly it's a similar result to the previous errors we've seen:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-16-9f4708acfdc9> in <module>
----> 1 _ldata[:, _ldata.var_names[:5]]
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in __getitem__(self, index)
1320 def __getitem__(self, index: Index) -> 'AnnData':
1321 """Returns a sliced view of the object."""
-> 1322 return self._getitem_view(index)
1323
1324 def _getitem_view(self, index: Index) -> 'AnnData':
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _getitem_view(self, index)
1323
1324 def _getitem_view(self, index: Index) -> 'AnnData':
-> 1325 oidx, vidx = self._normalize_indices(index)
1326 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
1327
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_indices(self, index)
1300 obs, var = unpack_index(index)
1301 obs = _normalize_index(obs, self.obs_names)
-> 1302 var = _normalize_index(var, self.var_names)
1303 return obs, var
1304
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_index(index, names)
263 # incredibly faster one
264 positions = pd.Series(index=names, data=range(len(names)))
--> 265 positions = positions[index]
266 if positions.isnull().values.any():
267 not_found = positions.index[positions.isnull().values]
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
909 key = check_bool_indexer(self.index, key)
910
--> 911 return self._get_with(key)
912
913 def _get_with(self, key):
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in _get_with(self, key)
951 return self.loc[key]
952
--> 953 return self.reindex(key)
954 except Exception:
955 # [slice(0, 5, None)] will break if you convert to ndarray,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in reindex(self, index, **kwargs)
3736 @Appender(generic.NDFrame.reindex.__doc__)
3737 def reindex(self, index=None, **kwargs):
-> 3738 return super(Series, self).reindex(index=index, **kwargs)
3739
3740 def drop(self, labels=None, axis=0, index=None, columns=None,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs)
4354 # perform the reindex on the axes
4355 return self._reindex_axes(axes, level, limit, tolerance, method,
-> 4356 fill_value, copy).__finalize__(self)
4357
4358 def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
4372 obj = obj._reindex_with_indexers({axis: [new_index, indexer]},
4373 fill_value=fill_value,
-> 4374 copy=copy, allow_dups=False)
4375
4376 return obj
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
4488 fill_value=fill_value,
4489 allow_dups=allow_dups,
-> 4490 copy=copy)
4491
4492 if copy and new_data is self._data:
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\internals\managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
1222 # some axes don't allow reindexing with dups
1223 if not allow_dups:
-> 1224 self.axes[axis]._can_reindex(indexer)
1225
1226 if axis >= self.ndim:
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\indexes\base.py in _can_reindex(self, indexer)
3085 # trying to reindex on an axis with duplicates
3086 if not self.is_unique and len(indexer):
-> 3087 raise ValueError("cannot reindex from a duplicate axis")
3088
3089 def reindex(self, target, method=None, level=None, limit=None,
ValueError: cannot reindex from a duplicate axis
Does the same apply to ldata[:, ldata.var_names[:5]] and adata[:, adata.var_names[:5]]?
If so, this is a very fundamental problem. You're still working on Windows, right?
For ldata I get the same error: ValueError: cannot reindex from a duplicate axis.
For adata, however, I get an actual output (and yes, still on Windows, though more and more considering making the switch to Mac):
View of AnnData object with n_obs × n_vars = 14731 × 5
obs: 'sample', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors', 'S_score', 'G2M_score', 'phase', 'louvain_r1', 'louvain_r0.5', 'Chor_marker_expr', 'Radg_marker_expr', 'Chr21_marker_expr'
var: 'gene_id', 'n_cells', 'means', 'dispersions', 'dispersions_norm', 'highly_variable'
uns: 'cluster_cell_type_matching', 'diffmap_evals', 'louvain', 'louvain_r0.5_colors', 'louvain_r0.5_sizes', 'neighbors', 'paga', 'pca', 'phase_colors', 'rank_genes_r0.5', 'sample_colors'
obsm: 'X_pca', 'X_umap', 'X_diffmap'
varm: 'PCs'
layers: 'counts'
What if you first run scv.pp.filter_and_normalize(ldata, min_shared_counts=30) and then try subsetting again?
Tried running that before subsetting but got the following error (should it be min_counts?):
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-12-a884d6b99cb4> in <module>
----> 1 scv.pp.filter_and_normalize(ldata, min_shared_counts=30)
TypeError: filter_and_normalize() got an unexpected keyword argument 'min_shared_counts'
The min_shared_counts argument is available in scvelo v0.1.17. Are you not running the latest version?
What does scv.logging.print_versions() print?
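As a rough sketch of a version check that does not depend on scvelo importing cleanly, the standard library can report what is installed. The helper names installed_version and version_tuple are hypothetical, and importlib.metadata requires Python 3.8 or newer, which is an assumption beyond the Python 3.6 environment mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text):
    """Turn '0.1.17' into (0, 1, 17); only handles plain numeric versions."""
    return tuple(int(part) for part in text.split("."))


found = installed_version("scvelo")
if found is None:
    print("scvelo is not installed in this environment")
else:
    print("scvelo", found)
```

For plain release numbers, version_tuple("0.1.16") < version_tuple("0.1.17") gives a simple way to tell whether the installed release predates the one that introduced min_shared_counts.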
Sorry, I thought I had already updated scvelo. I'm on v0.1.17 now.
Versions used are: scvelo==0.1.17 scanpy==1.4 anndata==0.6.20 loompy==2.0.17 numpy==1.16.3 scipy==1.1.0 matplotlib==3.0.3 sklearn==0.20.3 pandas==0.24.2
After running the filter and normalize line, I got the following:
Filtered out 49211 genes that are detected in less than 30 counts (shared).
Normalized count data: X, spliced, unspliced.
Logarithmized X.
But then, after running _ldata[:, common_vars], back to the usual error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-baca2ec35139> in <module>
----> 1 _ldata[:, common_vars]
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in __getitem__(self, index)
1320 def __getitem__(self, index: Index) -> 'AnnData':
1321 """Returns a sliced view of the object."""
-> 1322 return self._getitem_view(index)
1323
1324 def _getitem_view(self, index: Index) -> 'AnnData':
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _getitem_view(self, index)
1323
1324 def _getitem_view(self, index: Index) -> 'AnnData':
-> 1325 oidx, vidx = self._normalize_indices(index)
1326 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
1327
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_indices(self, index)
1300 obs, var = unpack_index(index)
1301 obs = _normalize_index(obs, self.obs_names)
-> 1302 var = _normalize_index(var, self.var_names)
1303 return obs, var
1304
~\Anaconda3\envs\py36\lib\site-packages\anndata\base.py in _normalize_index(index, names)
263 # incredibly faster one
264 positions = pd.Series(index=names, data=range(len(names)))
--> 265 positions = positions[index]
266 if positions.isnull().values.any():
267 not_found = positions.index[positions.isnull().values]
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
909 key = check_bool_indexer(self.index, key)
910
--> 911 return self._get_with(key)
912
913 def _get_with(self, key):
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in _get_with(self, key)
951 return self.loc[key]
952
--> 953 return self.reindex(key)
954 except Exception:
955 # [slice(0, 5, None)] will break if you convert to ndarray,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\series.py in reindex(self, index, **kwargs)
3736 @Appender(generic.NDFrame.reindex.__doc__)
3737 def reindex(self, index=None, **kwargs):
-> 3738 return super(Series, self).reindex(index=index, **kwargs)
3739
3740 def drop(self, labels=None, axis=0, index=None, columns=None,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs)
4354 # perform the reindex on the axes
4355 return self._reindex_axes(axes, level, limit, tolerance, method,
-> 4356 fill_value, copy).__finalize__(self)
4357
4358 def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
4372 obj = obj._reindex_with_indexers({axis: [new_index, indexer]},
4373 fill_value=fill_value,
-> 4374 copy=copy, allow_dups=False)
4375
4376 return obj
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
4488 fill_value=fill_value,
4489 allow_dups=allow_dups,
-> 4490 copy=copy)
4491
4492 if copy and new_data is self._data:
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\internals\managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
1222 # some axes don't allow reindexing with dups
1223 if not allow_dups:
-> 1224 self.axes[axis]._can_reindex(indexer)
1225
1226 if axis >= self.ndim:
~\Anaconda3\envs\py36\lib\site-packages\pandas\core\indexes\base.py in _can_reindex(self, indexer)
3085 # trying to reindex on an axis with duplicates
3086 if not self.is_unique and len(indexer):
-> 3087 raise ValueError("cannot reindex from a duplicate axis")
3088
3089 def reindex(self, target, method=None, level=None, limit=None,
ValueError: cannot reindex from a duplicate axis
And the same with ldata[:, ldata.var_names[:5]]?
@flying-sheep would be very grateful if you could have a quick look. What causes subsetting of an AnnData object to raise ValueError: cannot reindex from a duplicate axis?
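For reference, the message itself comes from pandas and is raised when a Series or DataFrame whose index contains repeated labels is reindexed. Whether duplicate labels are what actually happened in this thread is not established; the following is only a minimal pandas-only sketch with made-up gene names, and newer pandas releases word the message differently while still raising ValueError.

```python
import pandas as pd

# Hypothetical gene names with one duplicate ("GeneA"); not taken from the thread.
names = pd.Index(["GeneA", "GeneB", "GeneA", "GeneC"])
positions = pd.Series(range(len(names)), index=names)

print(names.is_unique)                  # False
print(list(names[names.duplicated()]))  # ['GeneA']

# Reindexing a Series whose index has duplicates raises ValueError,
# even when the requested labels are themselves unique.
try:
    positions.reindex(["GeneB", "GeneC"])
    raised = False
except ValueError as err:
    raised = True
    print("reindex failed:", err)

# Keeping only the first occurrence of each label makes the lookup well defined.
unique_positions = positions[~positions.index.duplicated(keep="first")]
selected = unique_positions.reindex(["GeneB", "GeneC"])
print(list(selected))                   # [1, 3]
```

Checking is_unique on the names of the object being subset is a quick first diagnostic before looking further.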
To make sure nothing is wrong with the cached data, re-load your ldata with ldata = scv.read(path2, cache=False) and try subsetting again.
After re-loading the data, when running ldata[:, ldata.var_names[:5]] I'm finally getting an output:
View of AnnData object with n_obs × n_vars = 16971 × 5
obs: 'Clusters', 'SampleID', 'SampleRef', '_X', '_Y', 'sample_batch', 'initial_size_spliced', 'initial_size_unspliced', 'initial_size', 'n_counts'
var: 'Accession', 'Chromosome', 'End', 'Start', 'Strand'
layers: 'ambiguous', 'matrix', 'spliced', 'unspliced'
Apparently it was the cache. Now you can redo the merge.
Alright, so if I run scv.pp.filter_and_normalize(ldata, min_shared_counts=20) before attempting to merge adata and ldata, it works great and I can progress. If, however, I don't run it, I run into the same indexing error when attempting to merge.
Following your notebook examples, I've finally been able to generate velocity plots.
I did, however, run into an error when attempting to run rank_velocity_genes, similar to issue #64. I'm running scvelo 0.1.17, though I had updated through PyPI. Does it have to be installed from source to fix the error?
Happy to send my notebook in case you want to check that I didn't stuff anything up along the way. Thanks for all the help so far!
Not sure why filtering fixes this; it's probably related to anndata.
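One way to rule duplicate names in or out before merging is to build the shared gene list from de-duplicated name indexes. This is only a sketch using plain pandas indexes: the thread does not show how common_vars was computed, unique_common_names is a hypothetical helper, and the gene names are made up.

```python
import pandas as pd


def unique_common_names(left, right):
    """Return the labels present in both inputs, with duplicates dropped first."""
    left = pd.Index(left)
    right = pd.Index(right)
    # Keep the first occurrence of every label on each side.
    left_unique = left[~left.duplicated(keep="first")]
    right_unique = right[~right.duplicated(keep="first")]
    return left_unique.intersection(right_unique, sort=False)


# Hypothetical variable names standing in for adata.var_names and ldata.var_names.
adata_names = ["GeneA", "GeneB", "GeneC", "GeneD"]
ldata_names = ["GeneB", "GeneA", "GeneB", "GeneE"]

common_vars = unique_common_names(adata_names, ldata_names)
print(sorted(common_vars))  # ['GeneA', 'GeneB']
```

If both objects are then subset with such a list and the error persists, the names being requested are not the source of the duplication, and the index of the object itself is worth inspecting.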
Just released v0.1.18 (with rank_velocity_genes being stable).
Hi, I'm getting the following error when trying to initially import scvelo through import scvelo as scv (currently running scvelo 0.1.16, scanpy 1.4, numpy 1.15.4). Any help would be greatly appreciated, thanks!
TypeError Traceback (most recent call last)