scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

UMAP is unreproducible between 3 Windows PCs #2114

Open hyjforesight opened 2 years ago

hyjforesight commented 2 years ago

Hello Scanpy, this bug is quite weird. It started after we installed Anaconda3-v2021.11 on 3 individual Windows PCs. We ran the same dataset with the same code on all three machines, but it produces 3 different UMAPs. The code is below. The UMAP from Windows PC2 is consistent with our previous UMAPs generated on PC1 and PC2 in November with Anaconda3-v2021.05. Could you please help us with this issue? Thanks! Best, YJ

Minimal code sample (that we can copy&paste without having any data)

import numpy as np
import pandas as pd
import scanpy as sc
import scanpy.external as sce
import scipy
sc.settings.verbosity = 3
sc.logging.print_header()
sc.set_figure_params(dpi=100, dpi_save=600)

adata = sc.read_loom(filename='C:/Users/Park_Lab/Documents/Tumor.loom')
adata.var_names_make_unique()
adata
sc.pl.highest_expr_genes(adata, n_top=20)
sc.pp.filter_cells(adata, min_genes=100)
sc.pp.filter_genes(adata, min_cells=25)
adata.var['mt'] = adata.var_names.str.startswith('mt-')
adata.var['rpl'] = adata.var_names.str.startswith('Rpl')
adata.var['rps'] = adata.var_names.str.startswith('Rps')
adata
sc.pp.calculate_qc_metrics(adata, qc_vars=['mt','rpl','rps'], percent_top=None, log1p=False, inplace=True)
sc.pl.violin(adata, keys=['n_genes_by_counts', 'total_counts', 'pct_counts_mt','pct_counts_rpl','pct_counts_rps'], jitter=0.4, multi_panel=True)
adata
sc.pl.scatter(adata, x='total_counts', y='pct_counts_mt')
sc.pl.scatter(adata, x='total_counts', y='pct_counts_rpl')
sc.pl.scatter(adata, x='total_counts', y='pct_counts_rps')
sc.pl.scatter(adata, x='total_counts', y='n_genes_by_counts')
adata = adata[adata.obs.n_genes_by_counts < 6000, :]
adata = adata[adata.obs.pct_counts_mt < 50, :]
adata = adata[adata.obs.pct_counts_rpl < 50, :]
adata = adata[adata.obs.pct_counts_rps < 50, :]
adata
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
adata
sc.pp.highly_variable_genes(adata, n_top_genes=5000)
sc.pl.highly_variable_genes(adata)
print(sum(adata.var.highly_variable))
adata
adata.raw=adata
adata = adata[:, adata.var.highly_variable]
adata
sc.pp.regress_out(adata, keys=['pct_counts_mt','pct_counts_rpl','pct_counts_rps'], n_jobs=16)
sc.pp.scale(adata, max_value=10)
adata
sc.tl.pca(adata, svd_solver='arpack')
sc.pp.neighbors(adata, n_pcs=50, knn=True)
sc.tl.leiden(adata, resolution=1)
sc.tl.umap(adata)
adata
sc.pl.umap(adata, color=['leiden'], legend_loc='on data', frameon=False, title='', use_raw=False)
sc.pl.umap(adata, color=['leiden'], legend_loc='', frameon=False, title='', save='ACT.pdf', use_raw=False)

UMAP of Windows PC1 (i7-1065G7, Windows 11 x64 21H2): [image]
UMAP of Windows PC2 (i7-10700, Windows 10 x64 1809): [image]
UMAP of Windows PC3 (Xeon Silver 4210, Windows 10 x64 1809): [image]

Versions

All three PCs have identical versions of the following packages:

| Package | Version |
| -- | -- |
| adjustText | 0.7.3 |
| aiohttp | 3.8.1 |
| aiosignal | 1.2.0 |
| anndata | 0.7.8 |
| anyio | 2.2.0 |
| arboreto | 0.1.6 |
| argon2-cffi | 20.1.0 |
| async-generator | 1.1 |
| async-timeout | 4.0.2 |
| Babel | 2.9.1 |
| backcall | 0.2.0 |
| bleach | 4.1.0 |
| bokeh | 2.4.2 |
| boltons | 21.0.0 |
| brotlipy | 0.7.0 |
| cellrank | 1.5.1 |
| certifi | 2020.6.20 |
| cffi | 1.15.0 |
| charset-normalizer | 2.0.4 |
| click | 8.0.3 |
| cloudpickle | 2.0.0 |
| colorama | 0.4.4 |
| cryptography | 36.0.0 |
| ctxcore | 0.1.1 |
| cycler | 0.11.0 |
| cytoolz | 0.11.0 |
| dask | 2022.1.0 |
| debugpy | 1.5.1 |
| decorator | 5.1.0 |
| defusedxml | 0.7.1 |
| dill | 0.3.4 |
| distributed | 2022.1.0 |
| docrep | 0.3.2 |
| entrypoints | 0.3 |
| et-xmlfile | 1.1.0 |
| fonttools | 4.28.5 |
| fsspec | 2022.1.0 |
| future | 0.18.2 |
| h5py | 3.6.0 |
| HeapDict | 1.0.1 |
| idna | 3.3 |
| igraph | 0.9.9 |
| importlib-metadata | 4.8.2 |
| interlap | 0.2.7 |
| ipykernel | 6.4.1 |
| ipython | 7.29.0 |
| ipython-genutils | 0.2.0 |
| ipywidgets | 7.6.5 |
| jedi | 0.18.0 |
| Jinja2 | 3.0.2 |
| joblib | 1.1.0 |
| json5 | 0.9.6 |
| jsonschema | 3.2.0 |
| jupyter-client | 7.1.0 |
| jupyter-core | 4.9.1 |
| jupyter-server | 1.4.1 |
| jupyterlab | 3.2.1 |
| jupyterlab-pygments | 0.1.2 |
| jupyterlab-server | 2.10.2 |
| jupyterlab-widgets | 1.0.2 |
| kiwisolver | 1.3.2 |
| leidenalg | 0.8.8 |
| llvmlite | 0.38.0 |
| locket | 0.2.1 |
| loompy | 3.0.6 |
| MarkupSafe | 2.0.1 |
| matplotlib | 3.5.1 |
| matplotlib-inline | 0.1.2 |
| mistune | 0.8.4 |
| msgpack | 1.0.3 |
| multidict | 5.2.0 |
| multiprocessing-on-dill | 3.5.0a4 |
| natsort | 8.0.2 |
| nbclassic | 0.2.6 |
| nbclient | 0.5.3 |
| nbconvert | 6.1.0 |
| nbformat | 5.1.3 |
| nest-asyncio | 1.5.1 |
| networkx | 2.6.3 |
| notebook | 6.4.6 |
| numba | 0.55.0 |
| numexpr | 2.8.1 |
| numpy | 1.21.5 |
| numpy-groupies | 0.9.14 |
| openpyxl | 3.0.9 |
| packaging | 21.3 |
| pandas | 1.3.5 |
| pandocfilters | 1.4.3 |
| parso | 0.8.3 |
| partd | 1.2.0 |
| patsy | 0.5.2 |
| pickleshare | 0.7.5 |
| Pillow | 9.0.0 |
| pip | 21.2.2 |
| progressbar2 | 4.0.0 |
| prometheus-client | 0.12.0 |
| prompt-toolkit | 3.0.20 |
| psutil | 5.9.0 |
| pyarrow | 0.16.0 |
| pycparser | 2.21 |
| pygam | 0.8.0 |
| Pygments | 2.10.0 |
| pygpcca | 1.0.3 |
| pynndescent | 0.5.5 |
| pyOpenSSL | 21.0.0 |
| pyparsing | 3.0.4 |
| pyrsistent | 0.18.0 |
| pyscenic | 0.11.2 |
| PySocks | 1.7.1 |
| python-dateutil | 2.8.2 |
| python-igraph | 0.9.9 |
| python-utils | 3.1.0 |
| pytz | 2021.3 |
| pywin32 | 302 |
| pywinpty | 0.5.7 |
| PyYAML | 6 |
| pyzmq | 22.3.0 |
| requests | 2.27.1 |
| scanpy | 1.8.2 |
| scikit-learn | 1.0.2 |
| scipy | 1.7.3 |
| scvelo | 0.2.4 |
| seaborn | 0.11.2 |
| Send2Trash | 1.8.0 |
| setuptools | 58.0.4 |
| sinfo | 0.3.4 |
| six | 1.16.0 |
| sniffio | 1.2.0 |
| sortedcontainers | 2.4.0 |
| statsmodels | 0.13.1 |
| stdlib-list | 0.8.0 |
| tables | 3.6.1 |
| tblib | 1.7.0 |
| terminado | 0.9.4 |
| testpath | 0.5.0 |
| texttable | 1.6.4 |
| threadpoolctl | 3.0.0 |
| toolz | 0.11.1 |
| tornado | 6.1 |
| tqdm | 4.62.3 |
| traitlets | 5.1.1 |
| typing_extensions | 4.0.1 |
| umap-learn | 0.5.2 |
| urllib3 | 1.26.7 |
| wcwidth | 0.2.5 |
| webencodings | 0.5.1 |
| wheel | 0.37.1 |
| widgetsnbextension | 3.5.2 |
| win-inet-pton | 1.1.0 |
| wincertstore | 0.2 |
| wrapt | 1.13.3 |
| xlrd | 1.2.0 |
| yarl | 1.7.2 |
| zict | 2.0.0 |
| zipp | 3.7.0 |

These packages are different among the 3 PCs (a dash means the package is not installed on that PC):

| Package | Windows PC1 | Windows PC2 | Windows PC3 |
| -- | -- | -- | -- |
| attrs | 21.4.0 | 21.2.0 | 21.4.0 |
| frozendict | 2.2.0 | 2.1.3 | 2.2.0 |
| frozenlist | 1.3.0 | 1.2.0 | 1.3.0 |
| gtfparse | - | 1.2.1 | - |
| infercnvpy | - | 0.2.0 | - |
| pycairo | - | 1.20.1 | - |
| pytoml | - | 0.1.21 | - |
| scikit-misc | - | 0.1.4 | - |
| setuptools-scm | - | 6.3.2 | - |
| tomli | - | 2.0.0 | - |
hyjforesight commented 2 years ago

I installed these packages on PC1, but the UMAP is still not consistent with the others:

| Package | Version |
| -- | -- |
| attrs | 21.2.0 |
| frozendict | 2.1.3 |
| frozenlist | 1.2.0 |
| gtfparse | 1.2.1 |
| infercnvpy | 0.2.0 |
| pycairo | 1.20.1 |
| pytoml | 0.1.21 |
| scikit-misc | 0.1.4 |
| setuptools-scm | 6.3.2 |
| tomli | 2.0.0 |

ivirshup commented 2 years ago

Do you know at which step of the script the results start differing? That would help narrow down where the issue is occurring. If not, it would be useful if you could share objects with differing results from the various machines. You could use sc.datasets.pbmc3k() for this (if your data is private).
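
One way to narrow that down would be to print a checksum of adata.X after each major step on every machine and compare where the digests first diverge. A rough sketch (the checksum helper and the choice of checkpoints here are illustrative, not part of your script):

# Sketch: hash adata.X after each pipeline step and compare across machines.
from hashlib import sha256

import numpy as np
import scanpy as sc
import scipy.sparse as sp

def checksum(adata, label):
    # Digest of adata.X (densified if sparse) for cross-machine comparison
    X = adata.X
    if sp.issparse(X):
        X = X.toarray()
    print(label, sha256(np.ascontiguousarray(X)).hexdigest())

adata = sc.datasets.pbmc3k()  # public data, so the objects can be shared
sc.pp.filter_cells(adata, min_genes=100)
sc.pp.filter_genes(adata, min_cells=25)
checksum(adata, "after filtering:")
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
checksum(adata, "after normalize/log1p:")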

Would you also be able to share the output of numba -s from each of these environments? Different CPUs can give different results from numba code due to the features available.

Ping resident windows expert @Koncopd

hyjforesight commented 2 years ago

Hello @ivirshup @Koncopd, I uploaded the objects from these 3 PCs here (https://github.com/hyjforesight/Scanpy.git), including the numba -s output. Could you please check them when you have time? Thanks! Best, YJ

Koncopd commented 2 years ago

@ivirshup, is UMAP actually reproducible on Linux? I remember there were problems there too.

ivirshup commented 2 years ago

Most of the time? There is an issue with fairly old CPUs (no AVX2, so like >5 years), but that was the last I saw.

My guess is that there are more reproducibility issues on Windows than Linux, likely because it is tested less.

I would like to confirm that it's the UMAP and not the PCA, though. After that, it could be worth checking the threading (e.g. reduce to one thread, though I thought UMAP should be as reproducible as possible w.r.t. threading by default).
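
For example, comparing the stored embeddings of two of the saved objects would show whether X_pca already differs or only X_umap does. A sketch (assuming the objects are the uploaded PC1.h5ad / PC2.h5ad files and contain the same cells in the same order):

# Sketch: compare PCA and UMAP embeddings between two machines' results.
import numpy as np
import scanpy as sc

pc1 = sc.read_h5ad("PC1.h5ad")
pc2 = sc.read_h5ad("PC2.h5ad")

for key in ["X_pca", "X_umap"]:
    a, b = pc1.obsm[key], pc2.obsm[key]
    print(key, "identical:", np.array_equal(a, b), "max abs diff:", np.abs(a - b).max())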

ivirshup commented 2 years ago

@hyjforesight could you save the adata from the end of each of those notebooks and upload it as well?

hyjforesight commented 2 years ago

Hello @ivirshup @Koncopd, I uploaded the three h5ad files to my repo (https://github.com/hyjforesight/Scanpy.git). Sorry, they're big, so I split them into several multi-part archives. Thanks! Best, YJ

[image]

ivirshup commented 2 years ago

@hyjforesight, I'm not sure how to read those files.

Could you provide some instructions or figure out a different way to share those? Also, are you using one of the compression filters when you write your file?

hyjforesight commented 2 years ago

Hello @ivirshup, can you unzip these files? Please download all 7 files (.zip and .z01-.z06) and use WinZip, WinRAR, 7-Zip, or Bandizip to unzip the Downloads.zip file. It contains the three h5ad files. I also shared these three h5ad files with your Gmail.

I used adata.write('C:/Users/Park_Lab/Documents/PC1.h5ad', compression='gzip') to write these files and adata = sc.read('C:/Users/Park_Lab/Documents/PC1.h5ad') to read them.

Thanks!

ivirshup commented 2 years ago

X

So, the first and third results have the same values of X:

# pc1, pc2, pc3: the three uploaded objects (e.g. pc1 = sc.read_h5ad("PC1.h5ad"))
import numpy as np

assert np.array_equal(pc1.X, pc3.X)

The second is only slightly different:

diff = pc2.X - pc1.X
diff[diff != 0]
array([7.450581e-09], dtype=float32)

This might be fixable by running regress_out with a fixed number of jobs.
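
For example (a sketch, reusing the keys from your script; running single-threaded trades speed for a summation order that does not depend on how the work is split across processes):

# Sketch: run the regression single-threaded for a deterministic result.
import scanpy as sc

sc.pp.regress_out(adata, keys=['pct_counts_mt', 'pct_counts_rpl', 'pct_counts_rps'], n_jobs=1)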

PCA

The results of the PCA differ more significantly, but under the hood this should just be calling scikit-learn's implementation.

Could you try calling that directly and letting us know the results?

E.g.

from sklearn.decomposition import PCA

pca = PCA(n_components=50, svd_solver="arpack", random_state=0)
result = pca.fit_transform(adata.X)

If this doesn't give consistent results on your machines, the issue is upstream in scikit-learn.

If you need reproducibility now, I would suggest switching out the solver for the PCA and/or using 64-bit values for X.
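
Something like this, for example (a sketch; "randomized" is just one alternative solver, and the cast assumes X is a dense array, as it is after sc.pp.scale):

# Sketch: 64-bit X plus a different SVD solver for the PCA.
import scanpy as sc

adata.X = adata.X.astype('float64')          # use 64-bit values for X
sc.tl.pca(adata, svd_solver='randomized')    # instead of 'arpack'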

hyjforesight commented 2 years ago

Hello @ivirshup Thanks for checking these data!

Could you try calling that directly and letting us know the results?

Where shall I call this code? I called it at the end of the notebook. I uploaded the results to Google Drive and shared them with you via Gmail. Please check.

I would suggest switching out the solver for the PCA and/ or using 64 bit values for X.

To switch out the solver, did you mean that I should delete svd_solver='arpack' in sc.tl.pca(adata, svd_solver='arpack')? To use 64-bit values, shall I call adata.X = adata.X.astype('float64')?

Thanks! Best, YJ

ivirshup commented 2 years ago

Where shall I call this code?

I would call it on the exact same X on different machines, e.g. something like:

from hashlib import sha256
import anndata as ad
from sklearn.decomposition import PCA

adata = ad.read_h5ad("PC1.h5ad")
print(sha256(adata.X).hexdigest())

pca = PCA(n_components=50, svd_solver="arpack", random_state=0)

print(sha256(pca.fit_transform(adata.X)).hexdigest())

The first hash should be the same on all systems, while I would expect the second to vary if the PCA isn't reproducible.

To switch out the solver, did you mean that I should delete svd_solver='arpack' in sc.tl.pca(adata, svd_solver='arpack')?

I think you should specify a different one, like "lobpcg" or "randomized", but you can pass this directly to PCA(..., svd_solver=...) too.

To use 64-bit values, shall I call adata.X = adata.X.astype('float64')?

Yes.

hyjforesight commented 2 years ago

Hello @ivirshup, I ran into errors.

from hashlib import sha256
import anndata as ad
from sklearn.decomposition import PCA

adata = ad.read_h5ad("C:/Users/Park_Lab/Documents/PC1.h5ad")
print(sha256(adata.X).hexdigest())

pca = PCA(n_components=50, svd_solver="arpack", random_state=0)

print(sha256(pca.fit_transform(adata.X)).hexdigest())

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_3908/912381655.py in <module>
      8 pca = PCA(n_components=50, svd_solver="arpack", random_state=0)
      9 
---> 10 print(sha256(pca.fit_transform(adata.X)).hexdigest())

ValueError: ndarray is not C-contiguous

I changed it to this, but got the same error:


from hashlib import sha256
import anndata as ad
from sklearn.decomposition import PCA

adata = ad.read_h5ad("C:/Users/Park_Lab/Documents/PC1.h5ad")
print(sha256(adata.X).hexdigest())

pca = PCA(n_components=50, svd_solver="arpack", random_state=0)
a=adata.X.copy(order='C')

print(sha256(pca.fit_transform(a)).hexdigest())

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_3908/1058352035.py in <module>
      8 pca = PCA(n_components=50, svd_solver="arpack", random_state=0)
      9 a=adata.X.copy(order='C')
---> 10 print(sha256(pca.fit_transform(a)).hexdigest())

ValueError: ndarray is not C-contiguous
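
One possible workaround (a sketch, not verified on these machines): sha256 needs a C-contiguous buffer, and it is apparently the output of fit_transform rather than adata.X that is not contiguous here, so hashing contiguous copies of both arrays should get past the error:

# Sketch: hash C-contiguous copies so hashlib accepts the buffers.
from hashlib import sha256

import numpy as np
import anndata as ad
from sklearn.decomposition import PCA

adata = ad.read_h5ad("C:/Users/Park_Lab/Documents/PC1.h5ad")
print(sha256(np.ascontiguousarray(adata.X)).hexdigest())

pca = PCA(n_components=50, svd_solver="arpack", random_state=0)
print(sha256(np.ascontiguousarray(pca.fit_transform(adata.X))).hexdigest())
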
hyjforesight commented 2 years ago

Hello @ivirshup, sorry for asking, but is there any update on this issue? Thanks!