Open jesswhitts opened 1 month ago
Dear @jesswhitts
I suspect it might be the LDA. I'll have a look soon.
Hi @Zethson - any update on this?
Sorry, I might have time for this on Friday. If it's urgent for you, you can check all function calls individually whether you always get the same results or not. Then you can tell me where the differences come from and I can try to get rid of them.
I'm very busy at the moment.
Hi @Zethson
I think the issue is with the mixscape function itself, not the LDA. Across 4 runs there are a few cells which are classified differently. Does the random_state parameter need setting in GaussianMixture maybe? I tried setting a global numpy random state, but this didn't work...
Thanks! @Lilly-May will have a look by the end of the week. Apologies again for taking so long!
Hi @Lilly-May,
Have done some of my own testing, and think that setting the GaussianMixture random_state does indeed fix this issue, I amended the following parts of the mixscape function accordingly:
def mixscape(
self,
adata: AnnData,
labels: str,
control: str,
new_class_name: str | None = "mixscape_class",
min_de_genes: int | None = 5,
layer: str | None = None,
logfc_threshold: float | None = 0.25,
iter_num: int | None = 10,
random_init: int | None = 1,
split_by: str | None = None,
pval_cutoff: float | None = 5e-2,
perturbation_type: str | None = "KO",
copy: bool | None = False,
):
mm = GaussianMixture(
n_components=2,
random_state=random_init,
covariance_type="spherical",
means_init=means_init,
precisions_init=precisions_init,
).fit(np.asarray(pvec).reshape(-1, 1))
I hope these changes are appropriate :)
Best, Jess
Report
Hello,
When running the lda visualisation plot as part of the mixscape workflow, I am not getting consistent results between runs, the plot looks slightly different (although very similar). Is a random seed used at any point, and is there a way to set this so I can get reproducible results?
Code used:
ms = pt.tl.Mixscape() ms.perturbation_signature(mdata["rna"], pert_key="perturbation", control="NT")
adata_pert = mdata["rna"].copy() adata_pert.X = adata_pert.layers["X_pert"]
sc.pp.pca(adata_pert) sc.pp.neighbors(adata_pert, metric="cosine") sc.tl.umap(adata_pert)
ms.mixscape(adata=mdata["rna"], control="NT", labels="guide_allocation", layer="X_pert", logfc_threshold = 0.15, iter_num=100, min_de_genes=5) ms.lda(adata=mdata["rna"], control="NT", labels="guide_allocation", layer="X_pert") ms.plot_lda(adata=mdata["rna"], control="NT")
Version information
adjustText 1.2.0 anndata 0.10.8 matplotlib 3.9.2 mudata 0.3.1 muon 0.1.6 numpy 1.26.4 pandas 2.2.2 pertpy 0.9.4 scanpy 1.10.3 session_info 1.0.0
PIL 10.4.0 absl NA appnope 0.1.4 asttokens NA attr 24.2.0 blitzgsea NA certifi 2024.08.30 charset_normalizer 3.3.2 chex 0.1.86 comm 0.2.2 contextlib2 NA cycler 0.12.1 cython_runtime NA dateutil 2.9.0 debugpy 1.8.5 decorator 5.1.1 decoupler 1.8.0 docrep 0.3.2 equinox 0.11.7 etils 1.9.4 executing 2.1.0 filelock 3.16.1 flax 0.9.0 fsspec 2024.9.0 h5py 3.11.0 idna 3.10 igraph 0.11.6 ipykernel 6.29.5 jax 0.4.33 jaxlib 0.4.33 jaxopt NA jaxtyping 0.2.34 jedi 0.19.1 joblib 1.4.2 kiwisolver 1.4.7 lamin_utils 0.13.4 legacy_api_wrap NA leidenalg 0.10.2 lightning 2.4.0 lightning_fabric 2.4.0 lightning_utilities 0.11.7 lineax 0.0.5 llvmlite 0.43.0 matplotlib_inline 0.1.7 ml_collections NA ml_dtypes 0.5.0 mpl_toolkits NA mpmath 1.3.0 msgpack 1.1.0 multipledispatch 0.6.0 natsort 8.4.0 numba 0.60.0 numpyro 0.15.3 opt_einsum v3.3.0 optax 0.2.3 ott 0.4.8 packaging 24.1 parso 0.8.4 patsy 0.5.6 pickleshare 0.7.5 platformdirs 4.3.6 ply 3.11 prompt_toolkit 3.0.47 psutil 6.0.0 pubchempy 1.0.4 pure_eval 0.2.3 pyarrow 17.0.0 pydev_ipython NA pydevconsole NA pydevd 2.9.5 pydevd_file_utils NA pydevd_plugins NA pydevd_tracing NA pygments 2.18.0 pynndescent 0.5.13 pyomo 6.8.0 pyparsing 3.1.4 pyro 1.9.1 pytorch_lightning 2.4.0 pytz 2024.2 requests 2.32.3 rich NA scipy 1.14.1 scvi 1.1.6.post2 seaborn 0.13.2 six 1.16.0 sklearn 1.5.2 skmisc 0.5.1 sparsecca 0.3.1 stack_data 0.6.2 statsmodels 0.14.3 sympy 1.13.3 texttable 1.7.0 threadpoolctl 3.5.0 toolz 0.12.1 torch 2.4.1 torchgen NA torchmetrics 1.4.2 tornado 6.4.1 tqdm 4.66.5 traitlets 5.14.3 typeguard NA typing_extensions NA umap 0.5.6 urllib3 2.2.3 wcwidth 0.2.13 yaml 6.0.2 zmq 26.2.0
IPython 8.27.0 jupyter_client 8.6.3 jupyter_core 5.7.2
Python 3.12.6 | packaged by conda-forge | (main, Sep 11 2024, 04:55:15) [Clang 17.0.6 ] macOS-13.0.1-arm64-arm-64bit