slowkow / harmonypy

🎼 Integrate multiple high-dimensional datasets with fuzzy k-means and locally linear adjustments.
https://portals.broadinstitute.org/harmony/
GNU General Public License v3.0

Error when trying to run Harmony with multiple covariates #4

Closed liboxun closed 4 years ago

liboxun commented 4 years ago

From reading the documentation at https://github.com/immunogenomics/harmony, it seems that passing a list of multiple covariates to the vars_use argument should harmonize the data over all of them. However, when I tried this using:

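import harmonypy as hm

# adata is an AnnData object that already has a PCA embedding
meta_data = adata.obs                    # per-cell metadata
vars_use = ['batch', 'library_name']     # covariates to harmonize over
data_mat = adata.obsm['X_pca']           # cells x PCs matrix
ho = hm.run_harmony(data_mat, meta_data, vars_use)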

where adata is my dataset in AnnData format, I got this error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-91-b088ad1e792b> in <module>
----> 1 ho = hm.run_harmony(data_mat, meta_data, vars_use)

~/.conda/envs/py37_TOC_PAGA/lib/python3.7/site-packages/harmonypy/harmony.py in run_harmony(data_mat, meta_data, vars_use, theta, lamb, sigma, nclust, tau, block_size, max_iter_harmony, max_iter_cluster, epsilon_cluster, epsilon_harmony, plot_convergence, return_object, verbose, reference_values, cluster_prior)
     56     for i in range(len(categories.categories)):
     57         ix = categories == categories.categories[i]
---> 58         phi[i,ix] = 1
     59 
     60     N_b = phi.sum(axis = 1)

IndexError: boolean index did not match indexed array along dimension 1; dimension is 23381 but corresponding boolean dimension is 2

Was there something I overlooked? Please advise. Thanks!

slowkow commented 4 years ago

Thanks for reporting this issue!

You're running harmonypy version 0.0.3, but that version does not support multiple covariates.

Here's the code from version 0.0.3:

https://github.com/slowkow/harmonypy/blob/76c5748c80a28ea5a29398de03adc4d41f3ef8fe/harmonypy/harmony.py#L56-L58

The new code should support multiple covariates. I just pushed version 0.0.4 to PyPI.
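
For readers wondering what "supporting multiple covariates" involves: the traceback above shows a boolean mask of length 2 being applied along an axis of 23381 cells, which suggests the mask was built over the covariate names rather than over the cells. A minimal sketch of one way to build a (levels x cells) indicator matrix from several covariates is shown here. The helper name `one_hot_covariates` and the toy data are hypothetical, and this is not necessarily how harmonypy 0.0.4 implements it.

```python
import numpy as np
import pandas as pd


def one_hot_covariates(meta_data, vars_use):
    """Hypothetical helper: one row per covariate level, one column per cell."""
    if isinstance(vars_use, str):
        vars_use = [vars_use]  # accept a single covariate name as well
    # One-hot encode every requested column, whatever its dtype.
    dummies = pd.get_dummies(meta_data[vars_use], columns=vars_use)
    # Transpose so that rows are levels and columns are cells.
    return dummies.to_numpy(dtype=float).T


# Toy metadata for 5 cells (made up for illustration).
meta_data = pd.DataFrame({
    "batch": ["a", "a", "b", "b", "c"],
    "library_name": ["x", "y", "x", "y", "x"],
})

phi = one_hot_covariates(meta_data, ["batch", "library_name"])
N_b = phi.sum(axis=1)  # number of cells in each covariate level

print(phi.shape)
print(N_b)
```

With two covariates, each cell (column) has exactly two ones, one for its batch and one for its library, so the row sums give the per-level cell counts.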

I started tracking changes in CHANGELOG.md, so I hope that helps to see what has changed between versions.

Right now, you should be able to upgrade to 0.0.4 with:

pip install --upgrade harmonypy

If you want to try the latest code available on GitHub, try:

pip install git+https://github.com/slowkow/harmonypy
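
To confirm which version actually ended up in the active environment, something like the following sketch can help. It uses only the standard library and assumes Python 3.8 or newer; on Python 3.7, `pip show harmonypy` reports the same information. The helper name `installed_version` is made up for this example.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None if harmonypy is not installed here.
print("harmonypy:", installed_version("harmonypy"))
```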

Please let me know if you run into any other issues!

liboxun commented 4 years ago

Appreciate the quick response and the update! I'll post new results when I have them.

liboxun commented 4 years ago

This update (0.0.4) fixed it! Thanks!