dpeerlab / Harmony

Harmony framework for connecting scRNA-seq data from discrete time points
GNU General Public License v2.0

Meaning and function of the `n_components` parameter #13

Open Marius1311 opened 4 years ago

Marius1311 commented 4 years ago

Harmony has an n_components parameter, which the docstring describes as follows:

:param pc_components: Minimum number of principal components to use. Specify `None` to use pre-computed components

That value is used for utils.run_pca, but it is not passed on to scanpy's neighbor computation; see https://github.com/dpeerlab/Harmony/blob/eca0771348e7f1b901f95d1a8fc68d95530e830c/src/harmony/core.py#L67

So I wonder what the significance of that parameter actually is?

Also, I find the default value of 1000 a bit high, since scanpy's default here is much smaller (50, I believe).

ManuSetty commented 4 years ago

The n_components parameter is used at https://github.com/dpeerlab/Harmony/blob/eca0771348e7f1b901f95d1a8fc68d95530e830c/src/harmony/core.py#L48 to compute the principal components.

At line 67, temp already holds the pre-computed principal components, so the parameter does not need to be passed to the nearest-neighbor computation.
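To illustrate the pattern, here is a minimal NumPy sketch of the two steps. It is not Harmony's actual code: `run_pca_sketch` and `knn_indices` are hypothetical stand-ins for `utils.run_pca` and the scanpy neighbor call, and the random matrix is made-up data. The point is only that the number of components is consumed in the PCA step, so the neighbor step receives an already-reduced matrix and needs no component count of its own.

```python
import numpy as np


def run_pca_sketch(data, n_components):
    """Project the rows of `data` onto the top principal components.

    Hypothetical stand-in for a PCA helper; this is the only place
    where n_components is used.
    """
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, vt.shape[0])
    return centered @ vt[:k].T


def knn_indices(rep, n_neighbors):
    """Brute-force k nearest neighbors on whatever representation is given.

    Note that there is no n_components argument: the input is used as-is.
    """
    sq = (rep ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (rep @ rep.T)
    np.fill_diagonal(d2, np.inf)  # exclude each point from its own neighbors
    return np.argsort(d2, axis=1)[:, :n_neighbors]


rng = np.random.default_rng(0)
counts = rng.normal(size=(200, 30))              # made-up expression matrix

temp = run_pca_sketch(counts, n_components=10)   # n_components used here
neighbors = knn_indices(temp, n_neighbors=5)     # ...and not needed here

print(temp.shape, neighbors.shape)  # (200, 10) (200, 5)
```

In scanpy itself, `sc.pp.neighbors` has `use_rep` and `n_pcs` arguments that control which representation the graph is built on, so the equivalent there is to point the neighbor call at the reduced matrix rather than to pass a component count again.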