yelabucsf / scrna-parameter-estimation

Direct estimation of mean and covariance from single cell RNA seq experiments
MIT License

Error when one condition has absolutely zero counts #35

Open adrianthj opened 1 week ago

adrianthj commented 1 week ago

Hi! Thank you so much for the new DEG tool! I've been trying it on my dataset, and I noticed that it throws an error when one of the conditions has absolutely zero counts. I'm guessing it's because it's trying to divide by zero. Would you recommend adding an arbitrarily small number like 0.0001 to the raw counts of every cell to work around this? I was wondering whether doing that would affect the downstream calculations.

Also, is there a good way to choose the ideal number of bootstrap iterations (num_boot)?

Thank you so much and hope to hear from you soon!

Here's my code for reference! I'm calling some of the functions directly instead of using the wrapper because I want a different filter percentage.

a = (adata.obs['organism'] == 0).sum()
b = (adata.obs['organism'] == 1).sum()
c = (adata.obs['organism']).sum()

d = min(a/c, b/c)

adata.obs['capture_rate'] = 0.25
memento.setup_memento(adata, q_column='capture_rate')
memento.create_groups(adata, label_columns=['organism'])
memento.compute_1d_moments(adata, min_perc_group=d/1.5)
sample_meta = memento.get_groups(adata)[['organism']]
memento.ht_1d_moments(
    adata,
    treatment=sample_meta,
    num_boot=5000,
    verbose=1,
    num_cpus=3)
result_1d = memento.get_1d_ht_result(adata)
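
To see which genes are affected by the zero-count situation described above, one option is to list the genes whose total raw count is zero within a condition. The sketch below only illustrates that check on a toy dense NumPy matrix. The helper name `genes_with_zero_counts_per_group` and the toy data are hypothetical and not part of memento, and a sparse `adata.X` would likely need to be converted or summed differently.

```python
import numpy as np

def genes_with_zero_counts_per_group(counts, labels):
    """Map each group label to the indices of genes with zero total counts.

    counts: (n_cells, n_genes) dense array of raw counts (assumed input)
    labels: (n_cells,) array of condition labels, e.g. 0/1 for 'organism'
    """
    counts = np.asarray(counts)
    labels = np.asarray(labels)
    zero_genes = {}
    for group in np.unique(labels):
        # Sum counts over the cells in this group, one total per gene
        group_totals = counts[labels == group].sum(axis=0)
        zero_genes[group.item()] = np.flatnonzero(group_totals == 0)
    return zero_genes

# Toy example: 4 cells x 3 genes, two conditions (made-up numbers)
counts = np.array([
    [3, 0, 1],
    [2, 0, 0],
    [0, 5, 2],
    [0, 1, 4],
])
labels = np.array([0, 0, 1, 1])
zero_genes = genes_with_zero_counts_per_group(counts, labels)
```

Genes flagged this way could then be inspected or filtered separately, rather than adding a pseudocount to every cell; whether that suits the analysis is a judgment call this sketch does not settle.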