snehamitra / SCARlink

Models aren't attempted for >90% of genes? #12

Open danieljrichard opened 1 week ago

danieljrichard commented 1 week ago

Hello, I've been trying to get SCARlink to work on my datasets. In the Seurat object, I manually set the VariableGenes slot to contain a set of ~1200 marker genes. Preprocessing seems to work, as hvg.txt contains the list of ~1200 genes. However, when I run scarlink itself, the log shows only ~35 genes being attempted, of which only 4 were modelled (the others appear to have been skipped for sparsity reasons).

Am I perhaps doing something wrong? Or is there some silent filter in scarlink that tosses out the majority of variable genes? Any help would be greatly appreciated.

Daniel

snehamitra commented 1 week ago

We don't filter out any genes besides the sparsity threshold. Could you check the number of genes in your coassay_matrix.h5 file?

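import h5py

# Open read-only; the context manager closes the file even if an error occurs.
with h5py.File("output_dir/coassay_matrix.h5", "r") as f:
    keys = list(f.keys())

print(keys)       # names of the top-level entries in the file
print(len(keys))  # how many entries there are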
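
If the count comes back much lower than the ~1200 genes in hvg.txt, it may help to see exactly which names are missing. Below is a minimal sketch of that comparison, not part of SCARlink itself. It assumes hvg.txt holds one gene name per line and that the top-level keys of coassay_matrix.h5 are gene names (the file may also contain non-gene entries, which would show up under "extra"). The helper names `read_gene_list`, `h5_top_level_keys`, and `compare_gene_sets` are hypothetical.

```python
from pathlib import Path


def read_gene_list(path):
    """Read gene names from a text file, assuming one name per line."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def h5_top_level_keys(path):
    """Return the top-level key names of an HDF5 file."""
    import h5py  # imported here so the other helpers work without h5py

    with h5py.File(path, "r") as f:
        return list(f.keys())


def compare_gene_sets(requested, present):
    """Report which requested names are present, missing, or unexpected."""
    requested, present = set(requested), set(present)
    return {
        "shared": sorted(requested & present),
        "missing": sorted(requested - present),
        "extra": sorted(present - requested),
    }
```

A possible way to call it, assuming hvg.txt sits next to the .h5 file in output_dir (adjust the path if yours differs):

```python
report = compare_gene_sets(
    read_gene_list("output_dir/hvg.txt"),
    h5_top_level_keys("output_dir/coassay_matrix.h5"),
)
print(len(report["shared"]), "genes from hvg.txt found in the h5 file")
print(len(report["missing"]), "genes from hvg.txt not found, e.g.", report["missing"][:10])
```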