Closed SidG13 closed 1 year ago
Hi @SidG13,
I've run into the same issue, so I was wondering how you ended up resolving it? In my case the problem arises because I have quite a few cell types with small numbers of cells.
Thanks in advance.
Hi both
The missing values are due to these regions being filtered out, i.e. they don't pass the adjpval_thr (0.05) and log2fc_thr (log2(1.5)) thresholds. You could adapt these thresholds.
However, you say it also causes downstream issues, which is not ideal. I will put a pin in this and check what happens if some values are empty. Preferably the code should check for this, but we might have missed it in certain parts.
Best,
Seppe
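To illustrate the filtering described above, here is a minimal sketch in plain pandas. It is not the library's implementation: the column names `Log2FC` and `Adjusted_pval`, the helper `filter_markers`, the strict comparisons, and the toy values are all assumptions made for illustration. Only the two threshold names and their defaults come from this thread.

```python
import numpy as np
import pandas as pd


def filter_markers(stats, adjpval_thr=0.05, log2fc_thr=np.log2(1.5)):
    """Toy filter: keep regions that pass both thresholds (hypothetical helper)."""
    keep = (stats["Adjusted_pval"] < adjpval_thr) & (stats["Log2FC"] > log2fc_thr)
    return stats.loc[keep]


# Hypothetical test results for one cluster; these are not real data.
cluster_stats = pd.DataFrame(
    {"Log2FC": [0.40, 0.55, 0.90], "Adjusted_pval": [0.030, 0.080, 0.200]},
    index=["Scaffold_1:100-600", "Scaffold_1:900-1400", "Scaffold_2:50-550"],
)

# With the defaults no region passes both thresholds, so the cluster ends up empty.
default_hits = filter_markers(cluster_stats)

# Relaxed thresholds (assumed values) let two regions through.
relaxed_hits = filter_markers(cluster_stats, adjpval_thr=0.1, log2fc_thr=np.log2(1.2))

print(len(default_hits), len(relaxed_hits))
```

Relaxing the thresholds trades specificity for coverage, so regions recovered this way for small clusters are probably best treated with some caution.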
Hey, sorry this is incredibly late. I ended up using the solution below, which skips empty DAR sets with an if statement. This was @SeppeDeWinter's solution from another thread, which I can't find now.
import pyranges as pr
from pycistarget.utils import region_names_to_coordinates
region_sets = {}
region_sets['topics_otsu'] = {}
region_sets['topics_top_3'] = {}
region_sets['DARs'] = {}
for topic in region_bin_topics_otsu.keys():
    regions = region_bin_topics_otsu[topic].index[region_bin_topics_otsu[topic].index.str.startswith(('Scaffold'))] #only keep regions on known chromosomes
    region_sets['topics_otsu'][topic] = pr.PyRanges(region_names_to_coordinates(regions))
for topic in region_bin_topics_top3k.keys():
    regions = region_bin_topics_top3k[topic].index[region_bin_topics_top3k[topic].index.str.startswith(('Scaffold'))] #only keep regions on known chromosomes
    region_sets['topics_top_3'][topic] = pr.PyRanges(region_names_to_coordinates(regions))
for DAR in markers_dict.keys():
    regions = markers_dict[DAR].index[markers_dict[DAR].index.str.startswith(('Scaffold'))] #only keep regions on known chromosomes
    if len(regions) > 0:
        region_sets['DARs'][DAR] = pr.PyRanges(region_names_to_coordinates(regions))
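Because the if statement skips empty clusters silently, it can help to record which ones were dropped before moving on. The following is a small sketch of that idea using only pandas; the helper `split_empty_sets` and the toy dictionary are hypothetical and stand in for `markers_dict`.

```python
import pandas as pd


def split_empty_sets(markers):
    """Return (non-empty entries, sorted names of empty entries). Hypothetical helper."""
    kept = {name: df for name, df in markers.items() if len(df.index) > 0}
    dropped = sorted(set(markers) - set(kept))
    return kept, dropped


# Toy stand-in for markers_dict: clusters "0" and "1" have no regions left.
toy_markers = {
    "0": pd.DataFrame(columns=["Log2FC", "Adjusted_pval"]),
    "1": pd.DataFrame(columns=["Log2FC", "Adjusted_pval"]),
    "2": pd.DataFrame(
        {"Log2FC": [1.2], "Adjusted_pval": [0.01]},
        index=["Scaffold_3:10-510"],
    ),
}

kept, dropped = split_empty_sets(toy_markers)
print("kept:", list(kept), "dropped:", dropped)
```

Knowing which clusters were dropped makes it easier to decide whether to relax the thresholds for them or to leave them out of the downstream analysis.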
Hi, this isn't an issue with the program itself but I'm just trying to understand better ways to move forward with my data.
I'm running SCENIC+ on my own multiome data and it looks fine until I calculate DARs. The imputed sparsity is very low which may be one issue (running the default command):
Imputed accessibility sparsity: 0.00243018816125784
And some of my cell clusters (clusters 0 and 1) have empty values in the dictionary:
Having empty values will cause downstream issues. I was wondering if anyone had any input on why this might be happening, or whether there are reasonable ways to fix it without affecting the data interpretation too much? I'm not too familiar with cisTopic, so any help would be much appreciated! Thanks