AllenInstitute / mfishtools

Building Gene Sets and Mapping mFISH Data

medianExpr[runGenes,kpClust] out of bounds #7

Closed peachgong closed 3 years ago

peachgong commented 3 years ago

First of all, thanks very much for sharing this tool. While running it on some of my data, I got an error from the following command.

clusterDistance <- as.matrix(corDist(medianExpr[runGenes,kpClust]))
Error in medianExpr[runGenes, kpClust] : subscript out of bounds

I checked the inputs: medianExpr is a matrix whose columns are the cell clusters (22 columns) and whose rows are feature names (26214 features). runGenes is a character vector containing 253 specific feature names. kpClust is a character vector containing 11 cluster names. I am pretty new to R, so I'm not sure why this error arises. Any idea?

jeremymiller commented 3 years ago

The most likely issue is that runGenes includes at least one feature name not in medianExpr or kpClust includes at least one name (or index) not in medianExpr.
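A quick way to test this is to compare the requested names against the matrix dimnames. The following is a minimal sketch in base R on toy data; the gene and cluster names are made up for illustration, and only the variable names medianExpr, runGenes, and kpClust come from this thread.

```r
# Toy stand-ins for the objects in the thread (hypothetical values).
medianExpr <- matrix(
  1:12, nrow = 4,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"),
                  c("cl1", "cl2", "cl3"))
)
runGenes <- c("GeneA", "GeneC", "GeneX")  # "GeneX" is absent from medianExpr
kpClust  <- c("cl1", "cl3")

# Names that were requested but do not exist in the matrix dimnames.
missingGenes    <- setdiff(runGenes, rownames(medianExpr))
missingClusters <- setdiff(kpClust, colnames(medianExpr))

print(missingGenes)     # any entry here triggers "subscript out of bounds"
print(missingClusters)

# Subsetting succeeds once only the shared names are used.
okGenes <- intersect(runGenes, rownames(medianExpr))
sub <- medianExpr[okGenes, kpClust, drop = FALSE]
```

If both setdiff() results are empty on the real data, the names match and the cause lies elsewhere; a non-empty result names the offending entries directly.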

peachgong commented 3 years ago

That should be the cause, but I couldn't figure out how it happens.

kpClust is generated as kpClust = sort(unique(cl[kpSamp])) and contains only 11 cluster names, so I manually checked every name in it. They all look correct.

runGenes is generated by filterPanelGenes. I changed almost none of the parameters when using the code from your vignette.

peachgong commented 3 years ago

Solved. Interestingly, when I delete one of the genes from the startingGenePanel, the error is gone. The gene I deleted is one of the most variable genes (top 10) across all the clusters and is actually a good marker gene for a very small cluster.

Does this mean I should always input highly expressed, less variable genes as the startingGene before running filterPanelGenes?

jeremymiller commented 3 years ago

It shouldn't matter what genes are in the starting panel so long as the genes are included in the median table (they may have been filtered earlier).
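One way to guard against this is to drop starting-panel genes that are missing from the median table before the panel is passed on. This is a sketch in base R on toy data, not part of mfishtools; the gene names are hypothetical, and only medianExpr and startingGenePanel are names taken from this thread.

```r
# Toy median table (hypothetical values); rows are genes, columns are clusters.
medianExpr <- matrix(
  1:6, nrow = 3,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), c("cl1", "cl2"))
)
# "GeneZ" stands in for a gene that was filtered out of the table earlier.
startingGenePanel <- c("GeneB", "GeneZ")

# Flag which starting genes are actually rows of the median table.
inTable <- startingGenePanel %in% rownames(medianExpr)
if (!all(inTable)) {
  message("Dropping genes absent from medianExpr: ",
          paste(startingGenePanel[!inTable], collapse = ", "))
}

# Keep only genes that can be safely used as row subscripts.
startingGenePanel <- startingGenePanel[inTable]
```

The message reports which genes were removed, so a marker gene that was unintentionally filtered upstream can be traced back rather than silently lost.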