MarioniLab / DropletUtils

Clone of the Bioconductor repository for the DropletUtils package.
https://bioconductor.org/packages/devel/bioc/html/DropletUtils.html
54 stars 26 forks source link

emptyDropsCellRanger index out of bounds style error #109

Open jamesnemesh opened 3 months ago

jamesnemesh commented 3 months ago

Hi!

I'm getting a crash when running emptyDropsCellRanger.

The call:

DropletUtils::emptyDropsCellRanger(dge)

Where dge is a matrix of cells by genes. I'll note this data set is compatible with your standard emptyDrops(dge) function.

The stack trace:

image

The problem with the data is that the number of cell barcodes in the data set is too low, and the program runs into issues during the call of ambient.FUN.

In particular, the default bounds for the start and end of the ambient cell barcodes: ind.min=45000 ind.max=90000

When this line is run, the function gets into trouble:

ambient[o[min(ind.min, length(totals)):min(ind.max, length(totals))]] <- TRUE

length(totals)=24101

So ambient only has a single positive entry, as the min and max resolve to the same index.

table (ambient)
ambient
FALSE  TRUE 
24100     1 

This later causes the calculated ambient profile to be pretty strange, and it contains NaN values, which are rarely a good thing:

ambient.prob[13080:13090,]
       CDC34        CDC37      CDC37L1   CDC37L1-DT        CDC40        CDC42    CDC42-AS1    CDC42-IT1 
3.130576e-05 3.130576e-05 3.130576e-05 3.130576e-05 3.130576e-05 3.130576e-05 3.130576e-05 3.130576e-05 
    CDC42BPA     CDC42BPB     CDC42BPG 
         NaN 3.130576e-05 3.130576e-05 

Eventually, the code tries

any(still.zero)

And since still.zero contains NAs (inherited from ambient.prob), the result is NA, which produces the reported error at the top of the stack.

I don't expect the program to magically find an answer to this, but it might be nice to check the bounds and throw an exception that the selected indexes are too large for the provided data set.

I'm going to add my own data sanity checks before I run emptyDropsCellRanger as well.

jonathangriffiths commented 3 months ago

Thanks for reporting this edge-case. A warning might be the most appropriate answer. I'll see if I can find the time to put it in