Open maximilianh opened 7 years ago
This effect comes from the way the threshold is determined: the script smooths the differences of the sorted UMI counts with a Savitzky-Golay filter and then looks for the most negative value (the steepest drop) inside the window. If no point inside the window shows a steeper drop than the first point of the window, the script returns the lower bound, so a window of 500 to 5000 yields 500 cells. We also saw this behavior in several of our 10x Genomics runs. As far as we can see, it could be related to the species your samples come from. For human and mouse samples, the described strategy usually works (unless you have a bad sample). The only fix we see is to change the script and use a different thresholding strategy.
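To make the failure mode concrete, here is a minimal pure-Python sketch of the idea, not the actual code of `get_cell_barcodes.py`. The function names, the toy count curves, and the filter width are all made up for illustration. A centered moving average stands in for the Savitzky-Golay filter (a low-order Savitzky-Golay filter with a symmetric window reduces to one away from the edges).

```python
def moving_average(values, width):
    """Centered moving average; the window shrinks at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def estimate_num_cells(sorted_counts, window, width=5):
    """Return the rank inside `window` where the smoothed count curve drops fastest.

    `sorted_counts` must be sorted in descending order. This is a hypothetical
    helper that mimics the strategy described above, not the original script.
    """
    # First differences of a descending curve are all <= 0.
    diffs = [b - a for a, b in zip(sorted_counts, sorted_counts[1:])]
    smoothed = moving_average(diffs, width)
    lo, hi = window
    segment = smoothed[lo:hi]
    # The most negative smoothed difference marks the steepest drop.
    return lo + segment.index(min(segment))


# Toy curve with a clear knee: about 50 real cells, then background barcodes.
with_knee = [1000 - i for i in range(50)] + [10] * 200

# Toy curve without a knee: it keeps flattening, so the steepest drop inside
# any window is always at the window's first position.
no_knee = [10000.0 / (i + 1) for i in range(250)]

for window in [(10, 200), (20, 200)]:
    print(window, estimate_num_cells(with_knee, window),
          estimate_num_cells(no_knee, window))
```

For the curve with a knee, the estimate stays near rank 50 whichever window is used. For the curve without a knee, the estimate simply follows the lower bound of the window, which matches the behavior reported in this issue.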
We have written a Perl wrapper and adapted the Python scripts from pachterlab to work with Chromium 10x chemistry v1 and v2. Everything is available on our GitHub: https://github.com/vibbits/sc_read_kallisto_wrapper. The thresholding strategy is now adopted from CellRanger's thresholding strategy. Future work: we could look into the potentially faster code from the other fork and adapt it to work with chemistry v1 and v2, too.
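For readers unfamiliar with that approach, below is a rough sketch of the cutoff rule as it was documented for older Cell Ranger releases (before 3.0): take the expected number of cells N, find the 99th percentile of UMI counts among the top N barcodes, and call every barcode with more than one tenth of that value a cell. The function name, the parameter names, and the nearest-rank percentile are assumptions for illustration; this is not code from the wrapper or from Cell Ranger itself.

```python
def call_cells(umi_counts, expected_cells, quantile=0.99, ratio=0.1):
    """Count barcodes whose UMI total exceeds `ratio` times a robust maximum.

    Hypothetical helper: the robust maximum is the `quantile` of the top
    `expected_cells` barcodes, approximated here by nearest rank.
    """
    ranked = sorted(umi_counts, reverse=True)
    top = ranked[:expected_cells]
    if not top:
        return 0
    # In descending order, the 99th percentile sits about 1% of the way in.
    idx = int(round((1.0 - quantile) * (len(top) - 1)))
    robust_max = top[idx]
    cutoff = robust_max * ratio
    return sum(1 for count in ranked if count > cutoff)


# Toy data: about 50 real cells followed by 200 background barcodes.
toy_counts = [1000 - i for i in range(50)] + [10] * 200

for expected in (30, 100):
    print(expected, call_cells(toy_counts, expected_cells=expected))
```

Because the cutoff is tied to the count level of the strongest barcodes rather than to the slope of the curve, the result does not depend on a search window, and a moderately wrong `expected_cells` still gives the same answer on this toy data.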
Hi, I'm confused by this output of the first script:
get_cell_barcodes.py
It sounds fishy to me that it always detects whatever I specify as the lowest number of cells. If I set the window to [500,5000] cells, it will find 500 cells. If I set it to [100,5000], it will find 100 cells. Is this expected? If not, any ideas what I'm doing wrong?
thanks! Max