Open michael-weinstein opened 7 years ago
You could try increasing the barcode frequency threshold that triggers the creation of an output file. See line 47 of: https://github.com/aryeelab/umi/blob/3fef4c92becda4c2b4b6085555415f80c1dd858e/demultiplex.py (I can't remember off-hand whether it's possible to set this from the command line.)
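To illustrate the idea of a frequency threshold, here is a minimal sketch, not the code in demultiplex.py. It assumes the barcodes have already been extracted from the reads into a list; the function name `barcodes_above_threshold` and the parameter `min_reads` are hypothetical names chosen for this example.

```python
from collections import Counter


def barcodes_above_threshold(barcodes, min_reads):
    """Return the set of barcodes observed at least `min_reads` times.

    Only barcodes in this set would get their own output file, so a
    higher threshold means fewer files open at once.
    """
    counts = Counter(barcodes)
    return {bc for bc, n in counts.items() if n >= min_reads}


# Toy data: two real barcodes plus one likely sequencing error.
observed = ["AAGG"] * 5 + ["CCTT"] * 3 + ["AAGT"]

kept_low = barcodes_above_threshold(observed, min_reads=1)
kept_high = barcodes_above_threshold(observed, min_reads=3)
```

With the lower threshold every observed barcode, including the one-off `AAGT`, would get a file. With the higher threshold only the two frequent barcodes remain.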
I'll use that method for now. I was wondering whether there is a better way to handle this, or whether you'd be interested in handling it differently, since raising the threshold seems like a workaround rather than a fix.
I'm running into the same issue. I've made a fix in my fork of the repo. Do you have a contributor policy?
@JudoWill How did you fix the demultiplexing step?
I am getting a "too many open files" error during demultiplexing. It looks like the demultiplexer creates a file for every barcode it sees above some frequency, rather than using my list of barcodes to either assign reads to known barcodes or mark them as unidentifiable.
Is there some preprocessing I should have done on the raw FASTQ file? Alternatively, I have some code of my own that might help with this problem by calling barcodes based on expected sequences.
Mike
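For reference, calling barcodes against an expected list could look roughly like the sketch below. This is not Mike's code or the repository's implementation. It assumes reads arrive as `(barcode, record)` pairs and that a single mismatch is tolerated; `call_barcode`, `assign_reads`, and `max_mismatches` are hypothetical names. Because every read goes either to a known barcode or to one `undetermined` bin, the number of outputs is capped at the number of expected barcodes plus one.

```python
def hamming(a, b):
    """Count mismatched positions; unequal lengths count as fully different."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def call_barcode(observed, expected, max_mismatches=1):
    """Map an observed barcode to an expected one, or to 'undetermined'."""
    if observed in expected:
        return observed
    hits = [bc for bc in expected if hamming(observed, bc) <= max_mismatches]
    # Accept only an unambiguous match.
    return hits[0] if len(hits) == 1 else "undetermined"


def assign_reads(reads, expected, max_mismatches=1):
    """Group records by called barcode."""
    bins = {}
    for barcode, record in reads:
        key = call_barcode(barcode, expected, max_mismatches)
        bins.setdefault(key, []).append(record)
    return bins


expected = {"AAGG", "CCTT"}
reads = [("AAGG", "r1"), ("AAGT", "r2"), ("GGGG", "r3"), ("CCTT", "r4")]
bins = assign_reads(reads, expected)
```

Here `AAGT` is one mismatch from `AAGG` and is assigned to it, while `GGGG` matches neither expected barcode closely enough and lands in `undetermined`.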