Closed ColeWunderlich closed 2 years ago
Hi @ColeWunderlich ,
Thanks for using minnow, and apologies for the late reply. I am going through the issues and trying to address them as best I can.
I just checked the code to see whether we write the original UMI, and we don't report it. We only keep the original cell barcode name (https://github.com/COMBINE-lab/minnow/blob/minnow-velocity/src/MinnowSimulate.cpp#L843). Of course the UMI can be reported as well, though it could result in a larger file size.
I can certainly push a change to add that. Are you generating the data from Splatter and using custom cell barcode names? If so, I will change it accordingly.
Thanks Hirak
Hey @hiraksarkar thanks for getting back to me.
It would be great to have the true UMI in the read name, and I appreciate you being willing to add it as a feature. I'm guessing this would be tagSeq in the code you linked to? Also, I have been assuming that the modifiedCellName being output here has no PCR error in its sequence. Is that correct?
Yes, so far I have been using splatter mode. I want to use the output from an alevin run, but so far I get zero genes whenever I run in Alevin mode. My workaround has been to transpose the Alevin matrix, convert it to a CSV (which takes hours for pandas to write to disk), and then feed that into minnow in splatter mode with the --custom flag. (I also switch the row and column files so that they match the transposed matrix.)
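The transposition step in this workaround could be sketched without pandas. The following is only a minimal illustration using the Python standard library. It assumes the count matrix fits in memory as a list of rows and is oriented cells-by-genes; the function names `transpose` and `write_transposed_csv` are hypothetical and not part of minnow or alevin.

```python
import csv


def transpose(matrix):
    """Return the transpose of a rectangular list-of-rows matrix."""
    return [list(column) for column in zip(*matrix)]


def write_transposed_csv(matrix, out_path):
    """Write the transpose of `matrix` to `out_path` as a header-less CSV."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # zip(*matrix) yields one tuple per original column,
        # so each original column becomes a row in the output file.
        writer.writerows(zip(*matrix))


# Toy matrix (assumed orientation): 2 cells (rows) x 3 genes (columns).
cells_by_genes = [
    [5, 0, 2],
    [1, 3, 0],
]

# After transposing: 3 genes (rows) x 2 cells (columns).
genes_by_cells = transpose(cells_by_genes)
```

As noted above, the row and column name files have to be swapped as well so that they match the transposed matrix.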
Hi @ColeWunderlich ,
I have mostly developed the splatter-based mode because that has become the preferred choice for most users. What you did by transposing and using the --custom flag should be right.
modifiedCellName should be the unchanged version of the cell barcode. I just added the UMI to it in my latest commit. Let me know if that works.
Thanks
Hey @hiraksarkar,
Sorry for taking so long to get back to you.
That makes sense. From reading some of the old issues I got the impression that splatter-mode was the only mode currently supported.
I applied the new fix and performed a test run, and everything looks like it is working as expected. Thanks for the update!
I also have a few questions about how to run minnow properly. Would this thread be a good place to discuss them?
Hi @ColeWunderlich ,
Feel free to ask here or drop me an email at hiraksarkar.cs@gmail.com so we can chat more. If you want to discuss the modes or modifications, I won't close the issue. Please let me know.
Thanks
Closing this due to inactivity.
Hello,
Is there a way to get the true UMI sequence for each read?
So far after running minnow my read names look like this (example from R1 file):
The cell barcode is clearly in the read name, but there is no information about the true UMI sequence for the read. I also cannot find this information anywhere else in the minnow output.
This information seems critical since the observed UMI sequence may contain PCR errors.
Is this a bug or does minnow not normally give you information about the true UMI sequence for each read?
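To illustrate why the true UMI matters here, a small sketch of how a true and an observed UMI could be compared once both are available. The sequences below are made up for illustration and are not minnow output, and `hamming_distance` is a hypothetical helper, not a minnow function.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("UMIs must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Made-up example: the observed UMI carries a single substitution,
# standing in for a PCR error introduced during simulation.
true_umi = "ACGTACGTAC"
observed_umi = "ACGTACGAAC"

mismatches = hamming_distance(true_umi, observed_umi)
```

Without the true UMI in the output, a comparison like this is not possible, so there is no way to tell an error-containing UMI from a genuinely distinct molecule.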