Open marencc opened 4 years ago
Hey,
First of all thanks for developing Kallisto-Bustools for scRNA-seq, it is super useful!!
I just wanted to follow up on @marencc's question and ask whether kallisto | bustools is suited for 10x 5' chemistry.
Best
Kike
Yes, it is suitable for 5' chemistry and should be no different than 10X v2 chemistry (i.e. your first fastq file contains your 16bp barcode and 10bp UMI and your second fastq file contains your biological read that you wish to map to the transcriptome).
Happy to answer any questions if you run into any problems with the workflow.
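For readers following along, the layout described above (barcode and UMI in the first fastq file, biological read in the second) can also be written as a custom technology string of the form barcode:UMI:cDNA, where each part is a file,start,stop triplet and a stop of 0 means "to the end of the read". Here is a minimal sketch that builds such a string; `layout_string` and its parameters are hypothetical names used only for illustration and are not part of kallisto.

```python
def layout_string(bc_len: int, umi_len: int, cdna_in_first_file: bool = False) -> str:
    """Build a barcode:UMI:cDNA string of file,start,stop triplets.

    Hypothetical helper for illustration only. Files and positions are
    0-based; a stop of 0 stands for "until the end of the read".
    """
    barcode = f"0,0,{bc_len}"                    # barcode at the start of file 0
    umi = f"0,{bc_len},{bc_len + umi_len}"       # UMI directly after the barcode
    cdna_parts = []
    if cdna_in_first_file:
        # The rest of the first read is also cDNA (paired-end layout).
        cdna_parts.append(f"0,{bc_len + umi_len},0")
    cdna_parts.append("1,0,0")                   # the whole second read is cDNA
    return ":".join([barcode, umi, ",".join(cdna_parts)])


print(layout_string(16, 10))  # 0,0,16:0,16,26:1,0,0
```

With a 16 bp barcode and a 10 bp UMI this reproduces the 10x v2 style layout described in the reply above.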
Hey,
Thanks a lot for your help and your fast answer! My question is more about whether kallisto bus is able to work with 10x 5' data in paired-end mode. For example, a lot of 10x 5' GEX data is paired-end, with 150 bp in R1 (16 bp CB + 10 bp UMI + cDNA) and 150 bp in R2 (all cDNA). This would be the format 'SC5P-PE' in 10x (see: https://kb.10xgenomics.com/hc/en-us/articles/115003764132-How-does-cellranger-count-auto-detect-chemistry-)
If that helps, I have seen that STARsolo recently added support for this too: https://github.com/alexdobin/STAR/blob/master/docs/STARsolo.md#barcode-and-cdna-on-the-same-mate https://github.com/alexdobin/STAR/issues/768
Thanks for any help you might provide about this!!
Best,
Kike
Thanks for the follow-up clarification. I see you've tried a few things here: https://github.com/pachterlab/kallisto/issues/226#issuecomment-931297217 (issue #226)
You can indeed specify multiple sequences in the technology string; however, you must separate them with a comma rather than a colon, e.g. 0,0,16:0,16,26:0,26,0,1,0,0
Let me know if this works for you and if you have further questions
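To make the comma-versus-colon distinction concrete, here is a minimal sketch of how such a string could be split: colons separate the barcode, UMI, and cDNA parts, and commas separate the numbers, so a part holding several sequences is simply a longer run of file,start,stop triplets. `Segment` and `parse_technology` are hypothetical names for illustration and do not reflect kallisto's internal code.

```python
from typing import Dict, List, NamedTuple


class Segment(NamedTuple):
    file: int   # 0-based index of the fastq file
    start: int  # 0-based start position
    stop: int   # end position; 0 stands for "end of read"


def parse_technology(tech: str) -> Dict[str, List[Segment]]:
    """Split a barcode:UMI:cDNA string into lists of Segment triplets."""
    parts = tech.split(":")
    if len(parts) != 3:
        raise ValueError("expected exactly three colon-separated parts")
    parsed = {}
    for name, part in zip(("barcode", "umi", "cdna"), parts):
        numbers = [int(x) for x in part.split(",")]
        if not numbers or len(numbers) % 3 != 0:
            raise ValueError(f"{name} part must hold file,start,stop triplets")
        # Group the flat list of numbers into consecutive triplets.
        parsed[name] = [Segment(*numbers[i:i + 3]) for i in range(0, len(numbers), 3)]
    return parsed


layout = parse_technology("0,0,16:0,16,26:0,26,0,1,0,0")
for name, segments in layout.items():
    print(name, segments)
```

Here the cDNA part yields two segments: the remainder of the first file from position 26 onward, and the whole second file.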
Hey,
Thanks a lot for your fast reply!! I can confirm it works now, thanks a lot!!!
Best,
Kike
Sorry to come back on this, but I am also trying to count SC5P-PE data with kallisto|bustools.
The read scheme is 150 x 150:
R1: 16 x BC, 10 x UMI, 124 x cDNA
R2: 150 x cDNA
Given that it seems to work for me as well, could you please explain the rationale behind 0,0,16:0,16,26:0,26,0,1,0,0, or direct me to a manual section where this nomenclature is explained?
Many thanks, Giovanni
I think the string tells kallisto where your cell barcode, UMI, and cDNA sequence are located. Each group of three numbers is file,start,stop. So the first part of your string (0,0,16) indicates that the cell barcode is in the first fastq file (R1, which is file 0), starts at the first base (position 0, since counting is 0-based), and runs up to position 16. The UMI (0,16,26) and the cDNA (0,26,0 and 1,0,0) follow the same pattern, where a stop of 0 means the end of the read. You can see the information here: https://pachterlab.github.io/kallisto/manual
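As a rough illustration of that reading, the sketch below applies the triplets to a made-up 150 x 150 read pair with the layout from the question above (16 bp barcode, 10 bp UMI, 124 bp cDNA in R1; 150 bp cDNA in R2). The sequences are placeholders and `cut` is a hypothetical helper; this only mimics the slicing and is not how kallisto is implemented.

```python
def cut(read: str, start: int, stop: int) -> str:
    """Return read[start:stop], treating stop == 0 as 'to the end of the read'."""
    return read[start:] if stop == 0 else read[start:stop]


# Placeholder reads: each region uses a different letter so it is easy to spot.
r1 = "A" * 16 + "C" * 10 + "G" * 124   # barcode + UMI + cDNA
r2 = "T" * 150                         # cDNA only
reads = [r1, r2]                       # file 0 is R1, file 1 is R2

# 0,0,16 : 0,16,26 : 0,26,0,1,0,0
barcode = cut(reads[0], 0, 16)
umi = cut(reads[0], 16, 26)
cdna = [cut(reads[0], 26, 0), cut(reads[1], 0, 0)]

print(len(barcode), len(umi), [len(s) for s in cdna])  # 16 10 [124, 150]
```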
Hi!
Many thanks for all your development efforts!
I have six data sets that I would be interested in benchmarking with kallisto | bustools; however, I just found out that kallisto bus only supports 3' chemistry. Will you be supporting 5' PE (SC5P-PE) anytime soon? Is there a workaround for this situation?
List of supported single-cell technologies

short name    description
10xv1         10x version 1 chemistry
10xv2         10x version 2 chemistry
10xv3         10x version 3 chemistry
CELSeq        CEL-Seq
CELSeq2       CEL-Seq version 2
DropSeq       DropSeq
inDrops       inDrops
SCRBSeq       SCRB-Seq
SureCell      SureCell for ddSEQ