alexdobin / STAR

RNA-seq aligner
MIT License

Add RG ID to cell barcodes #2244

Open drneavin opened 2 days ago

drneavin commented 2 days ago

Hi Alex,

I've been using STARsolo to align data from a variety of single-cell technologies and have found its broad functionality very helpful. However, I've run into a scenario that I can't figure out how to resolve. For CEL-seq2, up to 192 barcodes are typically used to identify individual cells, but a sample might have multiple plates collected, each in a different fastq file and each using the same 192 barcodes.

Ideally, I'd like to align all the cells from the same donor in a single STARsolo run, so that all cells for that sample end up in a single bam file and a single cell x umi matrix. This would require appending some kind of plate identifier to each barcode.
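
To make the request concrete, this is roughly the labelling I have in mind. It is only a Python sketch of the desired output, not anything STARsolo does as far as I can tell; the plate names, the `_` separator, the `tag_barcodes` helper, and the toy barcode sequences are all made up for illustration.

```python
from itertools import product


def tag_barcodes(plate_ids, barcodes, sep="_"):
    """Return one plate-qualified label per (plate, barcode) pair.

    The same barcode on two different plates gets two distinct labels,
    so cells from different plates no longer collapse together.
    """
    return [f"{plate}{sep}{bc}" for plate, bc in product(plate_ids, barcodes)]


# Hypothetical plate names and a toy 3-barcode list
# (a real CEL-seq2 list would have up to 192 entries).
plates = ["plate1", "plate2"]
barcodes = ["AGTGTC", "ACCATG", "GAGTGA"]

labels = tag_barcodes(plates, barcodes)
for label in labels:
    print(label)
```

With two plates and 192 barcodes, this would give 384 distinct cell labels in the matrix instead of 192.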

I tried --outSAMattrRGline, which adds a different RG identifier to the bam for each fastq file pair, but the RG IDs don't seem to be used when generating the cell x umi matrix, as I still end up with only 192 barcodes. Is there a way to do this?

Thanks for your help! -Drew