Closed bapoorva closed 3 years ago
Hi Apoorva,
You can calculate the number of UMIs per cell for the filtered data by summing the 3rd column of the filtered/matrix.mtx file for each cell barcode index in column 2 (skipping the 3 header lines). You can then plot these values on the same plot as UMIperCellSorted.txt, using different colors; this should reproduce the cell/background plot.
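A minimal Python sketch of that summation, in case it helps. The function name `umi_per_cell` and the `n_header` parameter are made up for illustration, and it assumes whitespace-separated integer counts, as in a "coordinate integer general" matrix.mtx:

```python
from collections import defaultdict


def umi_per_cell(mtx_lines, n_header=3):
    """Sum column 3 (counts) for each cell barcode index in column 2.

    mtx_lines: any iterable of lines, e.g. an open matrix.mtx file.
    n_header:  number of leading lines to skip (assumed to be 3 here).
    """
    totals = defaultdict(int)
    for i, line in enumerate(mtx_lines):
        if i < n_header:
            continue  # skip the header lines
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        cell_idx = int(fields[1])        # column 2: 1-based barcode index
        totals[cell_idx] += int(fields[2])  # column 3: count
    return dict(totals)


# Hypothetical usage:
# with open("filtered/matrix.mtx") as fh:
#     totals = umi_per_cell(fh)
```

The result maps each 1-based barcode index to its total UMI count; the index refers to the corresponding line of barcodes.tsv.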
I also added an awk script: https://github.com/alexdobin/STAR/blob/master/extras/scripts/calcUMIperCell.awk. Usage:
awk -f calcUMIperCell.awk raw/matrix.mtx raw/barcodes.tsv filtered/barcodes.tsv | sort -k1,1rn > UMIperCell.txt
It outputs two columns:
column1 = total UMIs per cell
column2 = 1 for cell that passed filtering, 0 otherwise
You can also run it with just the filtered matrix. Usage:
awk -f calcUMIperCell.awk filtered/matrix.mtx | sort -k1,1rn > UMIperCell.txt
and it will output the counts just for the filtered cells (as described in the beginning of the post).
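For anyone who prefers Python, here is a rough sketch that mirrors the two-column output described above (total UMIs, then 1/0 for passed/failed filtering). It is not a port of calcUMIperCell.awk; the function name `label_cells` is hypothetical, and it assumes integer counts, 3 header lines, and that column 2 of the raw matrix is a 1-based index into raw/barcodes.tsv. Barcodes with no entries in the matrix are simply omitted here.

```python
from collections import defaultdict


def label_cells(raw_mtx_lines, raw_barcodes, filtered_barcodes, n_header=3):
    """Return [(total_umis, passed_flag), ...] sorted by total, descending."""
    passed = {b.strip() for b in filtered_barcodes if b.strip()}
    barcodes = [b.strip() for b in raw_barcodes]

    # Sum counts (column 3) per barcode index (column 2).
    totals = defaultdict(int)
    for i, line in enumerate(raw_mtx_lines):
        if i < n_header:
            continue  # skip header lines
        fields = line.split()
        if len(fields) < 3:
            continue
        totals[int(fields[1])] += int(fields[2])

    rows = []
    for idx, total in totals.items():
        barcode = barcodes[idx - 1]  # matrix indexes are 1-based
        rows.append((total, 1 if barcode in passed else 0))
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows


# Hypothetical usage:
# with open("raw/matrix.mtx") as m, open("raw/barcodes.tsv") as rb, \
#         open("filtered/barcodes.tsv") as fb:
#     for total, flag in label_cells(m, rb, fb):
#         print(total, flag)
```

Plotting the rows in order, with one color for flag 1 and another for flag 0, gives the cell/background view.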
Cheers Alex
Hi Alex,
Apologies for reviving an old thread, but I'm having trouble getting this script to return anything! I see the script is now named soloUMIperCell.awk in https://github.com/alexdobin/STAR/blob/master/extras/scripts/ and I have used it on both my raw and filtered snRNAseq data, only to get empty files.
Is the script still compatible with the newest version of STARsolo?
For reference, my matrices look like so:
johnbriseno@Johns-MacBook-Pro filtered % head matrix.mtx
%%MatrixMarket matrix coordinate integer general
%
39008 7369 4809708
36 1 1
38 1 1
41 1 1
75 1 2
94 1 1
132 1 1
148 1 1
And my barcodes look like this:
johnbriseno@Johns-MacBook-Pro filtered % head barcodes.tsv
AAACCCAAGGTTAAAC
AAACCCAAGTAGTGCG
AAACCCAAGTATGAGT
AAACCCACAAGGGCAT
AAACCCACAGTCCGTG
AAACCCAGTCATCCCT
AAACCCAGTTGCGGCT
AAACCCATCTGTCTCG
AAACGAAAGCAGGCTA
AAACGAAAGCGGTAGT
Following a scRNAseq workshop I took, I was able to generate a UMI vs. cell plot with cell density on the y axis (log scale) and UMIs on the x axis, but I'm specifically looking to generate a knee plot, which we didn't cover in the tutorial. Any help would be much appreciated, and thanks once again for the fantastic tools! -JB
Hi @Lil-Psilocybe
The script should work. You can also write your own: the script simply sums the counts in each cell from the matrix.mtx, and then adds the barcode sequences from barcodes.tsv.
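As a rough illustration of the "write your own" route, here is a Python sketch that sums the counts per cell, attaches the barcode sequences, and ranks the cells for a knee plot. The function name `knee_points` is hypothetical, and it assumes 3 header lines, integer counts, and that column 2 of matrix.mtx is a 1-based index into barcodes.tsv.

```python
def knee_points(mtx_lines, barcode_lines, n_header=3):
    """Return [(rank, barcode, total_umis), ...] with rank 1 = most UMIs."""
    barcodes = [b.strip() for b in barcode_lines]

    totals = {}
    for i, line in enumerate(mtx_lines):
        if i < n_header:
            continue  # skip header lines
        fields = line.split()
        if len(fields) < 3:
            continue
        barcode = barcodes[int(fields[1]) - 1]  # 1-based column index
        totals[barcode] = totals.get(barcode, 0) + int(fields[2])

    # Sort by total UMIs, highest first, then number the cells.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(rank, bc, total) for rank, (bc, total) in enumerate(ranked, start=1)]


# Hypothetical usage:
# with open("filtered/matrix.mtx") as m, open("filtered/barcodes.tsv") as b:
#     points = knee_points(m, b)
```

A knee plot is then the rank on the x axis against the total UMIs on the y axis, both on log scales.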
Hello!
Thanks for your reply, I'll give it a shot!
-JB
Hi,
The new version of STARsolo perfectly mimics the Cell Ranger output, but I was wondering if there is a way to get QC stats like the ones we get from Cell Ranger.
This is partly addressed in #660. I have the summary.csv and UMIperCellSorted.txt, but I'm interested in the barcode vs. UMI plot with different colors for cells and background, as in the web_summary.html file from Cell Ranger. UMIperCellSorted.txt only has the UMI counts, so there is no way to tell whether a barcode is a cell or background. Is there any way to work around this?
Thanks Apoorva