Closed marlmatos closed 3 months ago
just looking at line 60: U[cb][ge][UB]++;
, I assume that you need gawk
for support for nested arrays.
samtools view scGEX-A29.sortedByCoord.filtered.out.bam | gawk -v fileWL=scGEX-A29.Solo.out/Gene/raw/barcodes.tsv \
-v fileGenes=scGEX-A29.Solo.out/Gene/raw/features.tsv \
-f soloCountMatrixFromBAM.awk | sort -k2,2n -k1,1n > scGEX-A29.matrix.mtx
This script is just an example of how to process the data. You may want to re-write it for your own needs, maybe in python.
Thanks, I will try that
Hi Alex, Thanks for always being so helpful. I am trying to recreate the counts matrix of STARsolo with the soloCountMatrixFromBAM.awk that you have posted here. For context, I am filtering the AlignedSorted.out.bam for autosomes and reads that passed WASP and those that do not have the WASP SAMtAg. But i keep getting syntax errors with the awk script, and I cannot find where the error is coming from. Here is how I am using the script:
And this is the error:
awk: soloCountMatrixFromBAM.awk:60: Unexpected token
I also redownloaded the script to make sure it was not a typo that I had introduced. But the error persists.For reference, this is how I am running STAR
and another question: Is this the right way that you would go about this? I thought that using the raw/barcodes.tsv and raw/features.tsv would give me a raw matrix? But how do I get to the filtered /matrix.mtx? Shouuld I use the filtered/barcodes.tsv and filtered/features.tsv instead?
Since the soloCountMatrixFromBAM.awk was not working for me, I opted then to use the STARsolo mapped BAM as input again for STAR, for remapping. I am still dealing with errors on that part, but I figured if you can help me with this, I would not have to do that... Thanks again!
Marliette