Hello,
I am trying to subset my BAM file by barcode. I have around 20k cells, and each cell's barcode is stored in its own file in a directory. I have been using the loop below to run subset-bam once per barcode file. It takes around 35 minutes per barcode. Is there a way to make subset-bam run faster, perhaps through parallelization?
FILES="my directory containing every barcode"
for file in $FILES
do
    filename=$(basename "$file")
    filename_no_extension="${filename%%.*}"
    subset-bam_linux --bam marked.duplicates.bam --bam-tag CB --cell-barcodes barcodes/$filename --out-bam barcode_bams/$filename_no_extension.bam
done
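For reference, here is a rough sketch of what a parallel version of this loop might look like, using xargs -P to keep several subset-bam processes running at once. It assumes the barcode files sit directly in barcodes/, that xargs supports -0 and -P (GNU and BSD versions do), and that the machine has enough CPU and disk throughput for several jobs. JOBS, SUBSET_BAM and run_all are illustrative names, not part of subset-bam.

```shell
# Sketch only: run several subset-bam jobs concurrently instead of one at a time.
# JOBS, SUBSET_BAM and run_all are hypothetical names introduced for illustration.

JOBS="${JOBS:-4}"                              # concurrent subset-bam processes
SUBSET_BAM="${SUBSET_BAM:-subset-bam_linux}"   # binary name as used in the loop above
export SUBSET_BAM                              # make it visible to the child shells

run_all() {
    mkdir -p barcode_bams
    # One child shell per barcode file, at most $JOBS running at the same time.
    find barcodes -maxdepth 1 -type f -print0 |
        xargs -0 -n 1 -P "$JOBS" sh -c '
            filename=$(basename "$1")
            filename_no_extension="${filename%%.*}"
            "$SUBSET_BAM" --bam marked.duplicates.bam --bam-tag CB \
                --cell-barcodes "barcodes/$filename" \
                --out-bam "barcode_bams/$filename_no_extension.bam"
        ' _
}
```

Calling run_all from the directory that holds marked.duplicates.bam and barcodes/ would start the jobs, for example with JOBS=8 set beforehand. Every job still reads the same BAM file, so the speedup may level off once the disk is saturated. It may also be worth checking subset-bam --help for a --cores option, which, if present in your build, lets a single run use more than one thread.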