epi2me-labs / wf-amplicon


aligned.sorted.bam file question #12

Closed arugula2388 closed 4 months ago

arugula2388 commented 5 months ago


Can you clarify what the sorted.aligned.bam file in the pipeline output contains?

I am looking for an output BAM file of the reads that aligned to the amplicon of interest and those that didn't; I can get the total numbers from the indexed sorted.aligned.bam.

Second question: what is the best way to extract 10x Genomics barcodes from these?

julibeg commented 5 months ago

Hi @arugula2388,

I am looking for an output BAM file of the reads that aligned to the amplicon of interest and those that didn't; I can get the total numbers from the indexed sorted.aligned.bam.

sorted.aligned.bam contains the alignments against the target amplicons. What do you mean by "and those that didn't"? One thing to keep in mind: if the sequence IDs of your amplicons in the reference FASTA file contained special characters, these will have been replaced with underscores.
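For readers who want to separate aligned from unaligned reads themselves, here is a minimal sketch. It assumes the BAM has been converted to SAM text (for example with `samtools view -h`) and that unmapped reads are actually kept in the file, which this thread does not confirm. It relies only on the SAM FLAG bits: 0x4 marks an unmapped read, 0x100 a secondary alignment, and 0x800 a supplementary alignment. The function name `split_by_alignment` and the example records are hypothetical.

```python
from collections import Counter

UNMAPPED = 0x4          # SAM FLAG bit: read is unmapped
NOT_PRIMARY = 0x900     # secondary (0x100) or supplementary (0x800) alignment


def split_by_alignment(sam_lines):
    """Count primary alignments per reference and collect unmapped read names.

    Hypothetical helper; expects SAM-formatted text lines.
    """
    per_reference = Counter()
    unmapped = []
    for line in sam_lines:
        # Skip blank lines and header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & UNMAPPED:
            unmapped.append(name)
        elif not flag & NOT_PRIMARY:
            # Count each read once, via its primary alignment only.
            per_reference[rname] += 1
    return per_reference, unmapped


# Made-up records for illustration; "amplicon_1" is an invented reference name.
example = [
    "@SQ\tSN:amplicon_1\tLN:500",
    "read1\t0\tamplicon_1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\tamplicon_1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "read1\t2048\tamplicon_1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
per_reference, unmapped = split_by_alignment(example)
print(dict(per_reference), unmapped)
```

The same split can be done directly on the BAM with `samtools view -F 4` (mapped reads) and `samtools view -f 4` (unmapped reads). The reference names will be the sanitized sequence IDs, with underscores in place of any special characters.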

Second question: what is the best way to extract 10x Genomics barcodes from these?

Could you elaborate on this please? The workflow is not intended to be run on single-cell data.

arugula2388 commented 5 months ago

We have performed targeted enrichment for a gene of interest and sequenced it with ONT long reads; separately, we have scGEX data. The expected read structure is: P5, i5, R1, 10x cell barcode, UMI, TSO, gene of interest, gene-of-interest primer region (used).

nrhorner commented 5 months ago

@arugula2388

Take a look at https://github.com/epi2me-labs/wf-single-cell for extracting 10x barcodes.