mflamand / Bullseye

Bullseye analysis pipeline for DART-seq analysis
MIT License

empty output from parseBAM.pl #8

Closed chilampoon closed 1 year ago

chilampoon commented 1 year ago

Hi there, I was trying to run your parseBAM.pl script on the cellranger bam of 10x single-cell short-read sequencing but the matrix outputs are all empty.. does your script need something unique from STAR's output or something else? Also if you could share more about how you process the 10x data it'd be appreciated, thank you!

mflamand commented 1 year ago

Hi, I am not the one who processed that 10x data, so I am not 100% sure about the processing, but I can refer you to https://github.com/tegowski/scDARTHEKcells for the processing info and to https://pubmed.ncbi.nlm.nih.gov/36042888/ for the detailed protocol.

As far as the script requirements for 10x data, the bam file should include the "CB:Z:" tag, which is the Chromium cellular barcode sequence (error corrected). I would think this is included with the cell ranger bam. However, I think the bam file used for 10x data was generated with STAR directly and not with cell ranger (see GitHub link above).

If these don't answer your questions, please let me know, I can dig further to see why it's not working. If you would want it to work on the cell ranger bam file and it does not, I can try to modify the script so it does.

chilampoon commented 1 year ago

Hi @mflamand thanks for the reply! I was actually following that repo and running the perl scripts and got empty matrices. (also seems like only smart-seq2 processings there, no 10x). That's why I wanted to ask here to see if you've made changes in recent updates for the scripts.

One more dumb q - did you trim the 10x data as well? I think it's just TSO (?) because in the paper you said you directly use outputs from cellranger... and cellranger v4.0+ clips TSO and polyA/T. Maybe @tegowski knows it better? Thank you!

tegowski commented 1 year ago

Hello,

I can also help try and figure out why you got empty matrices from the 10x data, but not smart-seq. I could help if you send me your script used to call parseBAM.pl and maybe a screenshot of

But as far as the other question, you are correct. I used cellranger 3.1, which trims TSO and polyA; after the cellranger alignment with STAR, I did no further trimming before using the BAM files for Bullseye.

Let me know if you have any other questions! Matt

chilampoon commented 1 year ago

Thanks @tegowski - I actually found quite a lot of TSO or partial TSO in the reads. I am going to trim TSO in the R2 fastq, align the reads with HISAT3N, and then try to run your perl scripts again to see what happens.

tegowski commented 1 year ago

Did you use Hisat to align previously? And if so, did you check that the barcodes are in the CB:Z: field of the BAM file? One reason it could be coming out blank is that the barcodes are not there or are in a different field, in which case Bullseye may have no output.

Matt

chilampoon commented 1 year ago

I tried cellranger bams and minimap2 bams and both gave empty matrix output. I think the barcode tags are in the cellranger bams. But I am aware of the need to add barcode & UMI tags back to the alignment outputs.

chilampoon commented 1 year ago

Oh actually the barcode tag name is CB not CB:Z in my cellranger bams, is that an issue?

tegowski commented 1 year ago

I think that is likely to be an issue if you are using the “10x” option for the barcode because it is specifically looking for the “CB:Z:” pattern. But there is an option to input your own pattern as REGEX instead of using the 10x or SMART options. You should be able to input “CB:” instead
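To make the pattern discussion concrete, here is a minimal Python sketch (not Bullseye's Perl code) of how a barcode regex behaves on one SAM text line. The function name `extract_barcode`, the abbreviated record, and the default pattern are assumptions made for illustration; only the `CB:Z:` prefix and the barcode value come from this thread.

```python
import re

# Abbreviated, hypothetical SAM record; the CB/UB values are copied from the thread.
sam_line = "\t".join([
    "read1", "16", "chr1", "3019471", "0", "4M", "*", "0", "0",
    "ACGT", "FFFF",
    "CB:Z:ACCTGAATCTCACCCA-1", "UB:Z:ATTCGCTTCCCG",
])

def extract_barcode(line, pattern=r"CB:Z:(\S+)"):
    """Return the first captured group of `pattern` in `line`, or None."""
    match = re.search(pattern, line)
    return match.group(1) if match else None

# The tag is present, so the barcode is recovered.
print(extract_barcode(sam_line))

# A tag that is absent from the record gives None: no cell can be assigned.
print(extract_barcode(sam_line, r"XX:Z:(\S+)"))

# In this sketch, a looser pattern without the type letter also captures "Z:".
print(extract_barcode(sam_line, r"CB:(\S+)"))
```

In this sketch a record without the expected tag yields no barcode at all, which is one way a per-cell matrix could end up empty. How parseBAM.pl applies a custom pattern internally may differ.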

Matt

mflamand commented 1 year ago

Hi,

As Matt said, you can indicate the actual SAM tag that corresponds to the cell ID instead of using the 10x option (which is hardcoded for CB:Z:) or the SMART option (which is hardcoded for RG:Z:).

I could modify the script so it also recognizes "CB:" if that is the new output of cellranger. If you would prefer this, could you please send 10 or so lines of the BAM file so I can test that it works properly?

Best,

chilampoon commented 1 year ago

Sure. I just noticed that it's not CB: but still CB:Z: in the cellranger bam. However, after pysam processing the tags look like ('CB', 'AGTGCCGCAAGACGGT-1'), which is why I said yesterday that the tag name seemed different. So maybe some other issue caused the empty matrices...
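The difference between the text form `CB:Z:...` and a parsed `('CB', value)` pair follows from the SAM optional-field layout `TAG:TYPE:VALUE`. The sketch below is a plain-Python illustration, not pysam or Bullseye code; `parse_tags` and the abbreviated record are hypothetical, while the tag values are taken from the alignments in this thread.

```python
# Abbreviated, hypothetical SAM record using tag values from the thread.
sam_line = "\t".join([
    "read1", "16", "chr1", "3019471", "0", "4M", "*", "0", "0",
    "ACGT", "FFFF",
    "NH:i:6", "RG:Z:control_2:0:1:H37KTDSX2:1",
    "CB:Z:ACCTGAATCTCACCCA-1", "UB:Z:ATTCGCTTCCCG",
])

def parse_tags(line):
    """Map each optional-field tag to a (type, value) pair."""
    fields = line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns are mandatory; optional fields start at column 12.
    for field in fields[11:]:
        # Split on the first two colons only, since values may contain ':'.
        tag, type_code, value = field.split(":", 2)
        tags[tag] = (type_code, value)
    return tags

tags = parse_tags(sam_line)
print(tags["CB"])  # type 'Z' (string) and the barcode
print(tags["RG"])  # the value keeps its internal colons
```

A parser that reports only the tag and the value drops the `Z`, which would explain why the tag looked like `CB` rather than `CB:Z:` after parsing even though the text record still carries the type letter.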

Attaching some of the alignments:

A00814:412:H37KTDSX2:1:2218:29026:29966 16  chr1    3019471 0   75M15S  *   0   0   CTGGGCTGGAATTTGTGTTCTCTTAGTGTCTGTATAACATCTGTCCAGGCTCTTCTGGCTTTCATAGTCTCTGCGTTGATACCACTGCTT  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:6  HI:i:5  AS:i:73 nM:i:0  RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:ACCTGAATCTCACCCA   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:ACCTGAATCTCACCCA-1 UR:Z:ATTCGCTTCCCG   UY:Z:FFFFFFFFFFFF   UB:Z:ATTCGCTTCCCG
A00814:412:H37KTDSX2:1:1346:11559:36886 0   chr1    3040263 255 4S86M   *   0   0   GTTGCAACATTAAGAATGGAAAGCCAACATTCACGTGGAAACTGAACAACACTCTTCTCAATGATACCTTGGTCAAGGAAGGAATAAAGA  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFF  NH:i:1  HI:i:1  AS:i:78 nM:i:3  RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:TCAGTTTTCCATGAGT   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:TCAGTTTTCCATGAGT-1 UR:Z:AGCCAATTCGAC   UY:Z:FFFF:FFFFFFF   UB:Z:AGCCAATTCGAC
A00814:412:H37KTDSX2:1:1371:32036:17644 16  chr1    3043192 3   4S86M   *   0   0   AGTTATGTTTACAGTATATCAGGGAGATCTACACAATGGAGTACTACTCAGCTATTAAAAAGAATGAATTTATGAAATTCCTAGCCAAAT  :,,,,F,,,,,,,F,,,:,,,,,,F,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFF  NH:i:2  HI:i:1  AS:i:70 nM:i:7  RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:CCGGGTATCGCTGATA   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:CCGGGTATCGCTGATA-1 UR:Z:TTTTGCATGGGT   UY:Z:FFFFFFFFFFFF   UB:Z:TTTTGCATGGGT
A00814:412:H37KTDSX2:1:1120:10312:8390  0   chr1    3050822 255 90M *   0   0   ATCAGATCCCATATAGATGATTGTGAGCCACCACATGGGTGCTGGGAATTGAACTCAGAATCTCTGAAAGAGCAGCCAGTGTTCTTAACC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:1  HI:i:1  AS:i:88 nM:i:0  RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:ATGGAGGGTGTGTGTG   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:ATGGAGGGTGTGTTTG-1 UR:Z:TGTGTGTGTGTG   UY:Z:FFFFFFFFFFFF   UB:Z:TGTGTGTGTGTG
A00814:412:H37KTDSX2:1:2248:7952:18490  16  chr1    3063421 255 87M3S   *   0   0   GATCTCATTATTGGTAGTTGTGAGCTACCATGTGGTTGCTTGGTTTTGAACTGAGGACATTTGGATGAGCAGTCGGATGCTCTTTCCGTG  FFF:FF,FFF,,FF:FFF,FFF,F,F:F,FFFFFFF:FFF,::,FFFFF,FF,F:FF,,,,FFFF,,FF,,:F,FF,:,FF,,:,F,,F,  NH:i:1  HI:i:1  AS:i:69 nM:i:8  RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:GGGCTACTCCGACATA   CY:Z:FFF::FFFFFFFFFFF   CB:Z:GGGCTACTCCGACATA-1 UR:Z:CAGCAGAAATGA   UY:Z:FFFFFFFF:FFF   UB:Z:CAGCAGAAATGA
A00814:412:H37KTDSX2:2:2638:22598:22467 16  chr1    3063470 255 60M30S  *   0   0   ACTCAGGACCTTCGGAAGAGCAGTCGGATGCTCTTACCCACTGAGCCATTTCACCAGCCCCCCATGTACTCTGCGTTGATACCACTGCTT  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:1  HI:i:1  AS:i:59 nM:i:0  ts:i:30 RG:Z:control_2:0:1:H37KTDSX2:2  RE:A:I  xf:i:0  CR:Z:TCGCTCAGTGGCTAGA   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:TCGCTCAGTGGCTAGA-1 UR:Z:GATTTTTGGTTT   UY:Z:FFFFFFF:F,FF   UB:Z:GATTTTTGGTTT
A00814:412:H37KTDSX2:2:2638:23375:24064 16  chr1    3063470 255 60M30S  *   0   0   ACTCAGGACCTTCGGAAGAGCAGTCGGATGCTCTTACCCACTGAGCCATTTCACCAGCCCCCCATGTACTCTGCGTTGATACCACTGCTT  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:1  HI:i:1  AS:i:59 nM:i:0  ts:i:30 RG:Z:control_2:0:1:H37KTDSX2:2  RE:A:I  xf:i:0  CR:Z:TCGCTCAGTGGCTAGA   CY:Z:FFFFFFFFFFFF:FFF   CB:Z:TCGCTCAGTGGCTAGA-1 UR:Z:GATTTTTGGTTT   UY:Z:FF,FFFF,FF:F   UB:Z:GATTTTTGGTTT
A00814:412:H37KTDSX2:1:1547:6008:14215  16  chr1    3069555 3   90M *   0   0   TAGTGTCTGTATAACATCTATCCAGGCTCTTCTGGCTTTCATAGTCTCTGGTGAAAAGTCTGGTGTAATTCTGATAGGCCTGCCTTTATA  FFF,FFFFFF:FFFFFFFFFFFFFFFFFF:FF:FFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:2  HI:i:1  AS:i:88 nM:i:0  RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:TACAACGCATGTTTGG   CY:Z:FFFFFFFFFFFF,FFF   CB:Z:TACAACGCATGTTTGG-1 UR:Z:TTCTTGTAAATG   UY:Z:FFFFFFFFFFFF   UB:Z:TTCTTGTAAATG
A00814:412:H37KTDSX2:1:1536:5810:23296  0   chr1    3085672 0   30S60M  *   0   0   AAGCAGTGGTATCAACGCAGAGTACATGGGCCCAACAATCAAAGAAAATGCAAAATGCAAAAAGATCCTAACTCAAAACATCCAGGAAAT  FFFFFFF:FFFFFFFFFFFFFFFFF,FFFFF,FFFF:F:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFF,FF  NH:i:5  HI:i:1  AS:i:59 nM:i:0  ts:i:30 RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:TGTAACGCAGTTCACA   CY:Z:FFFFFFFFFF,FF:FF   CB:Z:TGTAACGCAGTTCACA-1 UR:Z:TATGCGTCTCTT   UY:Z:FFF::,FF:F::   UB:Z:TATGCGTCTCTT
A00814:412:H37KTDSX2:2:2449:17671:35947 16  chr1    3136662 0   90M *   0   0   CACCCCCACTCCCCTGCCCACCCACTCCCCCTTTTTGGCCCTGGTGTTCCCCTGTACTGGGGCTTATAAAGTTTGCAAGTCCAATGGGCC  FFFF,FF:F:FFFFFF:FFFF:F,FFFFF::FF:FFF::FFF:FFF:F,,FFFFFF:FF:FF,,FFFF,F:FFFFFF,FFFFFF:FFF,,  NH:i:7  HI:i:1  AS:i:86 nM:i:1  RG:Z:control_2:0:1:H37KTDSX2:2  RE:A:I  xf:i:0  CR:Z:CAGGTATGTCAAGGCA   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:CAGGTATGTCAAGGCA-1 UR:Z:CAAGAGCTATTA   UY:Z:FFFFFFFFFFFF   UB:Z:CAAGAGCTATTA
A00814:412:H37KTDSX2:1:2410:19298:14559 0   chr1    3142729 0   30S60M  *   0   0   AAGCAGTGGTATCAACGCAGAGTACATGGGACAGCATCCTAAATAACAAAATCAGAAATGAAAAGGGAGACATAACAACAGATCCTGAAG  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:8  HI:i:1  AS:i:59 nM:i:0  ts:i:30 RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:ACGTAACCACCCTTGT   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:ACGTAACCACCCTTGT-1 UR:Z:TCTGGCTTCATC   UY:Z:FFFFFFFFFFFF   UB:Z:TCTGGCTTCATC
A00814:412:H37KTDSX2:2:2131:21151:7435  0   chr1    3142729 0   30S60M  *   0   0   AAGCAGTGGTATCAACGCAGAGTACATGGGACAGCATCCTAAATAACAAAATCAGAAATGAAAAGGGAGACATAACAACAGATCCTGAAG  FFFFFFFF,FFFFFFFF:FFFF:FFFF:F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFF:FFF  NH:i:8  HI:i:1  AS:i:59 nM:i:0  ts:i:30 RG:Z:control_2:0:1:H37KTDSX2:2  RE:A:I  xf:i:0  CR:Z:GTTACAGCAACCTAAC   CY:Z:F,FF,FFFFFFFFFFF   CB:Z:GTTACAGCAACCTAAC-1 UR:Z:CCTAGCCGACTC   UY:Z:FF:FF:FFF:FF   UB:Z:CCTAGCCGACTC
A00814:412:H37KTDSX2:1:2410:19533:14779 0   chr1    3142729 0   30S60M  *   0   0   AAGCAGTGGTATCAACGCAGAGTACATGGGACAGCATCCTAAATAACAAAATCAGAAATGAAAAGGGAGACATAACAACAGATCCTGAAG  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:8  HI:i:1  AS:i:59 nM:i:0  ts:i:30 RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:ACGTAACCACCCTTGT   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:ACGTAACCACCCTTGT-1 UR:Z:TCTGGCTTCATC   UY:Z:FFFFFFFFFFFF   UB:Z:TCTGGCTTCATC
A00814:412:H37KTDSX2:1:2608:6614:7968   0   chr1    3154927 0   90M *   0   0   CCAGCTCCTATTTTAGCCACAAATCGTGGTGTTACTAATGACATAATTCTTGCCTAGGTCTTGCTAAATCTGAGGTTGATAATTCTCCTT  FF:FFFFF,F,FFFFFFFFFFFFFFFFF:FF:FFFFFFFFFFFFFF:F:FFFFFFFF,FFFFFFFFFF,F:FFF,FFFFFFFFFFFFFFF  NH:i:5  HI:i:1  AS:i:88 nM:i:0  RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:CACGTTCAGTTTCAGC   CY:Z:FFFF:FFFFFFFFF,,   CB:Z:CACGTTCAGTTTCAGC-1 UR:Z:CTATGTACAACC   UY:Z:::FF,FFFFFFF   UB:Z:CTATGTACAACC
A00814:412:H37KTDSX2:1:1676:30653:8547  0   chr1    3156001 255 63M27S  *   0   0   TCGGGTCCCCTCCCCCTTCCTTCATAACTAGTGTCGCAACAATAAAATTTGAGCCTTGATCCGAAAAAAAAAAAAAAAAAAAAAAAAAAA  FFFFFFFFFFF::FFFFF:FFFFF,FFF,F:F::FFFFF:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:1  HI:i:1  AS:i:62 nM:i:0  pa:i:27 RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:CTATCTAAGCCTATCA   CY:Z:FFFFFFFFF::F,FFF   CB:Z:CTATCTAAGCCTATCA-1 UR:Z:CCTCACCGATCG   UY:Z:FFFFFFFFF,FF   UB:Z:CCTCACCGATCG
A00814:412:H37KTDSX2:1:2675:31593:35634 0   chr1    3156001 255 63M27S  *   0   0   TCGGGTCCCCTCCCCCTTCCTTCATAACTAGTGTCGCAACAATAAAATTTGAGCCTTGATCCGAAAAAAAAAAAAAAAAAAAAAAAAAAA  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:1  HI:i:1  AS:i:62 nM:i:0  pa:i:27 RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:CTATCTAAGCCTATCA   CY:Z:FFFFFFFFFF:F,FFF   CB:Z:CTATCTAAGCCTATCA-1 UR:Z:CCTCACCGATCG   UY:Z:FFFFFFFFFFFF   UB:Z:CCTCACCGATCG
A00814:412:H37KTDSX2:2:2625:17463:10786 0   chr1    3156408 1   21S69M  *   0   0   TATCAACGCAGAGTACATGGGATGGAAGAGAGAATCTCAGGTGCAGAAGATTCCATAGAGAACATCGGCACAACAATCAAAAAAAAAAAA  FFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF  NH:i:4  HI:i:1  AS:i:67 nM:i:0  ts:i:21 RG:Z:control_2:0:1:H37KTDSX2:2  RE:A:I  xf:i:0  CR:Z:GTTTACTGTGGCAACA   CY:Z:FFFFFFFFFFFFF,FF   CB:Z:GTTTACTGTGGCAACA-1 UR:Z:GCAACTATAAAT   UY:Z:FFFFFFFF:FFF   UB:Z:GCAACTATAAAT
A00814:412:H37KTDSX2:1:1666:16712:7764  16  chr1    3163691 1   42S46M2S    *   0   0   TTTTTTTTTTTTTTTTTGTGTTTTTTTTTATTTTATTTTTTTGGGTTTTTGTTTTGTTTTATTGGTTTTTTTATTTTTTTTTTTTTTTCG  F,,FF:FF:FF:::F,:,F,::,:FFFF:,FFFF,:FFF,,FF,:,,F,:,:,,,F,F,,,,,,,FF,:F:,,FF:,F:,,::F,F,F,F  NH:i:3  HI:i:3  AS:i:33 nM:i:6  pa:i:42 RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:ATTCCCGCACAGAGCA   CY:Z:FFFFFF:FFFF:FFFF   CB:Z:ATTCCCGCACAGAGCA-1 UR:Z:AGATCATGTATA   UY:Z:,FF:F:FFFFFF   UB:Z:AGATCATGTATA
A00814:412:H37KTDSX2:2:2317:29532:19476 0   chr1    3168529 255 31S47M12S   *   0   0   AAGCAGTGGTATCAACGCAGAGTACATGGGAGCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGCCCGGGCCCGAGGG  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFF:FFFF,:,,,,FF,,,,F,,,:FF  NH:i:1  HI:i:1  AS:i:46 nM:i:0  ts:i:30 RG:Z:control_2:0:1:H37KTDSX2:2  RE:A:I  xf:i:0  CR:Z:TTGATGGTCCCTAGGG   CY:Z:FFFFFFFFFFFFFFFF   CB:Z:TTGATGGTCCCTAGGG-1 UR:Z:TAACGTAACTTG   UY:Z:FFF:FFFFFFFF   UB:Z:TAACGTAACTTG
A00814:412:H37KTDSX2:1:1144:12689:27946 0   chr1    3168533 1   30S46M14S   *   0   0   AAGCAGTGGTATCAACGCAGAGTACATGGGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGCGTTGTTAAAATAAAAATA  FFFFF:FF:FFFFFFFFFF,FFFFFF:FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,,F:,FF:,,,,:F,,FF,F:,FF,,F,F  NH:i:4  HI:i:1  AS:i:43 nM:i:1  ts:i:30 RG:Z:control_2:0:1:H37KTDSX2:1  RE:A:I  xf:i:0  CR:Z:ATGAGTCAGATACCAA   CY:Z:FFFFFF,FFFFFFF:F   CB:Z:ATGAGTCAGATACCAA-1 UR:Z:CGCCTTTCAGAT   UY:Z:FFFF::F::,F,   UB:Z:CGCCTTTCAGAT

mflamand commented 1 year ago

Hi,

thanks for sending this out.

Indeed the CB:Z: should be there to indicate the CB field is a string. Without the Z, the bam file would not be read.

I think you are right: I inadvertently broke something when I recently added the optional strand detection. I have pushed a new update which should fix the problem (at least it now works on my side).

Please let me know if you still have issues.

Best,