Linked-read barcode information is usually present in the header in various formats.
Three cases I have seen:
@V10002828L1C001R013000000#543_288_92/1 1
with the barcode placed between # and /
@V10002828L1C001R013000000_1 BX:Z:543_288_92
with the barcode placed after the BX:Z: tag
@A00428:24:H5327DSXX:2:1101:1253:1000 1:N:0:CTGTAACT
with the barcode placed after 1:N:0: (this may be more complex, e.g. the 0 may be any number; this needs a closer look at the example file)
The current btllib code supports the first and second formats, but not the last one (seen in 10x data from T2T). It would be nice to have this supported.
An example dataset with that last format:
$ pigz -dc /projects/btl/datasets/hsapiens/CHM13/T2T/10x/CHM13_interleved_all.fq.gz | head -n1
@A00428:24:H5327DSXX:2:1101:1253:1000 1:N:0:CTGTAACT
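To make the request concrete, here is a minimal Python sketch of how the three header styles above could be told apart. It is not btllib's code: the pattern names and `extract_barcode` are hypothetical. It assumes, as described above, that the field after 1:N:0: carries the barcode. Because it is still unclear whether the 0 is always 0, the pattern accepts 1 or 2 in the first field, Y or N in the second, and any integer in the third; that is an assumption until the example file has been checked.

```python
import re
from typing import Optional

# Hypothetical patterns, one per header style; not taken from btllib.
# Style 1: barcode sits between '#' and '/' in the read name.
HASH_SLASH = re.compile(r"#([^/\s]+)/")
# Style 2: barcode follows a BX:Z: tag in the comment field.
BX_TAG = re.compile(r"(?:^|\s)BX:Z:(\S+)")
# Style 3: comment of the form <read>:<filtered>:<number>:<barcode>.
# Assumption: read is 1 or 2, filtered is Y or N, number is any integer.
COLON_COMMENT = re.compile(r"\s[12]:[YN]:\d+:(\S+)")


def extract_barcode(header: str) -> Optional[str]:
    """Return the barcode from a FASTQ header line, or None if no style matches."""
    # Try the explicit BX:Z: tag first, then the two positional styles.
    for pattern in (BX_TAG, COLON_COMMENT, HASH_SLASH):
        match = pattern.search(header)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    for line in (
        "@V10002828L1C001R013000000#543_288_92/1 1",
        "@V10002828L1C001R013000000_1 BX:Z:543_288_92",
        "@A00428:24:H5327DSXX:2:1101:1253:1000 1:N:0:CTGTAACT",
    ):
        print(extract_barcode(line))
```

Run against the three example headers, this yields 543_288_92 for the first two and CTGTAACT for the third. A header matching none of the patterns returns None rather than a guess.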