seqan / seqan3

The modern C++ library for sequence analysis. Contains version 3 of the library and API docs.
https://www.seqan.de

Unsupported sam headers get silently stripped #3251

Open notestaff opened 4 months ago

notestaff commented 4 months ago

Does this problem persist on the current main?

Is there an existing issue for this?

Current Behavior

Currently, header tags that seqan3 does not recognize are silently stripped when a SAM/BAM file is read and written back out. No warning or error is reported.

Expected Behavior

Given that the SAM format permits user-defined tags, it would be better to pass them unchanged from input to output.

Steps To Reproduce

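Read a file whose header contains the pb: tag mentioned in https://github.com/seqan/seqan3/issues/3215 and write it back out with seqan3. The pb: tag is missing from the output header.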
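To confirm which header tags were lost in such a round trip, the input and output headers can be compared as plain SAM text. The following is a minimal sketch using only the C++ standard library, not seqan3. The function names `header_tag_keys` and `dropped_header_tags` are hypothetical helpers introduced for illustration, and it assumes both files are available as SAM text (for BAM, the header would first have to be dumped to text).

```cpp
#include <set>
#include <sstream>
#include <string>

// Collect "<record type> <tag>" keys (e.g. "@HD pb") from the header lines of SAM text.
// Hypothetical helper: tags are tracked per record type, not per individual line,
// so a tag dropped from only one of several @RG lines would not be noticed.
std::set<std::string> header_tag_keys(std::string const & sam_text)
{
    std::set<std::string> keys;
    std::istringstream lines{sam_text};
    std::string line;
    while (std::getline(lines, line))
    {
        if (line.empty() || line[0] != '@')
            break; // header lines come first; stop at the first non-header line

        std::istringstream fields{line};
        std::string record_type, field;
        std::getline(fields, record_type, '\t');
        if (record_type == "@CO")
            continue; // comment lines hold free text, not TAG:VALUE fields

        while (std::getline(fields, field, '\t'))
        {
            auto const colon = field.find(':');
            if (colon != std::string::npos)
                keys.insert(record_type + ' ' + field.substr(0, colon));
        }
    }
    return keys;
}

// Return every header tag key present in the input text but absent from the output text.
std::set<std::string> dropped_header_tags(std::string const & input_sam, std::string const & output_sam)
{
    auto const in_keys = header_tag_keys(input_sam);
    auto const out_keys = header_tag_keys(output_sam);
    std::set<std::string> dropped;
    for (auto const & key : in_keys)
        if (out_keys.count(key) == 0)
            dropped.insert(key);
    return dropped;
}
```

For an input header line such as `@HD`, `VN:1.6`, `pb:5.0.0` (tab-separated; the pb: value here is only illustrative) and an output header that keeps just `VN:1.6`, `dropped_header_tags` returns the single key `@HD pb`.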

Environment

SeqAn version: 3.3.0
Operating system: Linux bpb23-acc 5.4.0-136-generic #153-Ubuntu SMP Thu Nov 24 15:56:58 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Compiler: x86_64-conda-linux-gnu-c++ (conda-forge gcc 12.3.0-2) 12.3.0

Anything else?

No response

notestaff commented 4 months ago

@eseiler

notestaff commented 4 months ago

P.S. Is there any workaround in the current seqan3 release that would allow preserving pb: header tags? We're writing a tool that adds tags to BAM records. It should make no other change to the BAM file, but right now it also ends up stripping pb: header tags. Thanks for your help! @eseiler
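One possible stopgap, sketched here purely as an assumption and not as a seqan3 feature, is to post-process the output: take the header lines verbatim from the original file and the alignment records from the rewritten file. The sketch below works on SAM text with the C++ standard library only. `split_sam` and `restore_header` are hypothetical names. It is only appropriate when the tool is not supposed to change the header at all, since any header edits made by the tool (for example an added @PG line) would be discarded. For BAM files, a utility such as `samtools reheader` serves a similar purpose.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Split SAM text into {header block, alignment block}.
// Header lines start with '@' and precede all alignment records.
std::pair<std::string, std::string> split_sam(std::string const & sam_text)
{
    std::string header, body;
    std::istringstream lines{sam_text};
    std::string line;
    bool in_header = true;
    while (std::getline(lines, line))
    {
        if (in_header && (line.empty() || line[0] != '@'))
            in_header = false; // first non-header line: everything after is alignment data
        (in_header ? header : body) += line + '\n';
    }
    return {header, body};
}

// Combine the untouched header of the original file with the records of the rewritten file.
std::string restore_header(std::string const & original_sam, std::string const & rewritten_sam)
{
    return split_sam(original_sam).first + split_sam(rewritten_sam).second;
}
```

Given the original SAM text and the text written by the tool, `restore_header(original, rewritten)` yields the tool's records under the original header, so the pb: tag is kept.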