Closed notestaff closed 8 months ago
Hey @notestaff,
once again, thank you for finding flaws in our documentation!
It should definitely not say that it throws. It's probably a leftover from when we amended the alignment file parsing to be a bit less strict.
Do you have an example SAM/BAM file? Just the header section and one record/entry should be enough. We can also just add the pb
header to a file ourselves, but it's better to have an example of what you are using.
In general, pb
is not part of the SAM-specification's @HD
tag, so we emit a warning.
What would be a solution you are looking for? It would be possible to redirect std:cerr
while parsing the file, such that the warning is not emitted.
Some option to ignore/warn/error would also be possible, but would require code restructuring.
The first header line reads
@HD VN:1.6 SO:unknown pb:5.0.0
What would be useful is an option to suppress warnings about unsupported SAM header tag. Simplest might be to make this a compilation option, e.g. if SEQAN3_SUPPRESS_UNSUPPORTED_HEADER_TAG_WARNING is defined to 1 then don't print the warning. Or this could be part of the configuration options when opening a SAM file. Best would of course be to have a way to specify additional recognized tags. Thanks!
It's now possible to ignore the warnings: https://docs.seqan.de/seqan3/main_user/structseqan3_1_1sam__file__input__options.html#a0ddbdbedd6183dc583d9195917a6ef26
Great, thanks a lot! Just in time for external release of our seqan3-based tool :)
But, this fix suppresses ALL warnings -- a way to suppress specific ones would be better.
Also, the SAM spec says "you can freely add new tags for further data fields. Tags containing lowercase letters are reserved for local use". So I'm not sure this should be a warning at all, at least by default, especially for lowercase tags like in our example.
But, this fix suppresses ALL warnings -- a way to suppress specific ones would be better.
It's currently the only warning we emit ("Unsupported SAM header tag in [...]").
It's also a bit more flexible than just on/off in that you can postprocess the warning, e.g.,
void filter()
{
auto fin = get_sam_file_input();
std::ostringstream stream{};
fin.options.stream_warnings_to = std::addressof(stream);
auto it = fin.begin();
for (auto && warning : stream.view() | std::views::split('\n'))
{
if (std::string_view sv{warning.data(), warning.size()}; !sv.empty() && !sv.ends_with("pb"))
std::cerr << sv << '\n';
}
}
Would be nicer to overload operator<<
, but then we would need either some customisation point object...
Also, the SAM spec says "you can freely add new tags for further data fields. Tags containing lowercase letters are reserved for local use". So I'm not sure this should be a warning at all, at least by default, especially for lowercase tags like in our example.
I did miss this one sentence in the spec multiple times :) Before this was a warning, it was an exception. I'm not sure if the warning is meant as "it's not valid, but we just continue" or "we don't support that" (in contrast to alignment tags).
Platform
Linux bpb23-acc 5.4.0-136-generic #153-Ubuntu SMP Thu Nov 24 15:56:58 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Question
Is there a way to suppress the message
Unsupported SAM header tag in @HD: pb
when reading PacBio bam files? Relatedly, the documentation says https://github.com/seqan/seqan3/blob/57b9924a93b74a44319f8f9e93fa4292559c91aa/include/seqan3/io/sam_file/detail/format_sam_base.hpp#L272, but that's no longer the case, right? (And the current behavior of not throwing an error seems correct.)