Open mr-eyes opened 2 weeks ago
@luizirber suggested a generic metadata field, but notes "The danger of generic metadata with key/val is that you can NEVER depend on the value actually being there, so any code using those values need to account for that"
@ccbaumler additionally wants info on whether a sketch is a pangenome sketch.
@bluegenes wants to add information on whether or not a sketch is a translated sketch.
concern: metadata may get out of date. Could we tie metadata to the md5 hash of the original data file or something?
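A rough sketch of the md5 idea, in case it helps the discussion. This is plain Python with `hashlib`, not sourmash API; the function names and the `source_md5` key are made up for illustration. The metadata is stored next to the hash of the file it was derived from, and is treated as stale if the file no longer matches.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 of a file, read in chunks to keep memory bounded."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def attach_metadata(path, metadata):
    """Bundle metadata with the md5 of the data file it describes.

    'source_md5' is a hypothetical key name, not an existing sourmash field.
    """
    return {"source_md5": file_md5(path), "metadata": dict(metadata)}


def metadata_is_current(path, record):
    """True only if the data file still hashes to the recorded md5."""
    return record.get("source_md5") == file_md5(path)
```

This only detects that the original file changed; it says nothing about whether the metadata values were right in the first place.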
plugins for manipulating metadata would be great!
related issues:
I think one place we talked about mechanisms for this that were unrelated to modifying Signature was here: https://github.com/sourmash-bio/sourmash/issues/2180
I would like to also mention https://github.com/sourmash-bio/sourmash/issues/2985 here.
Pangenome-related metadata could also be the count of genomes that have been compressed into the pangenome. This is an important metric to define the reliability of the pangenome element characterization.
@luizirber suggests that the metadata field could be open to users to modify, not used internally in sourmash. If we want to store and use a field internally, we can create actual individual fields (so each is guaranteed to have an entry w/specific meaning).
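To make the split concrete, here is a minimal sketch of "dedicated fields for internal use, free-form dict for users". It is a toy dataclass, not the sourmash `Signature` class, and every name in it (`SketchRecord`, `is_translated`, `genome_count`) is an assumption for illustration. The dedicated field always has a value; anything read from the free-form dict has to handle a missing or malformed entry.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchRecord:
    name: str
    # Hypothetical dedicated field: always present, with a defined meaning.
    is_translated: bool = False
    # Free-form, user-owned key/value pairs: nothing here is guaranteed.
    metadata: Dict[str, Any] = field(default_factory=dict)


def genome_count(record: SketchRecord) -> Optional[int]:
    """Read a user-supplied 'genome_count', or None if absent or not an int."""
    value = record.metadata.get("genome_count")
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None
```

Code that relies on `is_translated` can use it directly, while callers of `genome_count` have to decide what to do with `None`, which is the cost of the generic key/val approach noted above.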
In snipe, I wanted to keep the number of sketched bases to assess the sketching efficiency. However, there is no place in the sourmash signature JSON to hold this information, so I had to add a custom suffix to the signature name. It would be great if we could add a metadata dict to the sourmash signature.
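For comparison, a small sketch of the two approaches. The suffix format below is invented for the example (it is not the one snipe uses), the dicts are simplified stand-ins for a signature record, and the `metadata` key is the proposed field, which does not exist in the signature JSON today.

```python
import re

# Hypothetical name suffix, e.g. "sampleA;sketched_bases=123456".
SUFFIX = re.compile(r";sketched_bases=(\d+)$")


def bases_from_name(name):
    """Workaround: recover the count by parsing it back out of the name."""
    match = SUFFIX.search(name)
    return int(match.group(1)) if match else None


def bases_from_metadata(sig):
    """Proposed: read the count from a metadata dict, tolerating its absence."""
    return sig.get("metadata", {}).get("sketched_bases")


# Simplified records, not the full signature schema.
with_suffix = {"name": "sampleA;sketched_bases=123456"}
with_metadata = {"name": "sampleA", "metadata": {"sketched_bases": 123456}}
```

With the dict, the name stays clean and the value keeps its type; with the suffix, every consumer needs to know the parsing convention.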