scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0
284 stars 84 forks source link

encode pyhf version in digests #1129

Open lukasheinrich opened 4 years ago

lukasheinrich commented 4 years ago

Description

the digests we compute depend on the pyhf version, so perhaps we should encode it somehow into the digest string. sha256:0.5.2:afd98dsfa1cd0593... ?

matthewfeickert commented 4 years ago

Does the pyhf version matter? Shouldn't the digest only depend on the schema version?

lukasheinrich commented 4 years ago

it should but in reality the "way" we compute the digests depends no the pyhf version. If we find a bug and change it later, it's hart to verify the digest w/o knowing what s/w was used to compute it

matthewfeickert commented 4 years ago

it should but in reality the "way" we compute the digests depends on the pyhf version.

Ah okay. Then agreed we should either do this or change the way the digest is computed.

alexander-held commented 4 years ago

Does it only depend on the __version__ or also on the exact commit when not using a tagged version?

lukasheinrich commented 4 years ago

i mean as @matthewfeickert says I hope the digest value never really starts depending a lot on the version. But @kratsg recently worked on stuff that had the potential to touch the digests and I was wondering how we'd ever find out how to recompute a given digest if we don't know which version of pyhf it was computed with

kratsg commented 4 years ago

i mean as @matthewfeickert says I hope the digest value never really starts depending a lot on the version. But @kratsg recently worked on stuff that had the potential to touch the digests and I was wondering how we'd ever find out how to recompute a given digest if we don't know which version of pyhf it was computed with

Ahh, so in the case -- we were changing the workspace that came out. We can generally keep things backwards compatible by adding to the list of algorithms supported... such as sha256versioned and md5versioned and so on... so at least this allows us to maintain backwards-compatibility. The fundamental digest algorithm calculation for a workspace shouldn't change regardless of how the workspace is fed in... I think we can simply be strict about this piece of it.