Closed niclaurenti closed 1 year ago
Not an actual solution, more a workaround...
I'm not sure we want to leave it like this, I see three options:
pinecards.toml
, writing the version number (and possibly update it in a workflow, while releasing pinecards, and otherwise setting it to dirty
or something similar)Giving up on runcards version is close enough to giving up on grids reproducibility. The only other chance is to include a full dump of the compressed content of the whole runcard folder...+
In any case, I would ask for @cschwan opinion on this
Mmm agreed that this solution is not optimal, but even the old version is not
the problem is that we decided to split program and runcards and I think we want to stick to that, no?
raise a custom error, requiring the user to run inside a Git repository
that is quite counter intuitive and as said above I can generate whatever (which is not helpful either)
add a metadata file to the pinecards folder, like pinecards.toml, writing the version number (and possibly update it in a workflow, while releasing pinecards, and otherwise setting it to dirty or something similar)
most likely you will not have the full set around and so a separate file is as radom as a repo
ask the user to manually provide version number
we don't want to have interactivity in the game, do we?
The only other chance is to include a full dump of the compressed content of the whole runcard folder...
if you want to ensure reproducibility I think this might be the only option; dumping everything might be heavy though ... as a slight variant we might just print the md5hash of each file involved ...
we don't want to have interactivity in the game, do we?
Nope, but we'd allow for configurations and CLI arguments.
most likely you will not have the full set around and so a separate file is as radom as a repo
No again: usually runcards come with a version, so if you download you would download all of them, with the metadata provided and updated. If you take the burden of grabbing only a few of them, then you should bring together the config file.
The other option is to spread the version number in all runcards metadata, but this is redundant and prone to generate inconsistencies.
the problem is that we decided to split program and runcards and I think we want to stick to that, no?
Even by splitting, runcards will stay in any case in another repo, and they will be versioned. Being part of the same or different repo makes no conceptual difference (and no change to the code you just need to point root
in the configs to the runcards repo, or we switch the option use there to the runcards path).
if you want to ensure reproducibility I think this might be the only option; dumping everything might be heavy though ... as a slight variant we might just print the md5hash of each file involved ...
Any hash can't ensure reproducibility, because they are not a bijective function, unless you have a repo or any other kind of conventional place where you can use the hash to retrieve the object.
Versions are working as hashes (and git commits in particular are hashes), so your proposal is contradictory on its own: you don't want to enforce a repo, but you propose the only thing that really requires a repo.
Close in favor of https://github.com/NNPDF/pinefarm/pull/16
I open this PR in order to fix the bug in line 114 of
pinefarm/external/interface.py
related to the fact that when we callpinefarm run <RUNCARD> <THEORY>
outside a git repo (for example if we installedpinefarm
viapip install 'pineline[full]'
) we get an error.