NNPDF / pinecards

Runcards needed to generate PineAPPL grids for NNPDF processes
3 stars 1 forks source link

Support config files #127

Closed alecandido closed 2 years ago

alecandido commented 2 years ago

This should close NNPDF/runcards#125 and close NNPDF/runcards#108.

At the moment, it is impossible to install runcardsrunner as a pip package, because it's pointing to paths relative to its source for assets. This PR introduces configuration files in order to make it possible.

In order of priority it should read from:

I'm skipping environment variables for the time being, not to add further complications.

I proposed some time ago to have a unique config files for all our programs, divided in sections, see https://github.com/NNPDF/fktables/issues/4. Currently, this is almost the only one that needs configurations, and it's the one for which are most urgently needed. So I'll name it with a stupid but not-so-ambiguous name (not because it's clever, but simply because it's long), i.e. runcardsrunner.toml, hopefully will be easy enough in future to change this name.

I'm using toml because it's getting traction in the python community (consider pyproject.toml), and I chose the same library vendored by pip (namely tomli), that it has been currently proposed for addition into standard library, see PEP 680.

alecandido commented 2 years ago

This one we can merge for me, I tested, and it's still working.

Config files are optional, in principle you can also do without, and there are sensible defaults for everything. rr is looking for configs files in the order spelled out in the top comment of this PR.

The only thing for which I cannot add a universally good default is the root, that, if not specified in any other way, it is assumed to be the current folder. At the moment, if a config file is found, the root is assumed to be the folder of the config file, but this is not a very clever choice (since it might be even ~ or ~/.config).

If a path is specified by the user, it is(/should be) never overwritten by the defaults. More or less all the paths are customizable by the user (except those internal to the python package).

An example configuration file, runcardsrunner.toml, is provided in the root of the repo (it keeps the usual behavior, apart for adding a result folder).

alecandido commented 2 years ago

@cschwan can you try to run?

If it's working, you can even merge :)

cschwan commented 2 years ago

It's not working (yet):

cschwan@dom runcards $ ./rr install
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/cschwan/runcards/runcardsrunner/__init__.py", line 5, in <module>
    from .cli import command
  File "/home/cschwan/runcards/runcardsrunner/cli/__init__.py", line 2, in <module>
    from . import install, list, merge, run, update
  File "/home/cschwan/runcards/runcardsrunner/cli/install.py", line 5, in <module>
    from ._base import command
  File "/home/cschwan/runcards/runcardsrunner/cli/_base.py", line 4, in <module>
    import appdirs
ModuleNotFoundError: No module named 'appdirs'
alecandido commented 2 years ago

Where did you test?

The dependency is spelled out in pyproject.toml: https://github.com/NNPDF/runcards/blob/6dcbe1419b6b4683b79f222ed41ac998479eec18/pyproject.toml#L27 so I guess you didn't update somehow...

alecandido commented 2 years ago

To be fair, it's a new dependency, to which I delegated the annoying task of taking care of conventions. We might also decide to support only user folders (I see no reason why you want to install runcardsrunner.toml system-wide), and only the conventional ~/.config, and drop the dependency.

cschwan commented 2 years ago

Where did you test?

This was a fresh install on dom, so with empty .prefix and without ~/.local.

alecandido commented 2 years ago

Okay, maybe you're still using ./rr out of the box.

Since it was confusing, and even wrong occasionally, I removed it. Indeed, if you run ./rr install, it should have printed you the help message, and did nothing.

Perhaps I should update the README.md, this is something I actually forgot to do.

However, now the installation is up to the user, but is kept very simple:

  1. you can install with poetry, then you're responsible to do poetry install by yourself: it's just one command, but leaves you full control over poetry installation (i.e. the installation of poetry itself), without trying to guess magically the best way of doing things
  2. since this PR, if you don't want to maintain poetry, you can perfectly live without, and still install from source: just use pip install .; since poetry do not yet support editable mode (i.e. pip install -e ., but it will do since next minor release, already implemented in pre-release) this is a bit uncomfortable in development, because the code is copied in the environment, and it will not run in-place; however, for development poetry is available (and I'm the main developer of the package anyhow)
  3. soon, more or less as soon as we decided a name, I'll release on PyPI (again possible because installation is now possible, since this PR), so you don't even need the source code, and the installation is pip install runcardsrunner

So you can choose: poetry install, pip install ., or pip install runcardsrunner (coming soon). For 1. then you keep going as usual, using ./rr run, for 2. and 3. you can drop two characters, and simply run rr run (installed in your environment).

alecandido commented 2 years ago

Notice that installation includes the python package alone, while for nnpdf31_proc and theories you have to provide the folders yourself. If you choose names different from default ones, you have to write them in a runcardsrunner.toml file (default ones are runcards and theories, because I decided nnpdf31_proc to be too random for a default, but it's still spelled out with the runcardsrunner.toml in this repository, so for as long as you run in this folder nothing's changed).

cschwan commented 2 years ago

I finally got some time to try it out. Please have look at my comments:

alecandido commented 2 years ago
  • why are you using CONTRIBUTING.md for other purposes than stating the contribution guidelines? I believe what you write in there should be in README.md or in another file

About "Non Python dependencies" you're perfectly right: I've just taken them out of the README.md, but that's not the proper place. Though I'd not put them back into README.md, I like more hierarchical documents than sequential ones, I have to figure out a better place. About "More on tools" they're really contribution guidelines: those tools are needed only for development, they are not user requirements, and I'd like to ensure that whoever will contribute will use them as well (since on their turn they ensure consistency of the codebase, and help in maintaining it).

  • the installation section needs content

True, but there are two obstacles that are currently preventing me to fill it:

  1. this PR itself, without it was not possible to use runcardsrunner in a non-development mode
  2. name choice, since we don't like runcardsrunner and we want to replace it

The moment these two conditions are satisfied, I can instantly make a release, and at the same time fill the "Installation" section with the actual instructions.

  • how do I use the tool? Is ./rr still relevant? This isn't clear from the documentation

./rr is not that relevant, there will be a user CLI installed with the release by pip itself, just called rr (temporary name). I can make even a .zip/.tar.gz release published on GitHub, but still you'll need pip to install, since Python code is not running in the vacuum (i.e. on bare metal, nor plain OS). At this point, better to use just PyPI, and the user will manage his own Python installation (there are tricks to package isolated Python apps, but you need an installer, that will check and install Python if needed, or to vendor the interpreter itself).

The same thing can be accessed through poetry in dev mode.

Lacking the release, ./rr is still temporary bridging the gap, calling poetry for you (but not installing it).

  • the installation of PineAPPL is broken, because github doesn't allow git://github.com/N3PDF/pineappl.git anymore. This is easily fixed by using https instead of git

This I thought to have already fixed, but maybe I did it locally and didn't push. Thanks for spotting it!

alecandido commented 2 years ago

@cschwan what do we need to merge this? Otherwise, things will get old, and it will become increasingly more complicated to merge.

If there is some to be improved, but we can live with, I'd merge and put it in a separate issue. I'd like to merge also NNPDF/runcards#132, since we started using it, but I'd rather prefer to keep them sorted.

alecandido commented 2 years ago

It remains only to refactor add_scope.

NNPDF/pinefarm#7 is already broken on master, will be fixed in a separate PR.

scarlehoff commented 2 years ago

If I try to run this in its current for I get an error, just in case I'm running this wrong:

This is what I am trying to run:

poetry install
poetry run rr run nnpdf31_proc/NNPDF_POS_DYD_40/ theories/theory_200_1.yaml

Maybe we should have the CI to test a few easy-quick grid generation to ensure the code can at least run.

felixhekhorn commented 2 years ago

@AleCandido apart from some minor problems/comments - what else do we need to close this asap?

alecandido commented 2 years ago

@AleCandido apart from some minor problems/comments - what else do we need to close this asap?

I guess to be able to run: https://github.com/NNPDF/runcards/pull/127#issuecomment-1193813261

@scarlehoff can you try to run the same with master? If it's already broken there, there is no reason to block this to fix it (at least we get config files in master, and then we start fixing everything). If instead this PR is introducing a regression, let's try to fix immediately.

alecandido commented 2 years ago

Sorry, maybe I've just been late to understand: if 3214c40cf56fb3ff60cc86095d9f42ed5272367f was the only error, so much the better.

@scarlehoff I agree about CI, please open a dedicated issue (I'd not solve in this old enough PR).

alecandido commented 2 years ago
poetry run rr run nnpdf31_proc/NNPDF_POS_DYD_40/ theories/theory_200_1.yaml

Still not able to run it, for an actual configuration related problem. I'm debugging.

The moment this will, run, I'm fine with merging.

alecandido commented 2 years ago

Problem solved in 26a4d9b, if you @scarlehoff and @felixhekhorn agree, we can merge :)

felixhekhorn commented 2 years ago

I'm getting

$ poetry run rr run nnpdf31_proc/HERA_NC_318GEV_EP_SIGMARED/ theories/theory_213.yaml 
HERA_NC_318GEV_EP_SIGMARED
Computing HERA_NC_318GEV_EP_SIGMARED...
48.0kB [00:00, 206kB/s]                                                                                    
The PDF NNPDF31_nlo_as_0118_luxqed already exists at /home/felix/local/share/LHAPDF

Installation completed
> took 0.27 s

Grid calculation completed
> took 0.00 s

Running yadism...
                                  ┌────────────────────────────────────┐                                   
                                  │                                    │                                   
                                  │ __     __       _ _                │                                   
                                  │ \ \   / /      | (_)               │                                   
                                  │  \ \_/ /_ _  __| |_ ___ _ __ ___   │                                   
                                  │   \   / _` |/ _` | / __| '_ ` _ \  │                                   
                                  │    | | (_| | (_| | \__ \ | | | | | │                                   
                                  │    |_|\__,_|\__,_|_|___/_| |_| |_| │                                   
                                  │                                    │                                   
                                  └────────────────────────────────────┘                                   

                                                   Plan                                                    

 • XSHERANC at 485 pts                                                                                     

                                                Calculation                                                
yadism took off! please stay tuned ...

took 13.34 s
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/felix/.cache/pypoetry/virtualenvs/runcardsrunner-0zbAcuub-py3.10/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/felix/.cache/pypoetry/virtualenvs/runcardsrunner-0zbAcuub-py3.10/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/felix/.cache/pypoetry/virtualenvs/runcardsrunner-0zbAcuub-py3.10/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/felix/.cache/pypoetry/virtualenvs/runcardsrunner-0zbAcuub-py3.10/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/felix/.cache/pypoetry/virtualenvs/runcardsrunner-0zbAcuub-py3.10/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/felix/Physik/N3PDF/FK/runcards/runcardsrunner/cli/run.py", line 31, in subcommand
    main(dataset, theory_card, pdf)
  File "/home/felix/Physik/N3PDF/FK/runcards/runcardsrunner/cli/run.py", line 67, in main
    run_dataset(runner)
  File "/home/felix/Physik/N3PDF/FK/runcards/runcardsrunner/cli/run.py", line 122, in run_dataset
    runner.generate_pineappl()
  File "/home/felix/Physik/N3PDF/FK/runcards/runcardsrunner/external/yad.py", line 57, in generate_pineappl
    out = yadism.output.Output.load_tar(self.output)
  File "/home/felix/.cache/pypoetry/virtualenvs/runcardsrunner-0zbAcuub-py3.10/lib/python3.10/site-packages/yadism/output.py", line 469, in load_tar
    results.append(Result.from_document(res))
  File "/home/felix/.cache/pypoetry/virtualenvs/runcardsrunner-0zbAcuub-py3.10/lib/python3.10/site-packages/yadism/esf/result.py", line 161, in from_document
    sup = super().from_document(raw)
  File "/home/felix/.cache/pypoetry/virtualenvs/runcardsrunner-0zbAcuub-py3.10/lib/python3.10/site-packages/yadism/esf/result.py", line 41, in from_document
    new_output = cls(raw["x"], raw["Q2"], raw["nf"])
TypeError: EXSResult.__init__() missing 1 required positional argument: 'nf'

which might well be a problem in yadism rather then here ...

scarlehoff commented 2 years ago

Problem solved in https://github.com/NNPDF/runcards/commit/26a4d9b1cde492dddbdd03cbef3a59c088c34778, if you @scarlehoff and @felixhekhorn agree, we can merge :)

Let me try a few edge cases so I can keep complaining :P

alecandido commented 2 years ago

@felixhekhorn I know, I've just found as well, I'm debugging it. The problem seems to be that is not able to load the output (that by the way I switched to tar, but the problem is just the same with YAML), since it seems is missing that piece of information on nf.

But then, I believe we're not dumping it correctly, most likely at the level of individual ESF. I guess I used too much PineAPPL in all latest projects, and that one have not been maintained recently...

alecandido commented 2 years ago

@felixhekhorn I found the bug, and I'm releasing a hotfix: it was related to the usage of super() in a classmethod:

    @classmethod
    def from_document(cls, raw):
        sup = super().from_document(raw)

Essentially, super() game consist in tricking the MRO (method resolution order) binding still to the same instance. Being a classmethod, the instance was cls, not self, so what was happening was the following:

  1. when constructing sup it was calling ESFResult.from_document() as expected
  2. but, the first argument passed was not ESFResult, instead it was EXSResult itself

So, when we were constructing a EXSResult instance, was first constructing another EXSResult instance, but using ESFResult.from_document() instead, and then constructing the final one from there. It has always been a problem, but it was not crashing before because ESFResult.__init__() and EXSResult.__init__() had the same signature, until recently when I added Bjorken y as a further input for the second... thus the bug arose...

Could you please add a test in yadism to check something like that?

alecandido commented 2 years ago

In the meanwhile, the fixed version v0.11.3 has been released, and I'm now running poetry update.

alecandido commented 2 years ago

@scarlehoff I fixed or edge case above, can you check if you are now able to find other ones?

Otherwise, I would merge.

alecandido commented 2 years ago

Thank you for reviewing!