flav-io / flavio

A Python package for flavour physics phenomenology in the Standard model and beyond
http://flav-io.github.io/
MIT License
71 stars 62 forks source link

fix loading of fit_wc_function code #54

Closed peterstangl closed 6 years ago

peterstangl commented 6 years ago

@DavidMStraub I think the solution to use OrderedDict introduced in commit b73f9493abdd5d58265db09ff91fdff6eae3e46e is extremely useful. In particular, it allows to load a YAML string where the fit_wc_function requires the import of an external module or function like e.g. the function sqrt() as in the following example:

name: my test fit with external module
observables:
    - [<dBR/dq2>(B0->K*mumu), 15, 19]
input_scale: 1000
fit_wc_eft: SMEFT
fit_wc_basis: Warsaw
fit_wc_function:
  code: |
    from math import sqrt
    def f(C9, C10):
      return {'C9_bsmumu': 10 * sqrt(C9),
              'C10_bsmumu': 30 * C10}

Unfortunately, due to the changes in commit 4007b8cd3d92940ea328382815108895e7df40e0, this doesn't work anymore.

Before this commit, exec was called as

exec(s, namespace)

such that the global and the local namespaces for the executed code have been identical and given by the OrderedDict namespace.

After the commit, the exec call has been changed to

exec(s, globals(), namespace)

Now, this has the problem that global and local namespaces are not identical anymore and using the above example YAML string, the function f would not be able to find the function sqrt in its global namespace (which is now set to globals()), because sqrt is only defined in the local namespace namespace. So loading of the above YAML string leads to a NameError after commit 4007b8cd3d92940ea328382815108895e7df40e0 while loading the same YAML perfectly worked before.

But what is the reason to change exec(s, namespace) to exec(s, globals(), namespace)? The former solution actually lead to the following problem: If a fit_wc_function defined by exec(s, namespace) contains a call to any function in the globals() namespace, like e.g. abs(), then pickling fit_wc_function and unpickling it again leads to an unpickled fit_wc_function that is not able to find abs() anymore. Using the global namespace globals() in the exec call is an apparent solution. By using exec(s, globals(), namespace), the fit_wc_function can still find abs() after pickling and unpickling.

However, this is not a real solution since it only solves the problem for functions in the globals() namespace. At the same time, it introduces the problem described above: any function imported to the local namespace namespace, like sqrt() in the example above, leads to a NameError when loading the YAML string.

So I think one has to use

exec(s, namespace)

This then prevents the functions to be properly pickled and unpickled, which is in this case obviously not very well supported by dill.

To solve all those issues, I suggest to simply not pickle any functions defined by exec in the wc_function_factory, but to save their defining YAML instead as done before commit 4007b8cd3d92940ea328382815108895e7df40e0, with the modification proposed in comment https://github.com/flav-io/flavio/pull/50#issuecomment-411678529. This is done in the present PR.

DavidMStraub commented 6 years ago

The reason for changing exec(s, namespace) to exec(s, globals(), namespace) was because in the former case, return locals() in wc_function_factory gave a NameError. Does this not happen here?

coveralls commented 6 years ago

Coverage Status

Coverage increased (+0.03%) to 90.742% when pulling c467217423880a9642937ea3bdec3c76e5370f97 on peterstangl:wc_function_factory_2 into 1e5f412a6d4ce02db0e9b87505f5e51ef4493192 on flav-io:master.

peterstangl commented 6 years ago

Yes, locals() is just another function in the globals() namespace like abs(). I just decided to use abs() in the example above because there are already so many mentions of global and local namespaces and I thought another locals() would just make it more confusing.

So the answer is yes, it does not happen, because the functions defined by exec(s, namespace) that contain locals() are not pickled. Loading is no problem in this case, only pickling and unpickling.

DavidMStraub commented 6 years ago

OK I am out of counterarguments ;)

But can you please add a few comments to the new load and dump? All the new attributes like for the "original function" are not self-explanatory.

Thanks!

peterstangl commented 6 years ago

I've added some comments. Do you think they are enough to easily understand the code?