N3PDF / pdfflow

PDFflow is parton distribution function interpolation library written in Python and based on the TensorFlow framework.
https://pdfflow.readthedocs.io
Apache License 2.0
8 stars 0 forks source link

Alpha s #25

Closed marcorossi5 closed 4 years ago

marcorossi5 commented 4 years ago

I implemented the running coupling interpolation. The accuracy is perfect (at least on NNPDF31_nlo_as_0118). You can test it running python ../benchmarks/compare_accuracy_alphaS.py from the src directory.

I think I have to add yaml package in order to pass the test on this platform.

Anyway, I edited also something about pdfflow: I noted that some function signatures were preventing multiple queries to work. The grid function A had a signature [4,4,None,1], but if we want to allow the algorithm to work with multiple pids, we should change this into [4,4,None,None]. I made this little change in my commits.

Also, @scarlehoff I don't get the purpose of pyxfxQ2 functions in the PDF class. You use it to convert python inputs to a tensor, but the float_me function is called in both pyxfxQ2 and xfxQ2. Is that because we want xfxQ2 to be an entirely tensorflow function?

Let me know if you are satisfied with this implementation of alpha_s and how it performs on GPU. On my laptop it's faster than lhapdf only when the query points are more than 1e6. Moreover I see that executing the code eagerly, runs faster than in the graph mode. This is strange.

marcorossi5 commented 4 years ago

I've added yaml to the requirements into the setup.py file. I'm not sure this is sufficient to pass the checks. Could you please tell me what must be edited in order to install all dependencies? Thanks

scarlehoff commented 4 years ago

Also, @scarlehoff I don't get the purpose of pyxfxQ2 functions in the PDF class. You use it to convert python inputs to a tensor, but the float_me function is called in both pyxfxQ2 and xfxQ2. Is that because we want xfxQ2 to be an entirely tensorflow function?

Yes, this was my idea. I guess another possibility is to expose the python interface from tf.function (which gives you want) but we might want to digest the input in different ways in the future and this is the simplest interface that I came up with.

Let me know if you are satisfied with this implementation of alpha_s and how it performs on GPU.

I'll have a look.

On my laptop it's faster than lhapdf only when the query points are more than 1e6. Moreover I see that executing the code eagerly, runs faster than in the graph mode. This is strange.

Is this excluding the graph-compilation time?

marcorossi5 commented 4 years ago

I was forgetting... A few words on this implementation.

It’s really easy to understand since it’s basically a duplicate of pdfflow with a mono dimensional interpolation. The base class is always the PDF as before, but now it loads both the pdf member’s data and the metadata during initialization. I’ve added a new method called alphaSQ2 that takes a q2 query tensor and returns the interpolated alpha_s. The algorithm is the same as pdfflow, so it should be really easy for you to go through it. I duplicated almost all source files in order to keep the two things separate. I have uploaded the doc strings for every functions.

scarlehoff commented 4 years ago

I duplicated almost all source files in order to keep the two things separate

Maybe better than duplicate you can add a boolean flag to the arguments of the function (alpha_computation=False) and when it is called with True it goes to the alpha_s function. I think the impact in memory should be the same as having two different functions.

marcorossi5 commented 4 years ago

Is this excluding the graph-compilation time?

Yes. I load the graphs with a first dummy computation.

you can add a boolean flag to the arguments of the function (alpha_computation=False)

Do you mean in PDF init function? Since now it's done in every time you create a PDF object.

marcorossi5 commented 4 years ago

I introduced the flag. Now if you want to create a pdf object that loads alpha_s, the preferential way is to use: p = pdf.mkPDF(pdfname, DIRNAME, alpha_computation=True)

scarlehoff commented 4 years ago

No, I meant the interpolation, region_interpolation and functions file. The content of the alpha_s functions is a subset of the others so I would suggest making all the x-arguments optional and then a boolean flag for alpha_s.

And a check that if the alpha_s is set to True all the x-arguments must be None.

Since all calls for the alpha_s will always have their x-arguments set to None it should no add any overhead with respect to the way you are doing it now. The reason for doing it this way is that it's the only way to ensure that bugfix, optimizations, etc are immediately propagated.

marcorossi5 commented 4 years ago

We have fixed pdfflow and alphas algorithms to prevent retracing. Unfortunately calling @tf.function(experimental_relax_shapes=True) doesn't work properly and keeps retracing when input shapes change. For pdfflow we hadn't noticed the difference because it was faster than lhapdf even with retracings. However that wasn't the case with the alpha_s interpolation. Now we enforced the input shapes with @tf.function(input_signature= ...) and everything works properly.

The only thing left is to try if it's possible to impose a flag and reuse the same pdfflow function for the alpha_s computation as @scarlehoff suggested.

scarlehoff commented 4 years ago

In this commit I'm implementing what we talked about last Friday: https://github.com/N3PDF/pdfflow/pull/25/commits/46e56ed15e7fb27e0e7d9dc1924f9968c083da55

The idea is that since most of the code is common (in particular, alpha_s interpolation being a subset of a pdf interpolation), you can simply hide the parts that are not applicable behind a boolean flag that you pass to the function.

I've done it for the Subgrid class (so that the Alpha_s_subgrid class is not needed) but it should be possible with other functions as well. The main advantage of doing it in this way is for maintenance, as changes will automatically propagate (for instance, imagine that we want to pad with 5 values the grid in Q now, you only need to remember to change it in one place).

Let me know if you have any questions.

scarrazza commented 4 years ago

@marcorossi5 why did you close this?

marcorossi5 commented 4 years ago

I made a mistake with the commit tree and I wasn’t able to commit anymore in the desired way. I deleted and recreated the branch. Is it possible to reopen this pull request ?

scarrazza commented 4 years ago

Given that the branch name is the same, I believe you can create a new PR from that without collateral issues.