HEPData / hepdata_lib

Library for getting your data into HEPData
https://hepdata-lib.readthedocs.io
MIT License

Possible workspace.json file submission + symerrors submission #252

Closed AndriiPovsten closed 5 months ago

AndriiPovsten commented 5 months ago

Is it possible to submit a workspace.json file as a separate, standalone submission to HEPData? In the examples of submitted data that I have looked at, the workspace files appear only under additional_resources in the submission.yaml file. And is ROOT a mandatory dependency of hepdata_lib? I also find submitting uncertainties a bit tricky. I was using this tutorial, which gives a nice explanation of how to create the tables for a submission, but there the uncertainties were added with hist.intervals.poisson_interval(s, s2), not as a plain list of numbers. In my case I only have symmetric errors (symerror, +/-) stored in a list/dictionary. How should I submit them?

Or would it be better to have the values as a dictionary already? This code:

unc1 = Uncertainty("A symmetric uncertainty", is_symmetric=True)
unc1.values = f["W+jets"]
tab1d.add_uncertainty(unc1)

gives this error: `'Table' object has no attribute 'add_uncertainty'`

@clelange

GraemeWatt commented 5 months ago

Thanks for the message. Let me try to answer your questions.

Is it possible to submit a workspace.json file as a separate, standalone submission to HEPData? In the examples of submitted data that I have looked at, the workspace files appear only under additional_resources in the submission.yaml file.

I assume that the workspace.json file is in the HistFactory JSON format. No, it can't be submitted in place of a HEPData table. It can only be attached as additional_resources, either to a whole submission or to an individual table, with multiple HistFactory JSON files packaged in a single archive file. There was an idea (HEPData/hepdata#164) to provide native support for HistFactory JSON files, and work was started on the implementation, but the idea was later abandoned as too complicated to be workable.
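As an illustration of the packaging step mentioned above, here is a minimal sketch that bundles several HistFactory JSON files into one archive using only the Python standard library. The helper name `pack_workspaces` and the file names in the usage note are hypothetical, and this is not part of hepdata_lib.

```python
import tarfile
from pathlib import Path


def pack_workspaces(json_paths, archive_path):
    """Bundle several HistFactory JSON files into one gzipped tar archive."""
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in map(Path, json_paths):
            # Store each file under its bare name so the archive has a flat layout.
            tar.add(path, arcname=path.name)
    return archive_path
```

Calling, for example, `pack_workspaces(["signal_region.json", "control_region.json"], "workspaces.tar.gz")` would produce one file that can then be listed under additional_resources. In hepdata_lib, attaching it should be possible with the `add_additional_resource` method of a Submission or Table object, but check the documentation for the exact arguments.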

Another idea (#98) would be for hepdata_lib to be able to convert a workspace.json file into the normal HEPData YAML format, albeit with some information loss. A basic converter (https://github.com/lukasheinrich/hf2hd-demo) was written by @lukasheinrich back in 2017. The implementation in hepdata_lib would need to be done by someone familiar with both the HistFactory JSON and HEPData YAML formats, along the lines of PR #243, which added functionality for hepdata_lib to convert from Scikit-HEP histograms.
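To give a feel for what such a lossy conversion involves, here is a rough sketch that pulls the nominal yields of one sample out of a HistFactory JSON workspace, together with its per-bin `staterror` values if present. It is neither the hf2hd-demo converter nor anything in hepdata_lib. The function name `sample_yields` and the channel name `SR` are hypothetical, and the numbers are reused from this thread purely for illustration (treating them as `staterror` data is an assumption).

```python
import json


def sample_yields(workspace, channel_name, sample_name):
    """Return (values, sym_errors) for one sample in one channel.

    sym_errors is None when the sample has no 'staterror' modifier.
    All other modifiers are ignored, which is where information is lost.
    """
    for channel in workspace["channels"]:
        if channel["name"] != channel_name:
            continue
        for sample in channel["samples"]:
            if sample["name"] != sample_name:
                continue
            errors = None
            for modifier in sample.get("modifiers", []):
                if modifier["type"] == "staterror":
                    # staterror data holds one absolute uncertainty per bin
                    errors = list(modifier["data"])
            return list(sample["data"]), errors
    raise KeyError(f"{sample_name!r} not found in channel {channel_name!r}")


# Tiny hand-written workspace fragment, for illustration only.
workspace = json.loads("""
{
  "channels": [
    {"name": "SR",
     "samples": [
       {"name": "W+jets",
        "data": [4.71, 30.95, 73.35],
        "modifiers": [
          {"name": "staterror_SR", "type": "staterror",
           "data": [2.18, 6.61, 15.77]}
        ]}
     ]}
  ]
}
""")

values, sym_errors = sample_yields(workspace, "SR", "W+jets")
```

The two returned lists are plain Python lists of numbers, so they could be assigned to the `values` of a Variable and of a symmetric Uncertainty respectively.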

And is ROOT a mandatory dependency of hepdata_lib?

At the moment, yes, but there's another long-standing open issue #99 to allow hepdata_lib to be used without ROOT.

I also find submitting uncertainties a bit tricky. I was using this tutorial, which gives a nice explanation of how to create the tables for a submission, but there the uncertainties were added with hist.intervals.poisson_interval(s, s2), not as a plain list of numbers. In my case I only have symmetric errors (symerror, +/-) stored in a list/dictionary. How should I submit them?

That tutorial is specific to input data in the form of Scikit-HEP histograms. Check the other examples for more general usage. See also the documentation at https://hepdata-lib.readthedocs.io/en/latest/usage.html and in particular the Uncertainties section.

I mean, is it possible to create a symerror section, something like this:

      - value: 4.71
        errors:
          - symerror: 2.18
      - value: 30.95
        errors:
          - symerror: 6.61 
      - value: 73.35
        errors:
          - symerror: 15.77

Or would it be better to have the values as a dictionary already? This code:

unc1 = Uncertainty("A symmetric uncertainty", is_symmetric=True)
unc1.values = f["W+jets"]
tab1d.add_uncertainty(unc1)

gives this error: 'Table' object has no attribute 'add_uncertainty' @clelange

Try something like:

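from hepdata_lib import Variable, Uncertainty

var1 = Variable("W+jets", is_binned=False)
var1.values = [4.71, 30.95, 73.35]
# Symmetric (+/-) errors: one number per value, so is_symmetric=True
unc1 = Uncertainty("A symmetric uncertainty", is_symmetric=True)
unc1.values = [2.18, 6.61, 15.77]
var1.add_uncertainty(unc1)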

The add_uncertainty function is a method of a Variable object, not of a Table object: attach the uncertainty to the variable, then add the variable to the table.
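For reference, the `value` / `errors` / `symerror` layout asked about in the question can be pictured with a small plain-Python sketch. It only builds the nested structure by hand; it is not how hepdata_lib writes its output, and the helper name `to_hepdata_values` is hypothetical.

```python
def to_hepdata_values(values, sym_errors):
    """Pair each value with its symmetric error in HEPData-style nesting."""
    if len(values) != len(sym_errors):
        raise ValueError("values and sym_errors must have the same length")
    entries = []
    for value, error in zip(values, sym_errors):
        # One entry per bin: a value plus a list of errors,
        # here containing a single symmetric error.
        entries.append({"value": value, "errors": [{"symerror": error}]})
    return entries


entries = to_hepdata_values([4.71, 30.95, 73.35], [2.18, 6.61, 15.77])
```

Each element of `entries` corresponds to one `- value: ...` item in the YAML snippet from the question, which is why the uncertainty belongs to the variable rather than to the table.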

Please close this issue once your problems are resolved.

AndriiPovsten commented 5 months ago

Hi Graeme, thank you a lot! I just have one additional question: is it possible to submit the tables with the submission.yaml file from the terminal (inside a workflow pipeline)? Or is the best way to create a .zip file and upload it directly to the HEPData Sandbox?

GraemeWatt commented 5 months ago

Yes, you can use the hepdata-cli tool (pip install hepdata-cli) to upload from the command line or from Python. See Example 7, Example 8 and Example 9 in the README.md file.
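For a pipeline step, one option is to assemble the hepdata-cli call in Python and hand it to `subprocess`. The sketch below only builds the argument list and does not run anything. The helper name `build_upload_command`, the archive name and the e-mail address are placeholders, and the `-e` flag is recalled from the README examples, so treat it as an assumption and confirm with `hepdata-cli upload --help`.

```python
import shlex


def build_upload_command(archive_path, email):
    """Return the argv list for a hepdata-cli upload (not executed here)."""
    # "-e" is assumed to be the option for the HEPData account e-mail;
    # verify against the hepdata-cli README before relying on it.
    return ["hepdata-cli", "upload", str(archive_path), "-e", email]


command = build_upload_command("submission.tar.gz", "my@email.com")
print(shlex.join(command))
# In a pipeline step: subprocess.run(command, check=True)
```

Keeping the command as a list avoids shell quoting problems when the archive path contains spaces.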