HEPData / hepdata

Repository for main HEPData web application
https://hepdata.net

hepdata yoda download seems to provide an invalid yoda file #748

Closed by kratsg 5 months ago

kratsg commented 5 months ago

For the HEPData record, I noticed that downloading the yoda file for Figure 6b (v2) produces this format:

$ cat /Users/kratsg/Downloads/HEPData-ins1495243-v2-Figure_6b.yoda
BEGIN YODA_ESTIMATE1D_V3 /REF/ATLAS_2017_I1495243/d06-x01-y01
IsRef: 1
Path: /REF/ATLAS_2017_I1495243/d06-x01-y01
Title: doi:10.17182/hepdata.77436.v2/t6
Type: Estimate1D
---
Edges(A1): [2.500000e+01, 4.000000e+01, 5.500000e+01, 7.500000e+01, 1.000000e+02, 1.500000e+02, 1.000000e+03]
ErrorLabels: ["stat", "sys"]
# value         errDn(1)        errUp(1)        errDn(2)        errUp(2)
nan             ---             ---             ---             ---
1.920000e-02    -8.000000e-04   8.000000e-04    -1.600000e-03   1.600000e-03
1.790000e-02    -7.000000e-04   7.000000e-04    -8.000000e-04   8.000000e-04
1.240000e-02    -5.000000e-04   5.000000e-04    -9.000000e-04   9.000000e-04
4.300000e-03    -3.000000e-04   3.000000e-04    -4.000000e-04   4.000000e-04
1.400000e-03    -1.000000e-04   1.000000e-04    -1.000000e-04   1.000000e-04
2.200000e-05    -2.000000e-06   2.000000e-06    -3.000000e-06   3.000000e-06
nan             ---             ---             ---             ---
END YODA_ESTIMATE1D_V3

Yet this seems to be different from the yoda files that rivet has, so I'm trying to understand where the disconnect is.

Note that the yoda file downloaded from hepdata is not readable by yoda at all, so I'm wondering if there's a parsing error somewhere that injects or generates the incorrect yoda file.
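For anyone who wants to inspect such a download without a YODA installation, here is a minimal sketch of a reader for the block layout shown above. It is not the YODA library's own parser. The function name `parse_estimate1d` and the abbreviated `SAMPLE` text (a shortened copy of the file above) are illustrative, and the sketch assumes a single block whose metadata is separated from the data by a lone `---` line and in which `---` tokens in a row mean "no error given".

```python
import math

# Abbreviated copy of the download shown above (first two bins only).
SAMPLE = """\
BEGIN YODA_ESTIMATE1D_V3 /REF/ATLAS_2017_I1495243/d06-x01-y01
IsRef: 1
Path: /REF/ATLAS_2017_I1495243/d06-x01-y01
Title: doi:10.17182/hepdata.77436.v2/t6
Type: Estimate1D
---
Edges(A1): [2.500000e+01, 4.000000e+01, 5.500000e+01]
ErrorLabels: ["stat", "sys"]
# value         errDn(1)        errUp(1)        errDn(2)        errUp(2)
nan             ---             ---             ---             ---
1.920000e-02    -8.000000e-04   8.000000e-04    -1.600000e-03   1.600000e-03
1.790000e-02    -7.000000e-04   7.000000e-04    -8.000000e-04   8.000000e-04
nan             ---             ---             ---             ---
END YODA_ESTIMATE1D_V3
"""


def _to_float(token):
    """Convert one table token; '---' is treated as a missing entry."""
    return None if token == "---" else float(token)


def parse_estimate1d(text):
    """Return (edges, error_labels, rows) for a single Estimate1D block.

    Each row is (value, [errDn(1), errUp(1), errDn(2), errUp(2), ...]).
    """
    edges, labels, rows = [], [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("BEGIN"):
            continue
        if line.startswith("END"):
            break
        if line == "---":  # lone separator between metadata and data
            in_data = True
            continue
        if not in_data:
            continue  # skip metadata such as Path, Title, Type
        if line.startswith("Edges("):
            inner = line.split(":", 1)[1].strip().strip("[]")
            edges = [float(x) for x in inner.split(",")]
        elif line.startswith("ErrorLabels:"):
            inner = line.split(":", 1)[1].strip().strip("[]")
            labels = [x.strip().strip('"') for x in inner.split(",")]
        else:
            fields = line.split()
            rows.append((_to_float(fields[0]), [_to_float(t) for t in fields[1:]]))
    return edges, labels, rows


edges, labels, rows = parse_estimate1d(SAMPLE)
filled = [r for r in rows if not math.isnan(r[0])]
print(len(edges), labels, len(rows), len(filled))
```

In the full file above there are 7 edges and 8 rows. The two `nan` rows at the start and end presumably correspond to underflow and overflow bins, but that reading is an assumption on my part.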

GraemeWatt commented 5 months ago

The "YODA" download option on HEPData now gives the new YODA2 format (https://arxiv.org/abs/2312.15070) for use with the upcoming Rivet 3.2.0 release and the current YODA 2.0.0 release from 22nd December 2023. The HEPData download matches the ATLAS_2017_I1495243.yoda.gz file from the release-3-2-x branch of the Rivet repository. The "YODA1" download option on HEPData returns the legacy YODA format for use with the Rivet 3.1.x releases and YODA 1.x.
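To check which of the two formats a downloaded file is in, one option is to look at its `BEGIN` lines. The sketch below is a heuristic, not part of YODA or HEPData. It assumes that `ESTIMATE` object types (such as the `YODA_ESTIMATE1D_V3` header shown earlier in this thread) appear only in the new YODA2 format. The legacy header in the example (`YODA_SCATTER2D_V2`) is likewise an assumption about what a YODA1 download typically contains, and the helper names `yoda_headers` and `looks_like_yoda2` are made up for illustration.

```python
import re

# Matches e.g. "BEGIN YODA_ESTIMATE1D_V3 /REF/ATLAS_2017_I1495243/d06-x01-y01"
HEADER_RE = re.compile(r"^BEGIN\s+YODA_(\S+)_V(\d+)\s+(\S+)")


def yoda_headers(text):
    """Return a list of (object_type, format_version, path) for each BEGIN line."""
    found = []
    for line in text.splitlines():
        match = HEADER_RE.match(line.strip())
        if match:
            found.append((match.group(1), int(match.group(2)), match.group(3)))
    return found


def looks_like_yoda2(text):
    """Heuristic: any ESTIMATE-type object suggests the new YODA2 format."""
    return any("ESTIMATE" in obj_type for obj_type, _, _ in yoda_headers(text))


new_style = "BEGIN YODA_ESTIMATE1D_V3 /REF/ATLAS_2017_I1495243/d06-x01-y01\n"
# Assumed legacy header, for illustration only.
old_style = "BEGIN YODA_SCATTER2D_V2 /REF/ATLAS_2017_I1495243/d06-x01-y01\n"

print(yoda_headers(new_style))
print(looks_like_yoda2(new_style), looks_like_yoda2(old_style))
```

A file that contains no `ESTIMATE` objects is not necessarily YODA1, so a `False` result only means "no evidence of YODA2 found".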

This breaking change was announced in my talk (bottom of slide 9) at the pyhf workshop on 5th December 2023, so I'm surprised that you're unaware. It was tweeted on 17th November 2023. After the change was deployed, a notifying banner was displayed at the top of all HEPData web pages for more than one month from 17th November 2023 to 19th December 2023. The documentation at https://www.hepdata.net/formats#data_file_formats was updated and a new release of the hepdata-cli tool was made to support both yoda and yoda1 downloads.

@20DM : I'm not sure what else can be done to notify HEPData/Rivet users of the change, but maybe an email could be sent to the rivet-announce mailing list? Also, it looks like the docs linked from the top-left box at https://yoda.hepforge.org still point to YODA 1.9.9 rather than YODA 2.0.0.

kratsg commented 5 months ago

This part I was partially aware of. I did not realize you were changing the output of "YODA". Would it not be easier to call it "YODA2" rather than "YODA"? At least the version is more indicative. In any case, I was mostly surprised HEPData dropped any warnings or banners given that yoda's hepforge seems to indicate yoda2 as alpha-release only... So is HEPData just being eager, or is yoda's website out of date?

As far as I know, nobody is using yoda2 in the collaborations.

GraemeWatt commented 5 months ago

Thanks for the feedback. I've added back the explanatory banner message at the top of all HEPData web pages. I'll keep this banner message at least until the release of Rivet 3.2.0. The choice of "YODA" and "YODA1" rather than "YODA2" and "YODA1" was the preference of the Rivet/YODA developers (@20DM) and reflects the class names in the latest YODA software. I think the idea is that YODA2 will become the default YODA format and YODA1 will become deprecated and eventually dropped. But for now, it should still be easy to obtain the YODA1 format from HEPData, although a change yoda → yoda1 might be needed in download scripts. The YODA homepage shows that YODA 2.0.0alpha was released on 2023-10-01 and YODA 2.0.0 (non-alpha) was released on 2023-12-22. The default GitLab branch is now release-2-0-x rather than release-1-9-x. I think you're correct that HEPData is early in adopting YODA2 and it is probably not yet used within the collaborations, but the work on the YAML → YODA2 conversion was already completed last year by @20DM and so we decided to release it. In case of further questions on the expected YODA1 → YODA2 transition, please contact the Rivet/YODA developers directly.