HEPData / hepdata_lib

Library for getting your data into HEPData
https://hepdata-lib.readthedocs.io
MIT License
15 stars 37 forks source link

Data licence addition #257

Closed ItIsJordan closed 3 months ago

ItIsJordan commented 5 months ago

📚 Documentation preview 📚: https://hepdata-lib--257.org.readthedocs.build/en/257/

ItIsJordan commented 5 months ago

This PR is intended to close #245.

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 90.37%. Comparing base (1f006c8) to head (2ce0035).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #257 +/- ## ========================================== + Coverage 89.78% 90.37% +0.58% ========================================== Files 5 5 Lines 1087 1112 +25 Branches 242 251 +9 ========================================== + Hits 976 1005 +29 + Misses 80 78 -2 + Partials 31 29 -2 ``` | [Flag](https://app.codecov.io/gh/HEPData/hepdata_lib/pull/257/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HEPData) | Coverage Δ | | |---|---|---| | [unittests-3.10](https://app.codecov.io/gh/HEPData/hepdata_lib/pull/257/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HEPData) | `90.37% <100.00%> (+0.58%)` | :arrow_up: | | [unittests-3.11](https://app.codecov.io/gh/HEPData/hepdata_lib/pull/257/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HEPData) | `90.37% <100.00%> (+0.58%)` | :arrow_up: | | [unittests-3.6](https://app.codecov.io/gh/HEPData/hepdata_lib/pull/257/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HEPData) | `90.09% <100.00%> (+0.60%)` | :arrow_up: | | [unittests-3.7](https://app.codecov.io/gh/HEPData/hepdata_lib/pull/257/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HEPData) | `90.09% <100.00%> (+0.60%)` | :arrow_up: | | [unittests-3.8](https://app.codecov.io/gh/HEPData/hepdata_lib/pull/257/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HEPData) | `90.19% <100.00%> (+0.59%)` | :arrow_up: | | [unittests-3.9](https://app.codecov.io/gh/HEPData/hepdata_lib/pull/257/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HEPData) | `90.19% <100.00%> (+0.59%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HEPData#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

ItIsJordan commented 5 months ago

I just noticed the coverage. Sorry, will look into that this week.

GraemeWatt commented 5 months ago

@clelange : when testing Jordan's branch, I noticed that hepdata_lib already defines a default license:

https://github.com/HEPData/hepdata_lib/blob/ab8bbef609eaa8b012612520646825530ca50e97/hepdata_lib/__init__.py#L509-L517

and writes it as data_license in the first YAML document of the submission.yaml file:

https://github.com/HEPData/hepdata_lib/blob/ab8bbef609eaa8b012612520646825530ca50e97/hepdata_lib/__init__.py#L613

This has been the behaviour ever since the first hepdata_lib version from 2018. However, data_license has never been supported in the relevant additional_info_schema.json. The JSON schema only support a data_license in the submission_schema.json (data tables) or a license in the additional_resources_schema.json (additional resources). It would be difficult to modify our database model to support a license for a whole submission rather than just individual tables and resource files. Moreover, the default hepdata_lib license (CC BY 4.0) is slightly more restrictive than the default HEPData license (CC0).

Therefore, I removed the writing of a default license in commit 2cf90464ec5a05e122504019d37b444da6890e51. We could revert that commit if you disagree with the removal. However, I think it will be confusing if we provide new methods to write license information for data tables and additional resources, but keep a redundant default license for the whole submission that will just be ignored when uploading to HEPData.

clelange commented 5 months ago

@GraemeWatt - I think there should be a default license that is applied, and I agree that this should be CC0 since that's the default stated on the HEPData web page. I don't remember why we chose CC BY 4.0 back then. As part of this PR, I think it would be good to state in the documentation that this license is applied by default and that this is the default for all HEPData records as well as mentioning that users can change to their own license (with the newly added functionality of this PR).

GraemeWatt commented 4 months ago

Looks like the CI has started failing when building a wheel for wrapt v1.12.1, installed by astroid v2.6.6 via pylint v2.9.6. One solution might be to address #234 by upgrading pylint to a more recent version.

GraemeWatt commented 4 months ago

I opened a PR #260 that removes the pin on Pylint and fixes the build, so better to wait for that PR to be merged.

clelange commented 3 months ago

Looks like this is now good to go? @ItIsJordan

ItIsJordan commented 3 months ago

Should be ready now, yes! @clelange