easybuilders / easybuild

EasyBuild - building software with ease
http://easybuild.io
GNU General Public License v2.0
456 stars 142 forks source link

enhance documentation of checksums easyconfig parameter #853

Closed Flamefire closed 1 year ago

Flamefire commented 1 year ago

Add more examples of which formats are possible. Also contains not-yet implemented or broken features!

This is related to https://github.com/easybuilders/easybuild-framework/issues/4142, https://github.com/easybuilders/easybuild-framework/issues/4177, https://github.com/easybuilders/easybuild-framework/pull/4150, https://github.com/easybuilders/easybuild-framework/pull/4159, https://github.com/easybuilders/easybuild-framework/pull/4164

We have https://github.com/easybuilders/easybuild-framework/blob/e3681ae53628400096f3e29d58e787bf1173f27b/test/framework/type_checking.py#L184-L225

and https://github.com/easybuilders/easybuild-framework/blob/e3681ae53628400096f3e29d58e787bf1173f27b/test/framework/type_checking.py#L709-L719

which tests various formats for checksums.

We have code in get_checksum_for which supports checksums being a dict instead of a list: https://github.com/easybuilders/easybuild-framework/blob/e3681ae53628400096f3e29d58e787bf1173f27b/easybuild/framework/easyblock.py#L369

However this is not supported by the type-checking code and hence to_checksums will be called which would iterate over the dicts keys mistaking them for checksums!!! https://github.com/easybuilders/easybuild-framework/blob/e3681ae53628400096f3e29d58e787bf1173f27b/easybuild/framework/easyconfig/types.py#L475

Furthermore this is made more difficult by the YEB files which use a YAML format where tuples are not supported. E.g. a test file has this:

checksums: [[
    'be662daa971a640e40be5c804d9d7d10',  # default [MD5]
    '44332000aa33b99ad1e00cbd1a7da769220d74647060a10e807b916d73ea27bc',  # default (SHA256)
    ['adler32', '0x998410035'],
    ['crc32', '0x1553842328'],
    ['md5', 'be662daa971a640e40be5c804d9d7d10'],
    ['sha1', 'f618096c52244539d0e89867405f573fdb0b55b0'],
    ['size', 273],
]]

This is a checksum entry for a single file. Obviously the inner-most 2-element lists should be converted to tuples and the outer-most list should be a list. But it isn't clear what to do with the 2nd list: Are those alternative checksums where only one needs to match or additional checksums that all need to match? Both cases should be supported.

I would argue that an additional level can be used: checksums: [[['mainchecksum', 'altchecksum']]] The type conversion code can deduce that after the 2nd level of lists only tuples may be specified as a list (i.e. an AND) inside a list (already an AND) doesn't make sense, so the 2nd level list is an AND consisting only of a single element, which is redundant but ok.

And finally specifying None in a dict currently yields the same error as not specifying it, see https://github.com/easybuilders/easybuild-framework/issues/4142

So things to decide:

I see 2 ways here:

  1. dicts can only contain checksums/type-checksum-tuples or None
  2. allow full power, so that the value of a dict entry is handled the same as a "top-level entry" except it disallows more dicts.

As for the missing-key case: I'd handle it the same as not specifying it: Error when enforce_checksums is active else treat as matched.

NOT to be merged right away but should be used for discussion and once everything is decided this PR can be finalized.

boegel commented 1 year ago

@Flamefire Do you mind re-doing this PR in the new easybuild-docs repository, where we've ported the EasyBuild docs sources to MarkDown?

As soon as we switch https://docs.easybuild.md to the new documentation, we'll trash the whole docs directory here in the easybuild repo...

Flamefire commented 1 year ago

Oh, yes makes sense. Anyway this is but a draft meant for discussion.

Flamefire commented 1 year ago

Done: https://github.com/easybuilders/easybuild-docs/pull/104