easybuilders / easybuild-framework

EasyBuild is a software installation framework in Python that allows you to install software in a structured and robust way.
https://easybuild.io
GNU General Public License v2.0
148 stars 202 forks source link

Rename `source` step #4627

Open Flamefire opened 2 weeks ago

Flamefire commented 2 weeks ago

This is a follow-up discussion to https://github.com/easybuilders/easybuild-framework/pull/4624

That PR moved the verification of the checksums for sources and patches to the fetch-step.

That results in the "source"-step to only extract the sources.

We have for (almost) all steps now a 1:1 correspondence of (user-facing) step names and their method in the EasyBlock class: https://github.com/easybuilders/easybuild-framework/blob/e0e04fb773a5cee26669211f563696da94455cfd/easybuild/framework/easyblock.py#L4043-L4048

However for the (new) "source"-step this isn't true any longer: https://github.com/easybuilders/easybuild-framework/blob/8569344a85295f691d30fac75e2c7611c478d18d/easybuild/framework/easyblock.py#L4054

Note that even prior to the change the status (printed to stdout) was "unpacking":

== fetching files...
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== testing...
== installing...

It could be argued that "source" (for use in e.g. --stop=source) was fitting because it verified the sources (&patches) and then extracted them, even though the former wasn't explicitly indicated.

However after the change I think "source" is no longer appropriate and we should follow the method names at least for consistency and rename this step to "extract". I'd argue that --stop=extract is as clear as --stop=configure and similar.

An alternative would be "unpack" to match the status-message although it then won't match the method name extract_step

This change can only be for 5.x as it is a breaking change and the above change is only for that branch.

I'm pretty sure we already have a check for valid step names, hence any use of the old name "source" would yield a clear message that it is an invalid value and to choose from the valid set of values. If not we should enhance that error message to do that.

With that the change will only cause some required change in habits for people doing --stop=source. This was however often just a workaround for --fetch not verifying the checksums, so there is a better alternative for that.

Theoretically there could be easyconfigs using skipsteps=["source"] although I can hardly imagine a reason for that. We currently have 4 such official ECs in the __archive__ and I'm not sure that was ever really OK as e.g. it meant that the checksums were not verified.

No current official ECs use this.

Requesting input from @boegel, possibly putting this up on the next confcall agenda.

boegel commented 2 weeks ago

The impact of renaming the source step to (say) extract is broader than just --stop though. I'm fine with breaking --stop source in EasyBuild 5.0 and letting figure out that they should be using --stop extract instead. For the theoretical cases of skipsteps = ['source'], I'm fine with it too as long as the error reporting is clear: it should state that source is not a known step name, and produce the list of valid step names; to be checked if that's the case or not.

We should also take into account hooks though: renaming the source step to extract probably also implies that pre_source_hook and post_source_hook hook functions need to be renamed to pre_extract_hook and post_extract_hook. That fallout is slightly larger, but perhaps still acceptable (again, if the error reporting is sufficiently clear). The latter may be handled more user-friendly by deprecating pre_source_hook and post_source_hook, and keep using them along with a warning unless the *_extract_hook counterpart is also defined (in that case, there should still be a warning that *_source_hook is flat out ignored, ideally).

There may be other "hidden" impact of this, we briefly discussed this during the last EasyBuild 5.0 sync meeting.

@lexming @branfosj Do you recall anything else that may be relevant here?

branfosj commented 2 weeks ago

EB checks that *_hook functions are for known steps.

If you need to use the same hooks with multiple versions of EB at once then you can use the following to only put a hook in scope based on the EasyBuild version:

from easybuild.tools.version import VERSION
if VERSION >= "4.8.1":
    def pre_build_and_install_loop_hook(ecs, *args, **kwargs):
        pass
boegel commented 2 weeks ago

@branfosj That seems relevant to put in our documentation, in particular in the page/section that covers breaking changes in EasyBuild 5.0, incll. the rename of source to extract

boegel commented 2 weeks ago

@Flamefire As discussed during today's EasyBuild conf call, there's no opposition against this (unless there's more fallout that we haven't anticipated so far), so please open a PR to rename the source step to extract.

branfosj commented 2 weeks ago

@branfosj That seems relevant to put in our documentation, in particular in the page/section that covers breaking changes in EasyBuild 5.0, incll. the rename of source to extract

We should add the example now - see https://github.com/easybuilders/easybuild-docs/pull/267

We can then link to that from the EB5 docs.