Parsl / parsl

Parsl - a Python parallel scripting library
http://parsl-project.org
Apache License 2.0

Remove channels #3515

Open benclifford opened 2 months ago

benclifford commented 2 months ago

Overview

Parsl has a facility called Channels intended to support use cases such as running a workflow on your laptop but with task execution on a supercomputer. This facility never evolved beyond prototype stage, and effort from University of Chicago and University of Illinois was rapidly distracted into the funcX project which aims (with substantially more code and developer time) to provide roughly the same facility - now named Globus Compute and developed by professional programmers rather than an academic research team.

The presence of channels as quasi-abandonware inside the modern Parsl codebase is a painful drain on both user and developer time: Users continue to be fooled into believing that this abandoned prototype facility, rather than the results of the funcX project, should be used for remote execution use cases. Developer time continues to be taken up dealing with the intricate lacing of channel handling through the core codebase.

Channels should be removed from Parsl, with the default behaviour of LocalChannel becoming the only behaviour.
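To illustrate what "LocalChannel behaviour only" would mean in practice: any command a provider needs to run (for example, a scheduler submit command) would always execute directly on the machine running the workflow, with no remote hop. The following is a minimal sketch of that idea using only the Python standard library. `run_locally` and its parameters are hypothetical names for illustration, not Parsl's API or implementation.

```python
import subprocess
from typing import Tuple


def run_locally(cmd: str, walltime: float) -> Tuple[int, str, str]:
    """Run a shell command on the submitting machine and wait for it.

    Hypothetical helper for illustration only; not part of Parsl.
    Returns (return code, stdout, stderr).
    """
    completed = subprocess.run(
        cmd,
        shell=True,           # the command is a single shell string
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode output to str
        timeout=walltime,     # raises subprocess.TimeoutExpired if exceeded
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    rc, out, err = run_locally("echo hello", walltime=10)
    print(rc, out.strip())
```

With no channel abstraction in between, there is no question of where the command runs or how its output gets back to the caller.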

Proposed timeline

I propose this timeline as a default, unless a consensus forms for some other timeline. In the absence of any other consensus forming in the comments of this issue, I'll make this timeline happen.

now .. 7th August 2024: people paying attention to issues get to comment on this issue, including offering entirely different alternative paths forward

7th August 2024: channel code is marked as going away in the codebase, in some way that is hard to avoid for users (for example, renaming all the user-facing channel classes). Users should be encouraged to visit this issue and comment, and to seek alternatives such as Globus Compute.

7th November 2024: channel code is removed from Parsl
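One way the "hard to avoid" marking could look is sketched below, assuming the renaming approach mentioned in the timeline: the class is exposed only under a new `Deprecated...` name, so existing imports fail loudly, and constructing it emits a warning that points at this issue. The class name, its constructor argument, and the warning text are hypothetical and not taken from the Parsl codebase.

```python
import warnings


class DeprecatedSSHChannel:
    """Hypothetical renamed channel class (illustration only, not Parsl code).

    Code importing the old name breaks at import time, so users must
    knowingly opt in to the deprecated class by changing their config.
    """

    def __init__(self, hostname: str) -> None:
        warnings.warn(
            "Channels are scheduled for removal from Parsl; "
            "see issue #3515 and consider alternatives such as Globus Compute.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        self.hostname = hostname


if __name__ == "__main__":
    warnings.simplefilter("always")  # show DeprecationWarning even if filtered
    channel = DeprecatedSSHChannel("login.example.org")
    print(channel.hostname)
```

The rename does the heavy lifting here: a warning alone can be filtered or overlooked, whereas a changed class name forces a visible edit in the user's configuration.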

Why can't this become an "abandoned component sitting in a directory"

The channel facility is not componentised: it contains prototype-quality code inside the core of Parsl.

What about the AdHocProvider which attempts to use multiple channels to run on a cluster which has no resource manager?

This should go away too - without channels it is of little use. Users should be encouraged to find other ways to run a cluster without any resource manager. It should not be parsl's job to manage this aspect of a user's cluster.

What if someone wants to take on Channel tech sponsorship?

This could happen, but I think it's unlikely.

The tech sponsor would need to be responsible for tidying up the architecture and implementing the resulting fixes. That ranges from oddities such as "why are channels initialised or not based on the presence of a script_dir attribute on a provider?", through test implementations and fixups such as testing Parsl without shared file systems in a channel-like environment, to substantially more fundamental questions such as "how is htex supposed to work when channels do not provide a channel for htex network connections?".

For the more fundamental problems, this starts to sound like a re-run of Globus Compute, which the funcX project has shown can occupy several full-time developers for several years. I think a tech sponsor's time would probably be better placed working to get Parsl playing nicely with other remote execution technology - for example Globus Compute.

I just tagged a bunch of issues with https://github.com/Parsl/parsl/labels/channels to give an overview of the sort of stuff a Channel tech sponsor would need to address.

An indicator for a successful channel tech sponsor would be substantial progress on these issues by the removal date in the timetable above.

yadudoc commented 2 months ago

@benclifford I just wanted to put my vote here to remove channels entirely.

From the user perspective, the only folks I think we need to make sure know of the change are: 1) folks using SSHChannels, who ought to be gently pushed to switch over to globus-compute; 2) folks using AdHocChannels, who will have no real alternative beyond your recommendation to use a cluster resource manager (slurm..)

astro-friedel commented 2 months ago

@benclifford I agree that channels are no longer needed and can be very finicky to use.