canonical / data-science-stack

Stack with machine learning tools needed for local development.
Apache License 2.0
11 stars 3 forks source link

Feedback on implementing the dss commands with timeout #63

Open NohaIhab opened 5 months ago

NohaIhab commented 5 months ago

Bug Description

Implementing the dss initialize and dss create-notebook commands with timeout affects the dss user experience, where the timeout failure logs commands are sometimes not helpful. For example as mentioned here for dss initialize and here for dss create-notebook.

It's necessary to consider, with the help of the UX team, whether the commands should have timeout and how to make the status of dss visible to users.

To Reproduce

Install dss from source and run the commands

KUBECONFIG=~/.kube/config
pip install .
dss initialize --kubeconfig $KUBECONFIG
syncronize-issues-to-jira[bot] commented 5 months ago

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5512.

This message was autogenerated

ca-scribner commented 5 months ago

Two options come to mind:

Remove the timeout entirely - hold the terminal until commands complete or the user cancels

Why do we need the timeout? All it does really is set a fixed time after which the command dies. Users can already break the command themselves with ctrl+c.

What if we remove the timeout but also handle the SIGINT from ctrl+c, and maybe print some debug info if we handle SIGINT? That avoids the issue of setting a timeout differently for different people

Run commands async by default but add --wait for synchronous mode

We can make dss initialize (and other commands) being async by default (they do not wait for things to deploy), but then add an argument like --wait that'll make them wait for everything to complete (like the default is now, maybe with no timeout?). It is definitely convenient sometimes to have the wait behaviour, so I wouldn't want to remove it entirely

ca-scribner commented 5 months ago

Regardless of how we do this, if the commands are waiting on completion we should be providing the user with good status details so they can see progress

NohaIhab commented 4 months ago

Synced with @misohu : This also applies to dss remove, but in that case in PR #51 I have not implemented timeout, because the behavior of kubectl delete is that it waits on the resource to be deleted by default. However, without having timeout there is no visibility for the user as to what is going on while the command hangs. We should decide whether to include timeout, and be consistent in all the commands, or otherwise have a strong argument of why some commands have timeout and others don't.

misohu commented 4 months ago

Synced with UX team.

Following are the blocking commands which should not have timeout

Following are the non blocking commands which can happen behind the scenes

With these change we introduce states of the notebook lifecycle which will be tracked with dss list namely: