Open codekow opened 1 year ago
I have the same issue
Calling StartNotebookInstance
does not actually start a stopped notebook instance. What it does is create a brand new notebook instance (with the latest ML libraries) and connects to the previously existing ML storage volume (in EFS). Amazon SageMaker's APIs are awkward and don't allow the user to fully control their storage volume or supply the EFS volume ID when calling CreateNotebookInstance
, which would be a more elegant and RESTful way of doing this instead of the Start/Stop mess.
Note that there is already a Status.notebookInstanceStatus
field. Maybe call this new field Spec.Started
and make it a boolean field (ala "Enabled") instead of calling it Spec.State
and having started
and stopped
be the values?
Note that there is already a
Status.notebookInstanceStatus
field. Maybe call this new fieldSpec.Started
and make it a boolean field (ala "Enabled") instead of calling itSpec.State
and havingstarted
andstopped
be the values?
I like this solution, with one caveat: let's be sure to limit this approach of controlling the latent state of the notebook via a boolean flag to only controlling whether a notebook can be considered to be running or not; let's not try to also control for other states like Pending
, Stopping
, Failed
, Deleting
, and Updating
.
Also, consider calling this flag something like Spec.Running
to avoid ambiguity with the Started
value of NotebookInstanceState
Starting and stopping a notebook instance is similar to start/stop of an EC2 instance and was considered as a data plane operation. Such operations have not been supported in ACK controllers before. Are we considering changing this tenant?
Happy to discuss this in more detail so we look into a generic solution for all services and maybe other API operations like retry
@codekow @dmarcus-wire Can this be implemented in the higher level application or service?
Please let us know if this is a blocking feature and if there specific timelines you are looking to use this. Thank you
@surakota While that is true for most things, because SageMaker doesn't have a way of updating notebook instances, I figured we could take out an update code path using start/stop.
Thoughts?
Best, -jay
On Tue, Jan 24, 2023, 3:25 PM Suraj Kota @.***> wrote:
Starting and stopping a notebook instance is similar to start/stop of an EC2 instance and was considered as a data plane operation. Such operations have not been supported in ACK controllers before. Are we considering changing this tenant?
Happy to discuss this in more detail so we look into a generic solution for all services and maybe other API operations like retry
@codekow https://github.com/codekow @dmarcus-wire https://github.com/dmarcus-wire Can this be implemented in the higher level application or service?
Please let us know if this is a blocking feature and if there specific timelines you are looking to use this. Thank you
— Reply to this email directly, view it on GitHub https://github.com/aws-controllers-k8s/community/issues/1635#issuecomment-1402591912, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACAWP6ZOB3JO4IDL5YJYZ3WUA3E3ANCNFSM6AAAAAAT7HK7DE . You are receiving this because you commented.Message ID: @.***>
@surajkota The idea is to declaratively set the state of a NotebookInstance
.
If the aws
cli can change the state, the assumption is that the k8s controller could do it also.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale
.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close
.
Provide feedback via https://github.com/aws-controllers-k8s/community.
/lifecycle stale
/bump
Issues go stale after 180d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale
.
Stale issues rot after an additional 60d of inactivity and eventually close.
If this issue is safe to close now please do so with /close
.
Provide feedback via https://github.com/aws-controllers-k8s/community.
/lifecycle stale
Issues go stale after 180d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale
.
Stale issues rot after an additional 60d of inactivity and eventually close.
If this issue is safe to close now please do so with /close
.
Provide feedback via https://github.com/aws-controllers-k8s/community.
/lifecycle stale
Is your feature request related to a problem?
There is no way to start or stop Sagemaker notebook instances via CR.
Describe the solution you'd like
Like:
Describe alternatives you've considered
Run a k8s job to run
aws
cli