aws / sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker
https://sagemaker.readthedocs.io/
Apache License 2.0

ScriptProcessor should expose the name of the processing job #1869

Open svpino opened 4 years ago

svpino commented 4 years ago

Describe the feature you'd like

When creating a ScriptProcessor job, there's currently no way to obtain the job name generated for it. This name is generated when the run() function is called, but run() doesn't return anything.

Also, there's no public property exposed by ScriptProcessor to obtain this name.

How would this feature be used? Please describe.

Here is an example of how this feature could be implemented:

from sagemaker.processing import ScriptProcessor

# `role` is assumed to be defined elsewhere (an IAM role ARN).
script_processor = ScriptProcessor(
    image_uri="",
    command=["python3"],
    base_job_name="sample-job",
    role=role,
    instance_count=1,
    instance_type="ml.c5.2xlarge",
)

# Proposed: `run()` returns an instance of the job
job = script_processor.run(...)
print(job.get_current_job_name())

# Proposed: or we could access the current job name directly from the script_processor.
print(script_processor.get_current_job_name())

Describe alternatives you've considered

As of right now, the only solution is to access the _current_job_name attribute on ScriptProcessor. This is not ideal because _current_job_name is supposed to be internal only.

metrizable commented 4 years ago

@svpino

Thank you for using Amazon SageMaker.

You're right. The _current_job_name attribute of processor instances is marked internal, and if job_name is not specified when run is invoked, the name is generated during that call.

You cite one current work-around: access the internal attribute _current_job_name.

Another method that may work for you is to specify the job_name yourself and pass it to the run method. In that case, no name is generated and _current_job_name is simply set to the value you pass in. You could then use this name to do further work, such as describing the processing job.
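A minimal sketch of that approach, for illustration only. The helpers make_job_name and run_with_known_name are hypothetical names, not part of the SDK; the only SDK behavior relied on is that run accepts a job_name keyword. The 63-character limit and the letters, digits, and hyphens rule are assumptions about SageMaker's job name constraints and are worth checking against the service documentation.

```python
import re
import time
import uuid


def make_job_name(base: str, max_len: int = 63) -> str:
    """Build a unique job name from `base` (hypothetical helper).

    Assumes job names may contain only letters, digits, and hyphens,
    and may be at most `max_len` characters long.
    """
    # UTC timestamp plus a short random tag keeps names unique across runs.
    stamp = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
    suffix = f"{stamp}-{uuid.uuid4().hex[:6]}"
    # Replace disallowed characters and trim stray hyphens at the edges.
    cleaned = re.sub(r"[^a-zA-Z0-9-]+", "-", base).strip("-") or "job"
    room = max_len - len(suffix) - 1
    return f"{cleaned[:room].rstrip('-')}-{suffix}"


def run_with_known_name(processor, base: str, **run_kwargs) -> str:
    """Call `processor.run` with an explicit job_name and return that name."""
    job_name = make_job_name(base)
    processor.run(job_name=job_name, **run_kwargs)
    return job_name
```

With a real ScriptProcessor this might be called as run_with_known_name(script_processor, "sample-job", code="preprocess.py"), where the script path is a placeholder. The returned name can then be used without touching any internal attribute.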

We are always re-evaluating our backlog of features based on customer requests, so we appreciate the feedback on this feature.

Let us know if there is anything else we can assist with.

svpino commented 4 years ago

Thanks for the response!

larroy commented 2 years ago

Any updates on this? This is very inconvenient.

larroy commented 2 years ago

I found a workaround, but we should expose a method:

In [5]: processor.latest_job.job_name
Out[5]: 'marketing-sm-1651701978-3373-2022-05-04-22-06-18-675'

In [6]: processor
Out[6]: <sagemaker.processing.Processor at 0x126b36550>
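The same workaround can be wrapped in a small guard so it does not fail before any job has been started. This is a sketch, not an SDK API: latest_job_name is a hypothetical helper, and it assumes that latest_job is unset (None or missing) until run has been called at least once.

```python
from typing import Any, Optional


def latest_job_name(processor: Any) -> Optional[str]:
    """Return the name of the processor's most recent job, or None.

    Relies only on the public `latest_job.job_name` attributes shown above.
    """
    job = getattr(processor, "latest_job", None)
    return None if job is None else job.job_name
```

Calling latest_job_name(processor) after processor.run(...) should give the same string as the session above, and it returns None for a processor that has not run anything yet.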