Closed calvin0112 closed 2 years ago
@calvin0112 this might be a bug in the processing job, related to issue #2656. I have already reached out to our internal team about it.
In the meantime, we introduced a new way to construct a step. Could you give it a shot and see if it works?
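    from sagemaker.workflow.pipeline_context import PipelineSession
    from sagemaker.workflow.steps import ProcessingStep
    from sagemaker.xgboost.processing import XGBoostProcessor

    session = PipelineSession()
    processor = XGBoostProcessor(..., sagemaker_session=session)
    # With a PipelineSession, run() returns step arguments instead of starting a job.
    step_args = processor.run(code=..., source_dir=..., arguments=...)
    step_sklearn = ProcessingStep(
        name="MyProcessingStep",
        step_args=step_args,
    )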
In summary, we introduced the PipelineSession. This special session does not trigger a processing job immediately when you call processor.run. Instead, it captures the request arguments required to run a processing job and delegates them to the processing step, which starts the job later during pipeline execution.
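To make the deferral concrete, here is a toy sketch of the pattern in plain Python. It is not the SageMaker SDK's implementation: `ImmediateSession`, `DeferredSession`, `ToyProcessor`, `ToyStep`, and the `--epochs` argument are all hypothetical names made up for illustration. It only shows the idea that `run` can hand back captured arguments rather than start a job.

```python
from typing import Any, Dict, List, Optional

Request = Dict[str, Any]

# Records every job that actually starts (stand-in for a real service call).
started_jobs: List[Request] = []


def start_job(request: Request) -> None:
    started_jobs.append(request)


class ImmediateSession:
    """Hypothetical normal session: submitting a request starts the job now."""

    def submit(self, request: Request) -> Optional[Request]:
        start_job(request)
        return None


class DeferredSession:
    """Hypothetical pipeline-style session: capture the request, start nothing."""

    def submit(self, request: Request) -> Optional[Request]:
        return request


class ToyProcessor:
    def __init__(self, session: Any) -> None:
        self.session = session

    def run(self, code: str, arguments: List[str]) -> Optional[Request]:
        # Build the request, then let the session decide what to do with it.
        request = {"code": code, "arguments": list(arguments)}
        return self.session.submit(request)


class ToyStep:
    """Holds captured arguments and starts the job only when executed."""

    def __init__(self, name: str, step_args: Request) -> None:
        self.name = name
        self.step_args = step_args

    def execute(self) -> None:
        start_job(self.step_args)


processor = ToyProcessor(DeferredSession())
step_args = processor.run(code="train_something.py", arguments=["--epochs", "10"])
jobs_after_run = len(started_jobs)  # nothing has started yet

step = ToyStep("MyProcessingStep", step_args)
step.execute()  # the "pipeline execution" moment
jobs_after_execute = len(started_jobs)
```

Swapping `DeferredSession` for `ImmediateSession` in the sketch makes `run` start the job right away and return `None`, which is the behaviour the new session is meant to avoid inside a pipeline definition.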
Let us know.
Closing this issue for now. Please re-open it if you have any other concerns.
Hi,
I'm using XGBoostProcessor from the SageMaker Python SDK for a ProcessingStep in my SageMaker pipeline. When running the pipeline from a Jupyter notebook in SageMaker Studio, I'm getting the following error:
This is from the script runproc.sh, which is generated by XGBoostProcessor. It looks like the script is trying to go to the directory "/opt/ml/processing/input/code/" to unpack the code to run for the processing but can't find the directory. Here is my Python code for my pipeline:
The script "train_something.py" is the code that I need to run for the processing step, and BASE_DIR is the directory with the dependencies.
I tried adding a ProcessingInput with "/opt/ml/processing/input/code" as the destination for the RunArgs, but it didn't help:
With the ProcessingInput, I'm still getting the same error. I've confirmed that the script runproc.sh and the code archive sourcedir.tar.gz are in the S3 bucket.
I would appreciate any help with this. I found an issue regarding the broken integration between FrameworkProcessor and ProcessingStep (https://github.com/aws/sagemaker-python-sdk/issues/2656). Is it related?
Thanks, C