aws / sagemaker-spark

A Spark library for Amazon SageMaker.
https://aws.github.io/sagemaker-spark/
Apache License 2.0

Sagemaker - Java Errors while running notebook in a VPC #134

Open gowthamprabhu opened 3 years ago

gowthamprabhu commented 3 years ago

System Information

Describe the problem

SageMaker throws the following exception when running a notebook in a VPC: "Java gateway process exited before sending its port number."

The same Spark code works fine in a notebook running outside a VPC, using the conda python3 kernel.

userkkw commented 3 years ago

I have the same issue

annamykcode commented 2 years ago

Same for me

annamykcode commented 2 years ago

I found another way to work with Spark in SageMaker: the local_pyspark_example.ipynb notebook in the SageMaker Processing chapter of the Sample Notebooks. There they propose solving the issue as follows:

  1. If you see an exception in running the cell above similar to this - Exception: Java gateway process exited before sending the driver its port number, restart your JupyterServer app to make sure you're on the latest version of Studio.

  2. If you are running this notebook in a SageMaker Studio notebook, run the above cell as-is. If you are running on a SageMaker notebook instance, replace com.amazonaws.auth.ContainerCredentialsProvider with com.amazonaws.auth.InstanceProfileCredentialsProvider.
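As a rough sketch of what step 2 amounts to, the snippet below picks one of the two provider class names quoted above and passes it to a local SparkSession. The helper names (credentials_provider, build_spark_session), the on_notebook_instance flag, and the app name are made up for illustration. Setting the provider through the spark.hadoop.fs.s3a.aws.credentials.provider key is an assumption about how the notebook applies it, and it only takes effect if the hadoop-aws package is on Spark's classpath.

```python
# Fully qualified class names quoted in the notebook's troubleshooting note.
STUDIO_PROVIDER = "com.amazonaws.auth.ContainerCredentialsProvider"
INSTANCE_PROVIDER = "com.amazonaws.auth.InstanceProfileCredentialsProvider"


def credentials_provider(on_notebook_instance: bool) -> str:
    """Return the provider class name for the environment (hypothetical helper)."""
    return INSTANCE_PROVIDER if on_notebook_instance else STUDIO_PROVIDER


def build_spark_session(on_notebook_instance: bool, app_name: str = "PySparkApp"):
    """Create a local SparkSession whose S3A access uses the chosen provider."""
    # Imported lazily so the helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder.appName(app_name)
        # Assumption: the S3A connector (hadoop-aws) is available and reads
        # this Hadoop property via Spark's "spark.hadoop." prefix.
        .config(
            "spark.hadoop.fs.s3a.aws.credentials.provider",
            credentials_provider(on_notebook_instance),
        )
        .getOrCreate()
    )
```

On a SageMaker notebook instance you would call build_spark_session(on_notebook_instance=True); in Studio, build_spark_session(on_notebook_instance=False). If the Java gateway error still appears, the provider setting is probably not the cause, since that exception is raised before any S3 access happens.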