Open egelberg opened 2 years ago
A couple of follow-up questions: 1/ Are you using EMR Studio? 2/ Which instance types? 3/ Can you share a screenshot (without sensitive info)?
Thanks for jumping on this!
1/ Nope, just EMR notebooks on EC2. I'm trying to run this in a PySpark
notebook to take advantage of the full cluster, not just the driver node
2/ Instance types:
3/ Here's some more specifics
EMR Configuration
aws emr create-cluster \
  --os-release-label 2.0.20220912.1 \
  --applications Name=Livy Name=Spark Name=JupyterEnterpriseGateway Name=Hadoop Name=JupyterHub Name=Hive Name=Pig Name=TensorFlow Name=Tez \
  --ec2-attributes '{"InstanceProfile":"EMR_EC2_DefaultRole","SubnetId":"subnet-*****************","EmrManagedSlaveSecurityGroup":"sg-*****************","EmrManagedMasterSecurityGroup":"sg-*****************"}' \
  --release-label emr-6.8.0 \
  --log-uri 's3n://aws-logs-************-us-east-1/elasticmapreduce/' \
  --instance-groups '[{"InstanceCount":4,"EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"SizeInGB":64,"VolumeType":"gp2"},"VolumesPerInstance":4}]},"InstanceGroupType":"CORE","InstanceType":"c5.xlarge","Name":"Core - 2"},{"InstanceCount":1,"EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"SizeInGB":32,"VolumeType":"gp2"},"VolumesPerInstance":2}]},"InstanceGroupType":"MASTER","InstanceType":"m5.xlarge","Name":"Master - 1"}]' \
  --configurations '[{"Classification":"hive-site","Properties":{"hive.metastore.client.factory.class":"com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"}},{"Classification":"spark-hive-site","Properties":{"hive.metastore.client.factory.class":"com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"}}]' \
  --auto-scaling-role EMR_AutoScaling_DefaultRole \
  --bootstrap-actions '[{"Path":"s3://sagemaker-us-east-1-************/egelberg/install_ray.sh","Name":"Ray install"}]' \
  --ebs-root-volume-size 10 \
  --service-role EMR_DefaultRole \
  --enable-debugging \
  --auto-termination-policy '{"IdleTimeout":9000}' \
  --name 'Ray - 4x c5.xlarge' \
  --scale-down-behavior TERMINATE_AT_TASK_COMPLETION \
  --region us-east-1
Screenshots:
Simple ray.init() call:
Trying to specify the number of cpus after reading through this thread:
Attempting to run a script to return IP addresses across a cluster (able to run in Sagemaker):
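For readers without the screenshot, a script of this kind might look roughly like the sketch below. It is not the exact code that was run; the function names `collect_node_ips` and `summarize_ips` are made up for illustration, and it assumes Ray is already installed on the node (for example by the bootstrap action).

```python
import socket
import time
from collections import Counter


def summarize_ips(ip_addresses):
    """Count how many tasks reported each IP address."""
    return dict(Counter(ip_addresses))


def collect_node_ips(num_tasks=100):
    """Run small Ray tasks and return the IP of the node each one ran on."""
    # Imported lazily so the helper above works even where Ray is missing.
    import ray

    # With no address, ray.init() starts a local Ray instance.
    # To attach to an already running cluster, pass address="auto" instead.
    ray.init(ignore_reinit_error=True)

    @ray.remote
    def get_ip():
        time.sleep(0.01)  # brief pause so tasks spread across workers
        return socket.gethostbyname(socket.gethostname())

    return ray.get([get_ip.remote() for _ in range(num_tasks)])
```

Calling `summarize_ips(collect_node_ips())` would then give a count of tasks per IP address. If only one address shows up, the tasks are most likely all running on a single node.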
When attempting to initialize Ray on an EMR cluster that bootstraps the init script in this repo, I'm hitting the following error:
My goal is to use Ray in a Jupyter notebook that has an EMR cluster attached. I created a small cluster that bootstraps the init script in this repo, then created a PySpark notebook, where I run:
This then produces the following error: