Open jennakwon06 opened 3 years ago
@jennakwon06 - make sure to add the following line to your autoscaler config. Without it, the default setup_commands from default.yaml (which may differ depending on the version of Ray installed on the host running ray up) are applied automatically and try to install a Python 3.6 Ray wheel:
setup_commands: []
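To illustrate why an explicit empty list behaves differently from leaving the key out, here is a minimal sketch of a defaults merge. It is a simplified stand-in, not Ray's actual merge code: `DEFAULTS`, its contents, and `apply_defaults` are hypothetical names made up for this example.

```python
# Simplified illustration only: this is NOT Ray's actual merge logic.
# DEFAULTS stands in for values shipped in default.yaml (hypothetical contents).
DEFAULTS = {
    "max_workers": 2,
    "setup_commands": ["echo 'default setup step'"],
}


def apply_defaults(user_config, defaults=DEFAULTS):
    """Return a copy of the defaults where every key set by the user wins."""
    merged = dict(defaults)
    merged.update(user_config)
    return merged


# Key omitted: the default setup commands are inherited.
omitted = apply_defaults({"cluster_name": "demo"})

# Key set to an empty list: the defaults are replaced, so nothing is run.
explicit = apply_defaults({"cluster_name": "demo", "setup_commands": []})

print(omitted["setup_commands"])
print(explicit["setup_commands"])
```

Under this assumption, only a key that is present in the user config can suppress a default, which is why the empty list has to be written out.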
For example, I launched a cluster from the same AMI via ray up us-west-2-cp37-ray120-test.yaml using the following autoscaler config, and verified that the final result matched my expectations:
cluster_name: us-west-2-cp37-ray120-test
max_workers: 1
provider:
    type: aws
    region: us-west-2
    availability_zone: us-west-2a
auth:
    ssh_user: ubuntu
head_node:
    InstanceType: r5n.xlarge
    ImageId: ami-0f92e9d2b63bc61a2
    SecurityGroupIds:
        - sg-07f4b3353e442a2ce
worker_nodes:
    InstanceType: r5n.xlarge
    ImageId: ami-0f92e9d2b63bc61a2
    SecurityGroupIds:
        - sg-07f4b3353e442a2ce
setup_commands: []
pdames$ ray attach us-west-2-cp37-ray120-test.yaml
ubuntu@ip-XXX-XX-XX-XXX:~$ pip show amzn-ray
Name: amzn-ray
Version: 1.2.0
Summary: Staging area for ongoing enhancements to Ray focused on improving its integration with AWS and other Amazon technologies.
Home-page: https://github.com/amzn/amazon-ray
Author: Amazon Ray Team
Author-email: amzn-ray-team@amazon.com
License: Apache 2.0
Location: /home/ubuntu/anaconda3/lib/python3.7/site-packages
Requires: numpy, jsonschema, aiohttp-cors, colorama, msgpack, redis, colorful, filelock, aiohttp, pyyaml, click, py-spy, grpcio, requests, opencensus, aioredis, prometheus-client, protobuf, gpustat
Required-by:
ubuntu@ip-XXX-XX-XX-XXX:~$ python
Python 3.7.7 (default, Mar 26 2020, 15:48:22)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
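The manual check in the session above can also be scripted. The sketch below compares the interpreter's CPython tag with the Python tag in a wheel filename, following the standard wheel naming convention (name-version-python tag-abi tag-platform tag). The helper names and the example filenames are illustrative assumptions, not taken from the AMI or from Ray's tooling.

```python
import sys


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython tag for an interpreter version, e.g. 'cp37' for 3.7."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_python_tag(wheel_filename):
    """Extract the Python tag from a wheel filename.

    Wheel filenames end in {python tag}-{abi tag}-{platform tag}.whl,
    so the Python tag is the third field from the end.
    """
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    return stem.split("-")[-3]


def wheel_matches(wheel_filename, version_info=sys.version_info):
    """True if the wheel's Python tag includes the interpreter's tag."""
    # A Python tag may list several interpreters separated by dots.
    return interpreter_tag(version_info) in wheel_python_tag(wheel_filename).split(".")


# Illustrative filenames (assumed, not copied from the cluster):
print(wheel_matches("ray-1.2.0-cp36-cp36m-manylinux2014_x86_64.whl", (3, 7, 7)))
print(wheel_matches("ray-1.2.0-cp37-cp37m-manylinux2014_x86_64.whl", (3, 7, 7)))
```

A mismatch here would point to a setup command installing a wheel built for a different Python version than the one on the AMI.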
I see. Sounds good. Thanks! It sounds like this could be a documentation improvement about the behavior of empty fields. I will leave this open until we improve that documentation.
Problem
I am using ami-00f92e9d2b63bc61a2, which is supposed to be the AMI for Linux - Python 3.7 - Ray 1.2.0.
I am using the YAML file below, where my Docker image 048211272910.dkr.ecr.us-west-2.amazonaws.com/jkkwon-batscli:zarr is a custom image based on 763104351884.dkr.ecr.us-west-2.amazonaws.com/tensorflow-training:2.3.1-cpu-py37-ubuntu18.04.
The problem is that running ray up fails with the message:
When NOT using the Docker image, I am able to get the Ray cluster up and running. But when I log onto it with ray attach and look at the Python console, I get the output below:
I am wondering if the Ray wheel was mis-uploaded for Python 3.6 rather than 3.7?
Thanks!