pangeo-data / jupyter-earth

Jupyter meets the Earth: combining research use cases in geosciences with technical developments within the Jupyter and Pangeo ecosystems.
https://jupytearth.org
Creative Commons Zero v1.0 Universal

Provide high memory server and expose its local SSD storage #88

Open consideRatio opened 2 years ago

consideRatio commented 2 years ago

@espg asked for instances with even more memory than n1.16xlarge and, if possible, access to high-performance storage. I'll explore whether we can arrange both.

Technical steps

Discussion

espg commented 2 years ago

@consideRatio I'm fine with a block storage path-- something like /dev/nvmep1, or whatever the block device displays 'natively' and needs the least amount of configuration. Using /tmp may not be as good a solution... other processes write to /tmp, and I don't mind coding the extra path if it keeps it clean and only has files that are explicitly put there.

btw, what is our base image type? You mention n1.16xlarge, but I don't see that as a type-- are we using m5n/m5dn or m-something, or other? Does our current base unit have another memory tier?

consideRatio commented 2 years ago

We run an x1.16xlarge node when a 64 CPU high memory node is chosen, and an m5.16xlarge node when a normal 64 CPU node is chosen. Worker nodes are either m5 or m5a, with 4, 16, or 64 CPUs.

How to test disk performance

Thanks @espg for this command!

$ cd /tmp
$ dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=dsync

1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.14749 s, 132 MB/s 
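The same kind of check can be run from a notebook against any directory, for example a dedicated scratch mount instead of /tmp. The following is a minimal sketch, not the command used above: the function name `measure_write_throughput` and its parameters are made up for illustration, and it syncs once at the end of the write, which is only roughly comparable to the single 1 GiB block written with `oflag=dsync`.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, size_bytes=64 * 1024 * 1024,
                             chunk_bytes=4 * 1024 * 1024):
    """Write zeros to a temporary file in `directory` and return MB/s.

    Hypothetical helper: MB means 10**6 bytes, the unit dd reports.
    """
    chunk = bytes(chunk_bytes)  # a buffer of zeros, like /dev/zero
    written = 0
    fd, path = tempfile.mkstemp(dir=directory, suffix=".img")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            while written < size_bytes:
                n = min(chunk_bytes, size_bytes - written)
                f.write(chunk[:n])
                written += n
            f.flush()
            os.fsync(f.fileno())  # force the data to the device before timing stops
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # do not leave the test file behind
    return written / max(elapsed, 1e-9) / 1e6
```

Calling `measure_write_throughput("/tmp")` returns a float in MB/s; passing the mount point of a local SSD, once one is exposed, gives a number to compare against it.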

espg commented 2 years ago

@consideRatio @fperez pinging this issue after our ray discussion last week-- this is related to #92 which has the error message for the block storage. While block storage isn't technically required to run ray, it is needed if we want to do anything with shared memory via redis.

espg commented 2 years ago

...apparently another option for this that doesn't involve setting up block storage is spinning up a separate redis instance and connecting to it-- https://www.anyscale.com/blog/redis-in-ray-past-and-future
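
Before relying on a separate Redis instance, it helps to confirm that it is reachable from the user server. The sketch below uses only the standard library and sends a Redis `PING`; the function name `redis_ping` and the host and port are placeholders, not anything configured in this deployment. It does not configure Ray in any way, and an instance that requires authentication will answer with an error, so the function returns False for it.

```python
import socket


def redis_ping(host, port=6379, timeout=2.0):
    """Return True if a Redis server at host:port answers PING with +PONG.

    Hypothetical helper for a reachability check only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Redis protocol (RESP) encoding of the one-word command PING
            sock.sendall(b"*1\r\n$4\r\nPING\r\n")
            reply = sock.recv(64)
    except OSError:
        # Connection refused, unreachable host, timeout, and similar failures
        return False
    return reply.startswith(b"+PONG")
```

For example, `redis_ping("redis.example.internal")` (a made-up hostname) returns True only when something at that address speaks the Redis protocol and accepts unauthenticated commands.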