Open GemmaTuron opened 1 month ago
Hi @GemmaTuron, I need the following IAM permissions for EC2 to be able to configure SSH: ec2-instance-connect:SendSSHPublicKey and ec2:DescribeInstances
{ "Version": "2012-10-17", "Statement": [{ "Effect": "Allow", "Action": "ec2-instance-connect:SendSSHPublicKey", "Resource": "arn:aws:ec2:region:account-id:instance/*", "Condition": { "StringEquals": { "aws:ResourceTag/tag-key": "tag-value" } } }, { "Effect": "Allow", "Action": "ec2:DescribeInstances", "Resource": "*" } ] }
Hey @sucksido
I get this error:
The service ec2 does not support specifying a Region in the resource ARN
There seems to be an issue with the region. We can try this: { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "ec2-instance-connect:SendSSHPublicKey", "Resource": "arn:aws:ec2:instance/*", "Condition": { "StringEquals": { "aws:ResourceTag/tag-key": "tag-value" } } }, { "Effect": "Allow", "Action": "ec2:DescribeInstances", "Resource": "*" } ] }
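For reference, an ARN has six colon-separated fields (arn:partition:service:region:account-id:resource), so the region and account fields normally take a real value or a wildcard rather than being dropped. Below is a minimal Python sketch that builds the same two-statement policy with a fully qualified instance ARN. The function name and the example region and account ID are hypothetical placeholders, and whether this resolves the error above has not been verified in this thread.

```python
import json


def instance_connect_policy(region, account_id, tag_key, tag_value):
    """Build the two-statement policy discussed above as a Python dict.

    All arguments are caller-supplied; nothing here is looked up from AWS.
    """
    # Six colon-separated fields: arn, partition, service, region, account, resource.
    instance_arn = f"arn:aws:ec2:{region}:{account_id}:instance/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ec2-instance-connect:SendSSHPublicKey",
                "Resource": instance_arn,
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            },
            {
                "Effect": "Allow",
                "Action": "ec2:DescribeInstances",
                "Resource": "*",
            },
        ],
    }


# Hypothetical example values; replace with the real region, account ID and tag.
policy = instance_connect_policy("eu-central-1", "123456789012", "tag-key", "tag-value")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM policy editor. Using "*" for the region or account field is also a valid ARN form if the policy should not be pinned to one region.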
I have successfully launched an EC2 instance to run ZairaChem. I am currently working on configurations and dependency installations, and will begin testing after all is working.
Should I still try the above permissions?
Hi @sucksido, when you can, please post an update on the status of this.
Update: ZairaChem has been set up on an EC2 instance. I will share the login details and the instructions privately. I am now training models and have encountered a metadata issue that Jason previously raised. I am going to fix this manually today and continue testing.
Successfully trained a model with ZairaChem on our AWS EC2 instance. I ran the following commands:
`conda activate zairachem
cd zaira-chem
zairachem fit -i /home/ec2-user/amr_small_train.csv -c 0.1 -d low -m /home/ec2-user/zairachem_models
zairachem predict -i /home/ec2-user/amr_small_test.csv -m /home/ec2-user/zairachem_models -o /home/ec2-user/zairachem_test_output`
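To make the timing of each step easier to report, the two commands above could be wrapped in a small Python script. This is only a sketch: the helper names are hypothetical, the flags are copied from the commands above without adding any, and it assumes the script is started from inside the activated zairachem conda environment.

```python
import subprocess
import time


def build_commands(train_csv, test_csv, model_dir, output_dir):
    """Return the fit and predict command lines used in this thread."""
    fit = ["zairachem", "fit", "-i", train_csv, "-c", "0.1", "-d", "low", "-m", model_dir]
    predict = ["zairachem", "predict", "-i", test_csv, "-m", model_dir, "-o", output_dir]
    return fit, predict


def run_timed(cmd):
    """Run one command, raise if it fails, and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - start


def main():
    # Paths taken from the commands posted above.
    fit, predict = build_commands(
        "/home/ec2-user/amr_small_train.csv",
        "/home/ec2-user/amr_small_test.csv",
        "/home/ec2-user/zairachem_models",
        "/home/ec2-user/zairachem_test_output",
    )
    for name, cmd in (("fit", fit), ("predict", predict)):
        seconds = run_timed(cmd)
        print(f"{name} took {seconds / 3600:.2f} h")
```

Calling main() runs fit and then predict, stopping at the first failure because of check=True.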
To log into the EC2 instance:
For Linux Instances:
ssh -i /path/to/your-key-pair.pem ec2-user@your-instance-public-dns
Replace /path/to/your-key-pair.pem with the path to your key pair file and your-instance-public-dns with the public DNS of your instance.
@JHlozek please see the comments above. I have managed to train the models and run predictions, and I have shared the login details with you privately so you can test. Please let me know how it goes.
Let's see how long a fit command takes with 2000 molecules. It would also be good to know the disk space needed for different model sizes. How do we scale the container size automatically? @JHlozek please pass @sucksido some datasets for testing.
Hi @sucksido, here are two expanded train/test sets from the same original Novartis_3D7 set:
Thanks @JHlozek, I will train on these and give feedback.
Model training has been running since ~12:00 midday today. When it's done, I will run the predict command.
Hi @sucksido, do you have the logs of the run? I find it surprising that it takes so long to train a model on 2000 molecules.
Hi @GemmaTuron
Fit command ran from 12:05 to 17:50
Predict command ran from 20:30 to 00:51
This is all for 2000 molecules
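As a quick check on the times reported above, the durations can be computed with a few lines of Python. This sketch assumes the predict run crossed midnight exactly once; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def elapsed(start, end):
    """Duration between two HH:MM clock times, rolling over midnight if needed."""
    fmt = "%H:%M"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    if t1 < t0:
        # Assumption: the run ended on the following day.
        t1 += timedelta(days=1)
    return t1 - t0


fit_time = elapsed("12:05", "17:50")
predict_time = elapsed("20:30", "00:51")
print(fit_time)                 # 5:45:00
print(predict_time)             # 4:21:00
print(fit_time + predict_time)  # 10:06:00
```

So fit plus predict took roughly 10 hours of instance time for the 2000-molecule set.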
Hmm, I am trying to understand the costing. @JHlozek or @sucksido, did you work on the EC2 instance early in the week?
From Monday to Wednesday: 4 USD
Thursday: 0.7 USD
Does this mean one model costs around 1 USD?
@GemmaTuron I didn't do much work on it early this week, but it's always running; we don't switch it off.
But in principle, it should have a spike yesterday because you were using it? I am trying to understand the baseline cost of having it on versus using it. Also, as we do not train models that often, we need a way of switching it on and off. How can we go about it?
Agreed, we can simply turn off the instance when we are not training models and only switch it on when we need it. Happy to do that.
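A rough way to separate baseline cost from per-model cost, using only the figures in this thread. As far as I understand, an on-demand instance is billed for the time it is running rather than for how busy it is, so an always-on instance would show a roughly flat daily cost with no spike on training days. The sketch below assumes the Monday to Wednesday figure covers three full days of uptime and takes the run duration from the fit and predict times reported above (5 h 45 min and 4 h 21 min); the function name is hypothetical.

```python
def hourly_rate(total_usd, days):
    """Average cost per hour for an instance that ran continuously for `days` days."""
    return total_usd / (days * 24)


rate = hourly_rate(4.0, 3)   # Monday to Wednesday: 4 USD
run_hours = 5.75 + 4.35      # fit (5 h 45 min) + predict (4 h 21 min)

print(f"baseline per day: {rate * 24:.2f} USD")
print(f"per hour: {rate:.4f} USD")
print(f"one fit + predict run: {rate * run_hours:.2f} USD")
```

Under these assumptions a single fit and predict run accounts for well under 1 USD of instance time, and most of the weekly bill comes from leaving the instance on.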
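One way to switch the instance on and off from a script is the boto3 EC2 client, which provides start_instances, stop_instances and matching waiters. This is a minimal sketch, not a tested setup for our account: the function names, the instance ID and the region are hypothetical placeholders, and it assumes AWS credentials with ec2:StartInstances and ec2:StopInstances are already configured.

```python
def set_instance_running(ec2_client, instance_id, running):
    """Start or stop one EC2 instance and block until it reaches that state."""
    if running:
        ec2_client.start_instances(InstanceIds=[instance_id])
        waiter = ec2_client.get_waiter("instance_running")
    else:
        ec2_client.stop_instances(InstanceIds=[instance_id])
        waiter = ec2_client.get_waiter("instance_stopped")
    waiter.wait(InstanceIds=[instance_id])


def main():
    # boto3 is imported here so the helper above can be exercised without it.
    import boto3

    # Hypothetical values: replace with the real region and instance ID.
    client = boto3.client("ec2", region_name="eu-central-1")
    set_instance_running(client, "i-0123456789abcdef0", running=True)
    # ... run zairachem fit / predict on the instance ...
    set_instance_running(client, "i-0123456789abcdef0", running=False)
```

Two things to keep in mind: a stopped instance still incurs storage charges for its EBS volumes, and its public DNS name usually changes after a stop and start unless an Elastic IP is attached, so the ssh command needs the new address.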
EC2 instance to train ZairaChem models in the cloud and save resources & avoid loadshedding :)