Eco-Flow / Eco-Flow.github.io

Eco-Flow webpage
https://eco-flow.github.io/

Set up Cloud computing #10

Closed SimonDMurray closed 8 months ago

SimonDMurray commented 9 months ago

Discuss with Seirian and Chris how we pay for cloud computing

S3 pricing: S3 Standard is the storage class we want, since our data is frequently accessed (not archival) and we know the access patterns and size, so this is the optimal pricing. The first 50 TB costs $0.024 per GB per month.

These are the details I believe we need:

- Location type: local (we do not need the 5G embedded compute, as the latency difference will not affect us).
- Local zone: any of the European ones is sufficient (Helsinki, Copenhagen, Warsaw, Hamburg).
- Services: Amazon EC2, which provides a choice of instances including Intel, amd64 and arm64. It is for general-purpose computing and HPC applications, and you can mount S3 buckets.

I recommend using AWS Batch to deploy jobs inside our compute environment on the instances we select. It is similar to an HPC cluster, but with instances in place of compute nodes.

Amazon compute environment costing: if we choose a Linux-only instance and pick a t3.xlarge (4 vCPUs and 16 GiB memory), then 1 instance costs:

- On-demand (pays for compute by the second and can be changed at will with no contract; most flexible but most expensive): $170.31 per month
- Compute savings plan (a contract for 1 or 3 years; you can change region, instance and compute environment at will): $98.55 per month (3 years) or $139.43 per month (1 year)
- Instance savings plan (you commit to a family of instances and a region): $73.58 per month (3 years) or $107.31 per month (1 year)

This price is for 1 instance.

I think we should go for a 1 or 3 year instance savings plan. If we pick the t3 instance family then we can use instances with up to 8 vCPUs and 32 GiB RAM. We can decide how many instances we want in advance (8 instances is $588.67 per month, 16 is $1177.34 per month) and then create our own virtual cluster using AWS Batch, with more compute than Myriad.
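To make the plan comparison easier to redo with other instance counts, here is a minimal Python sketch. It uses only the per-instance monthly prices quoted above; the function and dictionary names are made up for illustration. With the rounded $73.58 figure, 8 instances come to $588.64 rather than the $588.67 quoted, so the quoted totals presumably come from an unrounded per-instance price.

```python
# Monthly prices (USD) for one t3.xlarge, as quoted in this thread.
PRICES = {
    "on_demand": 170.31,
    "compute_savings_1y": 139.43,
    "compute_savings_3y": 98.55,
    "instance_savings_1y": 107.31,
    "instance_savings_3y": 73.58,
}


def fleet_monthly_cost(price_per_instance, n_instances):
    """Monthly cost in USD for n identical instances."""
    return round(price_per_instance * n_instances, 2)


def saving_vs_on_demand(plan):
    """Percentage saved by a plan relative to the on-demand price."""
    return round(100 * (1 - PRICES[plan] / PRICES["on_demand"]), 1)


if __name__ == "__main__":
    for plan, price in PRICES.items():
        print(
            f"{plan}: 8 instances = ${fleet_monthly_cost(price, 8)}, "
            f"16 instances = ${fleet_monthly_cost(price, 16)}, "
            f"saving vs on-demand = {saving_vs_on_demand(plan)}%"
        )
```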

Links: Amazon S3 pricing: https://aws.amazon.com/s3/pricing

Amazon EC2 page: https://aws.amazon.com/ec2/

Mounting S3 to EC2: https://aws.amazon.com/blogs/storage/mounting-amazon-s3-to-an-amazon-ec2-instance-using-a-private-connection-to-s3-file-gateway/

AWS Batch: https://docs.aws.amazon.com/batch/latest/userguide/what-is-batch.html

Instance types: https://aws.amazon.com/ec2/instance-types/

SimonDMurray commented 9 months ago

Notes from meeting with AWS: Met with John, who works on the business development team looking after education institutes in the UK. He acts as the first point of contact: he gets an understanding of what we do and puts us in contact with the right people.

AWS has an arrangement with UCL that gives additional discounts (18%-22%): if we set up an account and link it to the main UCL organisation, we will get the additional discount.

There are lots of cost control mechanisms:

- Spot instances, for jobs you don't mind being interrupted, can reduce cost.
- Ensuring the instances are the right size will also reduce the price.
- Spot instances work by you setting a price; if the instance cost goes above that price, the instance gets taken back. Don't run processes that you aren't OK with being interrupted (don't use for testing).
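As a rough illustration of how a price cap on spot instances could be expressed for AWS Batch, the sketch below builds the keyword arguments for boto3's `create_compute_environment` call. This is only a sketch under assumptions: the environment name, subnet, security group and role values are hypothetical placeholders, the 60% cap and vCPU limit are arbitrary, and nothing here reflects what was actually configured for this project. The function only builds a dictionary and does not contact AWS.

```python
def spot_compute_environment_kwargs(
    name,
    subnets,
    security_group_ids,
    instance_role,
    bid_percentage=60,   # assumed cap: pay at most 60% of the on-demand price
    max_vcpus=32,        # assumed ceiling, e.g. 8 x t3.xlarge (4 vCPUs each)
):
    """Build arguments for a managed AWS Batch compute environment on spot."""
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": {
            "type": "SPOT",
            "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
            "minvCpus": 0,               # scale to zero when no jobs are queued
            "maxvCpus": max_vcpus,
            "instanceTypes": ["t3.xlarge"],
            "subnets": list(subnets),
            "securityGroupIds": list(security_group_ids),
            "instanceRole": instance_role,
            "bidPercentage": bid_percentage,
        },
    }


# Hypothetical placeholder values, for illustration only.
kwargs = spot_compute_environment_kwargs(
    name="example-spot-env",
    subnets=["subnet-EXAMPLE"],
    security_group_ids=["sg-EXAMPLE"],
    instance_role="exampleInstanceRole",
)

# To actually create the environment (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("batch").create_compute_environment(**kwargs)
```

Setting `minvCpus` to 0 lets the environment scale down completely when the queue is empty, which fits the right-sizing point above.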

SimonDMurray commented 9 months ago

- $588.67 x 36 = $21,192.12 (total for 3 years of 8 instances with 4 vCPUs + 16 GiB RAM each)
- $21,192.12 x 0.78 = $16,529.85 (after the 22% UCL discount)
- $16,529.85 = £12,959.35 (conversion rate as of 12/01/2024)
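The same arithmetic as a small Python sketch, so both ends of the 18%-22% discount range mentioned in the meeting notes can be checked. The function name is made up for illustration, and the GBP conversion is left out because the exchange rate used is not stated in the thread.

```python
def discounted_total(monthly_usd, months, discount):
    """Return (undiscounted total, discounted total) in USD."""
    total = round(monthly_usd * months, 2)
    return total, round(total * (1 - discount), 2)


if __name__ == "__main__":
    for discount in (0.18, 0.22):
        total, discounted = discounted_total(588.67, 36, discount)
        print(f"{discount:.0%} discount: ${total} -> ${discounted}")
```

With the 22% discount this reproduces the $16,529.85 figure; at 18% the three-year total would be $17,377.54.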

SimonDMurray commented 9 months ago

If we had 1 TB of storage on S3 for 3 years, then:

- $0.023 per GB x 1000 = $23 per month
- $23 x 36 = $828 = £649.15
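A minimal Python sketch of the storage estimate, assuming (as above) that 1 TB is counted as 1000 GB and that the per-GB price stays flat for the whole period. The function name is illustrative. It also shows the total at the $0.024 per GB rate quoted in the first comment, since the two comments use slightly different prices.

```python
def s3_storage_cost(size_gb, price_per_gb_month, months):
    """Total S3 Standard storage cost in USD (storage only, no requests or transfer)."""
    return round(size_gb * price_per_gb_month * months, 2)


if __name__ == "__main__":
    for price in (0.023, 0.024):
        print(f"${price}/GB-month: ${s3_storage_cost(1000, price, 36)} over 3 years")
```

At $0.024 per GB the three-year total would be $864 instead of $828.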

SimonDMurray commented 9 months ago

could be useful: https://github.com/nf-core/awsmegatests/blob/master/.github/workflows/awstest.yml

SimonDMurray commented 9 months ago

Follow-up meeting to set up an AWS test instance: Chris Dye is a solutions architect who acts as a direct line to UCL: https://docs.opendata.aws/genomics-workflows/orchestration/nextflow/nextflow-overview.html

Chris Dye will walk us through the theory and how to set up our instances on the 5th of Feb and then walk us through immersion day stuff

Amazon account ID: 975050293702

SimonDMurray commented 8 months ago

training for PoC: https://catalog.us-east-1.prod.workshops.aws/workshops/8213ad51-878f-493b-8e5a-fbea22c4360c/en-US

SimonDMurray commented 8 months ago

service control policies: https://aws.amazon.com/blogs/industries/best-practices-for-aws-organizations-service-control-policies-in-a-multi-account-environment/

SimonDMurray commented 8 months ago

The initial cloud environment is set up with default instances, an AMI for reproducing that instance, queues set up and configured, a bucket for storing work directory info, and all appropriate IAM roles.

SimonDMurray commented 8 months ago

guide for aws and nextflow: https://staphb.org/resources/2020-04-29-nextflow_batch.html