Closed by phette23 1 year ago
We need two budgets: one for piloting and one for running in production. Use the Google Cloud pricing calculator to estimate four cost categories: compute, egress, storage, and database.
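A rough sketch of how the four cost categories could be combined into a monthly estimate. The names `Rates`, `Usage`, and `monthly_cost` are hypothetical, and the numbers in the example are placeholders, not real GCP prices. Actual unit prices should come from the pricing calculator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Unit prices in USD, to be filled in from the pricing calculator."""
    compute_per_hour: float
    egress_per_gb: float
    storage_per_gb_month: float
    database_per_month: float


@dataclass(frozen=True)
class Usage:
    """Expected monthly usage for one environment (pilot or production)."""
    compute_hours: float
    egress_gb: float
    storage_gb: float


def monthly_cost(rates: Rates, usage: Usage) -> dict:
    """Return the cost per category plus a total, all in USD per month."""
    costs = {
        "compute": rates.compute_per_hour * usage.compute_hours,
        "egress": rates.egress_per_gb * usage.egress_gb,
        "storage": rates.storage_per_gb_month * usage.storage_gb,
        "database": rates.database_per_month,
    }
    costs["total"] = sum(costs.values())
    return costs


# Placeholder values only (NOT real GCP prices or our real usage).
placeholder_rates = Rates(
    compute_per_hour=0.5,
    egress_per_gb=0.25,
    storage_per_gb_month=0.125,
    database_per_month=100.0,
)
placeholder_usage = Usage(compute_hours=720, egress_gb=200, storage_gb=800)

estimate = monthly_cost(placeholder_rates, placeholder_usage)
print(estimate)
```

Running the same function with one `Usage` for the pilot and another for production would give the two budgets side by side.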
What do we need to know?
Caltech's README notes that they use "a m6i.xlarge AWS EC2 instance with Ubuntu 20.04". According to AWS's product details, this instance type has 4 vCPUs and 16 GiB of RAM. Based on their docs, they run the main app, REST API, and Celery worker as services on the same machine.
Northwestern's Galter Health Sciences Library uses three nodes with these resources:
App: 2 vCPU, 8 GB RAM, 180 GB HDD
DB: 2 vCPU, 4 GB RAM, 60 GB HDD
OpenSearch: 2 vCPU, 6 GB RAM, 500 GB HDD
They run all the services (nginx, UI app, REST API, Celery worker, and Redis cache) on the app node. We would use GCP's Cloud SQL instead of the DB node.
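To make the comparison concrete, here is a small sketch that totals Northwestern's node resources with and without the DB node (which Cloud SQL would replace for us). The `Node` class and `totals` helper are hypothetical names used only for this illustration. The figures are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    vcpu: int
    ram_gb: int
    disk_gb: int


# Node sizes reported for Northwestern's Galter Health Sciences Library.
NORTHWESTERN = [
    Node("app", vcpu=2, ram_gb=8, disk_gb=180),
    Node("db", vcpu=2, ram_gb=4, disk_gb=60),
    Node("opensearch", vcpu=2, ram_gb=6, disk_gb=500),
]


def totals(nodes):
    """Sum vCPU, RAM (GB), and disk (GB) across the given nodes."""
    return (
        sum(n.vcpu for n in nodes),
        sum(n.ram_gb for n in nodes),
        sum(n.disk_gb for n in nodes),
    )


# Drop the db node, since Cloud SQL would take its place in our setup.
self_managed = [n for n in NORTHWESTERN if n.name != "db"]

print(totals(NORTHWESTERN))  # (6, 18, 740)
print(totals(self_managed))  # (4, 14, 680)
```

Without the DB node, the remaining footprint of 4 vCPU and 14 GB RAM is in the same range as Caltech's single 4 vCPU, 16 GiB instance, which gives a starting point for sizing the compute line of the budget.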
Host locally or on GCP? Choosing GCP comes with a number of further decisions.
Reuse our ES cluster or start a new one? Invenio will deprecate ES, so we have to use OpenSearch and cannot reuse our ES cluster.
See also #2 cloud storage research.