I think it may be a good idea to start with option 3 and deploy it in a new instance. This will give us the opportunity to deploy, configure, and maybe reconfigure the instance and the apps without affecting anything else.
@khaledk2 coming back to this, a few outstanding questions:

- based on your latest investigation of indexing, what would you recommend for the compute capacity of a standalone searchengine VM? 16 VCPUs/64GB RAM like omeroreadwrite, or 8 VCPUs/32GB RAM like omeroreadonly?

It would be good to have a VM like pilot-idr0000-omeroreadwrite (16 VCPUs/64GB RAM).

- what should be the typical size of the underlying data volume? And should this volume follow the same snapshotting/cloning lifecycle as the DB/binary repository/nginx cache?

A data volume of 50 to 100 GB should be fine (preferably SSD). Yes, I think following the same snapshotting/cloning lifecycle should be fine; I will test restoring the Elasticsearch indices from a disk copy (see the sketch below).

- are we happy recreating an idr-testing deployment from scratch with the initial set of choices? @will-moore

Yes, I think it should be fine.
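For illustration, here is a sketch of how the proposed VM and its data volume could be captured as inventory/group variables. All names and variable keys below are hypothetical, not the actual deployment variables:

```yaml
# Hypothetical sketch of the standalone searchengine VM and its data volume
searchengine:
  hosts:
    idr-searchengine:
      # same flavour as pilot-idr0000-omeroreadwrite
      flavor_vcpus: 16
      flavor_ram_gb: 64
      data_volume:
        size_gb: 100      # 50-100 GB, preferably SSD-backed
        snapshot: true    # reuse the DB/binary-repository/nginx-cache snapshot/clone lifecycle
```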
The deployment of a first version of the search engine stack is a target of the upcoming prod107 release. #359 introduces the playbook allowing the stack to be deployed, while #367 contains the logic to define a new group of servers where the service should be deployed. While #367 focuses on deploying the service in the simple context of a pilot VM where all the services are colocated on a single node, for the scope of prod107 we will need to decide how to deploy it in the multi-node architecture used for production deployments. The current set of instances (and their relationships) created for each deployment can be loosely summarized by:
idr-database -> idr-omeroreadonly-1, idr-omeroreadonly-2, idr-omeroreadonly-3, idr-omeroreadonly-4, idr-omeroreadwrite -> idr-proxy
idr-database, idr-omeroreadonly-1, idr-omeroreadonly-2, idr-omeroreadonly-3, idr-omeroreadonly-4, idr-omeroreadwrite -> idr-management
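Expressed as Ansible-style inventory groups (the group names here are illustrative, not the ones used by the playbooks), the same relationships look roughly like this:

```yaml
# Illustrative grouping only: the real group names/vars live in the deployment playbooks
all:
  children:
    database:
      hosts:
        idr-database:
    omero:
      hosts:
        idr-omeroreadwrite:
        idr-omeroreadonly-1:
        idr-omeroreadonly-2:
        idr-omeroreadonly-3:
        idr-omeroreadonly-4:
    proxy:
      hosts:
        idr-proxy:        # fronts the omero nodes
    management:
      hosts:
        idr-management:   # monitors all of the above
```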
Listing the various architectures available
Option 1: deploy the app in the management instance

Pros: it benefits from the Docker prerequisites already being installed in the management instance (currently used for monitoring); it is the strategy originally used for #359 and currently deployed on test104.
Cons: the compute capacity of the management VM is limited, especially for a full indexing; consuming the search endpoint from the omero nodes would require going through the proxy.
Option 2: deploy the app in the omeroreadwrite instance

Pros: this is a more scaled version of option 1, using the fact that the omeroreadwrite server has a larger compute capacity. Additionally, it makes use of the capacity of omeroreadwrite, which is mostly unused once the deployment has moved to production (except for minor DB updates like adding DOIs/publications).
Cons: same as above, consuming the search endpoints from the omeroreadonly nodes currently requires going through the proxy without additional nginx configuration.
Option 3: deploy the app in a new searchengine instance

Pros: allows the compute/storage capacity of the instance to be tailored to the exact needs of the app; allows the various omero instances to access the searchengine service in the same way as the database is accessed (see the sketch below).
Cons: requires one more instance to be created per production deployment, which probably needs to be reviewed against the global tenancy capacity.
Option 4: deploy the app across all omero instances

Pros: for indexing, this would keep the benefit of option 2 and use the compute capacity of omeroreadwrite; if we are thinking of integrating with omero-web or idr-gallery, it colocates the services and simplifies that integration. It also starts scaling the service in the same way as the OMERO.web servers.
Cons: probably requires additional thought on how to distribute the data, especially the Elasticsearch database, possibly moving towards an Elasticsearch cluster (option 4a) or moving Elasticsearch to yet another instance (option 4b). A rough sketch of what a cluster spread across the omero nodes might involve is given below.
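For option 4a, a minimal sketch of the per-node elasticsearch.yml that a cluster spread across the omero instances would need (host names reused from above; cluster/node names are illustrative and the settings would need tuning):

```yaml
# Sketch of elasticsearch.yml for one node of a cluster spanning the omero instances
cluster.name: idr-searchengine
node.name: idr-omeroreadwrite          # one entry per omero node
network.host: 0.0.0.0
discovery.seed_hosts:
  - idr-omeroreadwrite
  - idr-omeroreadonly-1
  - idr-omeroreadonly-2
  - idr-omeroreadonly-3
  - idr-omeroreadonly-4
cluster.initial_master_nodes:
  - idr-omeroreadwrite
```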