IHTSDO / snowstorm

Scalable SNOMED CT Terminology Server using Elasticsearch

HOW TO RUN SNOWSTORM IN ECS CLUSTER? #436

Open Nareshsam95 opened 2 years ago

Nareshsam95 commented 2 years ago

Can anyone tell me how to run Snowstorm in an ECS cluster? I have tried many approaches, but my containers are not starting in the ECS cluster.

Thanks in advance.

kaicode commented 2 years ago

Which approach are you trying, Fargate or EC2 instances? An EC2 instance may be easier to debug. You could try starting one manually and connecting to it to check the logs. Ensure there is at least 8G of memory available to the container.

Community input welcome on this one!
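A quick way to sanity-check the images outside ECS, as suggested above, is to run Elasticsearch 7.x and Snowstorm manually with Docker on an EC2 instance. This is only a sketch: the image tags, heap sizes, and `host.docker.internal` hostname are assumptions that should be adapted to your host (on plain Linux Docker you may need `--add-host` or a shared network instead).

```shell
# Start a single-node Elasticsearch 7.x (version tag is an example)
docker run -d --name es \
  -p 9200:9200 \
  -e discovery.type=single-node \
  -e ES_JAVA_OPTS="-Xms4g -Xmx4g" \
  docker.elastic.co/elasticsearch/elasticsearch:7.17.10

# Wait until the Elasticsearch API answers before starting Snowstorm
curl --retry 10 --retry-connrefused http://localhost:9200

# Then start Snowstorm pointing at it (tag and hostname are assumptions)
docker run -d --name snowstorm \
  -p 8080:8080 \
  snomedinternational/snowstorm:7.9.3 \
  --elasticsearch.urls=http://host.docker.internal:9200
```

If either container exits immediately, `docker logs es` or `docker logs snowstorm` will usually show the reason, which is much harder to see inside Fargate.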

Nareshsam95 commented 2 years ago

i'm running in FARGATE

kaicode commented 2 years ago

I would start by getting Elasticsearch 7.x running in Fargate. There are a few examples when I search, but they use Elasticsearch 8.x, which is not compatible with Snowstorm.

Once you have Elasticsearch 7.x running, you will need to make sure Snowstorm starts only after the Elasticsearch API is up (usually on http://localhost:9200), otherwise Snowstorm startup will fail.

If you have any specific issues feel free to post the logs here and we will attempt to help you debug.
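One way to enforce that start-up ordering inside a single ECS task is a container `healthCheck` on Elasticsearch plus `dependsOn` on the Snowstorm container. A minimal sketch of the relevant task-definition fragment; container names, image tags, and timing values here are assumptions, not a tested configuration:

```json
{
  "containerDefinitions": [
    {
      "name": "elasticsearch",
      "image": "docker.elastic.co/elasticsearch/elasticsearch:7.17.10",
      "essential": true,
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -sf http://localhost:9200 || exit 1"],
        "interval": 30,
        "retries": 5,
        "startPeriod": 60
      }
    },
    {
      "name": "snowstorm",
      "image": "snomedinternational/snowstorm:7.9.3",
      "essential": true,
      "dependsOn": [
        { "containerName": "elasticsearch", "condition": "HEALTHY" }
      ]
    }
  ]
}
```

With `condition: HEALTHY`, ECS will not start the Snowstorm container until the Elasticsearch health check has passed.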

tia-schung commented 2 years ago

We're trying to figure this out right now too. One big learning: Amazon adopted OpenSearch, but the OpenSearch service can still spin up an Elasticsearch cluster for you. Under "Deployment type", enable "Include older versions" and you'll be able to choose ES 7.x.

*(screenshot: OpenSearch "Deployment type" panel with the "Include older versions" option enabled)*

Nareshsam95 commented 2 years ago

> I would start by getting Elasticsearch 7.x running in Fargate. […]

I have tried to run Elasticsearch in my ECS Fargate cluster, but the Elasticsearch container goes straight into an exited state. The Elasticsearch container is not running.

kaicode commented 2 years ago

Are you able to capture any logs from Elasticsearch? It may be a disk space issue which could be fixed by configuration changes.
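If the logs do point at disk watermarks (Elasticsearch stops allocating shards when the data volume fills up, which happens quickly on a small container volume), the relevant settings can be passed as environment variables on the Elasticsearch container. A sketch of the container-definition fragment; the values are illustrative, and disabling the disk threshold is only sensible for testing:

```json
"environment": [
  { "name": "discovery.type", "value": "single-node" },
  { "name": "cluster.routing.allocation.disk.threshold_enabled", "value": "false" },
  { "name": "ES_JAVA_OPTS", "value": "-Xms4g -Xmx4g" }
]
```

The official Elasticsearch Docker image maps environment variables containing dots onto the corresponding `elasticsearch.yml` settings, so no custom config file is needed.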

Nareshsam95 commented 2 years ago

Yes, I have seen the logs. VM memory issues came up, but I don't know how to add memory in the Task Definition.
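For Fargate, memory is set at the task level (and optionally per container) in the task definition. The fragment below is a sketch of the relevant fields, using the 8 GB figure suggested earlier in this thread; the CPU value and per-container split are assumptions, and Fargate only accepts specific cpu/memory combinations:

```json
{
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "2048",
  "memory": "8192",
  "containerDefinitions": [
    { "name": "elasticsearch", "memory": 6144 },
    { "name": "snowstorm", "memory": 2048 }
  ]
}
```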

tia-schung commented 1 year ago

We didn't want to deal with running Elasticsearch ourselves, so we are running Snowstorm in an ECS container talking to an AWS OpenSearch instance running Elasticsearch 7.7. Here's the Dockerfile we're using:

```dockerfile
FROM snomedinternational/snowstorm:7.9.3

USER snowstorm

ARG PORT=${Port}

EXPOSE $PORT

# Shell form is used so that ${OPENSEARCH_URL} and ${OPENSEARCH_PASSWORD}
# are expanded from the container environment at runtime; the JSON (exec)
# form of ENTRYPOINT does not perform variable substitution.
ENTRYPOINT java -Xms2g -Xmx3g -jar snowstorm.jar \
    --elasticsearch.urls=${OPENSEARCH_URL} \
    --elasticsearch.username=snowstorm \
    --elasticsearch.password=${OPENSEARCH_PASSWORD}
```

The Snowstorm docs say you need at least 2g of memory, but we had to bump it up to 3g to load data into Elasticsearch. That's why the max heap size (`-Xmx`) is 3g rather than the documented 2g.

Once that's running, you can load the data into Elasticsearch by putting the SNOMED data files in S3, opening a shell on the container with:

```
aws ecs execute-command --region <your-region> --cluster <your-cluster> --task <task-id> --container <your-container> --command "/bin/sh" --interactive
```

and running `wget <s3 url>` to copy the file onto the container. Then, while still in the shell, you can run all the curl commands in the Snowstorm documentation.
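The curl sequence from the Snowstorm data-loading docs boils down to creating an import job, uploading the RF2 archive, and polling the job status. A hedged sketch, assuming Snowstorm listens on localhost:8080 inside the container; the snapshot filename is purely illustrative and `<import-id>` comes from the Location header of the first response:

```shell
# 1. Create an import job (the response Location header contains the import id)
curl -i -X POST http://localhost:8080/imports \
  -H 'Content-Type: application/json' \
  -d '{"branchPath": "MAIN", "type": "SNAPSHOT"}'

# 2. Upload the RF2 snapshot archive to the import job
#    (filename is an example; replace <import-id> with the id from step 1)
curl -X POST "http://localhost:8080/imports/<import-id>/archive" \
  -F file=@SnomedCT_InternationalRF2_PRODUCTION.zip

# 3. Poll the import status until it reports COMPLETED
curl http://localhost:8080/imports/<import-id>
```

The import can take an hour or more for a full international snapshot, so it is worth keeping the `execute-command` shell open and polling periodically.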

matiasict commented 10 months ago

> We didn't want to deal with running Elasticsearch ourselves, so we are running snowstorm in an ECS container talking to an AWS OpenSearch instance running Elasticsearch 7.7. […]

What type of Elasticsearch instance are you using?