ude-soco / CourseMapper-webserver

A collaborative course annotation and analytics platform
https://coursemapper.de
MIT License

Live infrastructure resource scaling #1239

Open rawaa123 opened 1 week ago

rawaa123 commented 1 week ago

We are seeing this error in Argo for coursemapper-kg-concept-map:

Back-off restarting failed container worker in pod sc-coursemapper-webserver-coursemapper-kg-concept-map-86d9nx6m6_cmw-prod(6cfe6c2e-c97c-47c0-bb74-c31914511385)

Also, I am getting these errors. Is this normal?

image

ralf-berger commented 1 week ago

I'm not seeing a lot of restarts, was this a temporary thing? Might be memory issues, the server is pretty much at the limit.
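One way to check the memory theory is to look at why the container was last terminated. The sketch below is only an illustration: it assumes the pod object has been exported with `kubectl get pod <pod-name> -o json` (with `cmw-prod` as the namespace, as the pod reference in the error suggests), and the helper name `oom_killed_containers` is made up for this example.

```python
import json


def oom_killed_containers(pod: dict) -> dict[str, int]:
    """Map container name -> restart count for containers whose
    last termination reason was OOMKilled.

    `pod` is the parsed JSON of a single Kubernetes Pod object.
    """
    result = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is only present if the container has
        # been terminated at least once before its current run.
        terminated = status.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            result[status["name"]] = status.get("restartCount", 0)
    return result


def oom_killed_from_json(text: str) -> dict[str, int]:
    """Convenience wrapper that takes the raw JSON text of the pod."""
    return oom_killed_containers(json.loads(text))
```

If the `worker` container shows up in the result, the back-off restarts are most likely caused by the memory limit rather than by an application bug. An empty result would point elsewhere, for example at a crash during startup.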

rawaa123 commented 1 week ago

We are not able to generate the KG. We get the following error in both edge and live. Do you have any suggestions?

image

ralf-berger commented 1 week ago

I think the actual meaning of the error message might be: "Failed to query xyz from DBpedia endpoint, HTTP status 502"
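If the 502 really comes from the DBpedia endpoint, it is probably transient, and retrying with a backoff on the client side may be enough. The following is a minimal sketch of that idea, not the way the KG worker actually queries DBpedia: `fetch`, `fetch_with_retry`, and all parameters are hypothetical names for this example, and the endpoint URL is left to the caller.

```python
import time
import urllib.error
import urllib.request

# Gateway-style errors that are usually worth retrying.
TRANSIENT_STATUS = {502, 503, 504}


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Plain HTTP GET using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retry(url, fetch_fn=fetch, attempts=4, base_delay=1.0,
                     sleep=time.sleep):
    """Call fetch_fn(url), retrying transient HTTP errors.

    The delay doubles after each failed attempt (1s, 2s, 4s, ...).
    Non-transient errors and the final failure are re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch_fn(url)
        except urllib.error.HTTPError as err:
            last_attempt = attempt == attempts - 1
            if err.code not in TRANSIENT_STATUS or last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `fetch_fn` and `sleep` as parameters keeps the retry logic testable without network access. If the endpoint stays unavailable for longer periods, retries alone will not help, and the error message shown to users should at least name the failing upstream service.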

rawaa123 commented 1 week ago

Regarding memory issues, if we want to upgrade the resources, what would be the required capacity for smooth operation, and how much would the upgrade cost?

ralf-berger commented 6 days ago

> if we want to upgrade the resources, what would be the required capacity for smooth operation, and how much would the upgrade cost?

The current single VM instance on a shared host is 32,40 €/mo (+ storage, IP, …) and provides 32 GB of RAM (which is pretty much used up constantly). I'd usually scale the machine up to the next size, but unfortunately this is already the limit for shared-CPU instances. The next available instance size would be 64 GB with dedicated vCPUs, which is 95,99 €/mo.

At this price point it might make more sense to either invest in a dedicated server (> 50 €/mo, longer contract runtime, no scaling at all, including storage), set up a second single-machine cluster (complicated management, would require applications to be balanced across environments manually) or scale the cluster up to multiple instances (would require upfront work to set up networking, load balancing, etc. and result in slower storage due to required distributed storage system).
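To put the two VM prices quoted above side by side, here is a small back-of-the-envelope calculation. It only uses the base instance prices from this thread and ignores storage, IP, and other extras; the option labels and the helper name `eur_per_gb` are made up for this sketch.

```python
from decimal import Decimal

# Base instance prices in EUR per month and RAM in GB, as quoted above.
# Storage, IP, and other extras are not included.
OPTIONS = {
    "current shared-CPU VM": (Decimal("32.40"), 32),
    "next size, dedicated vCPUs": (Decimal("95.99"), 64),
}


def eur_per_gb(price: Decimal, ram_gb: int) -> Decimal:
    """Monthly price per GB of RAM, rounded to cents."""
    return (price / ram_gb).quantize(Decimal("0.01"))


current_price, _ = OPTIONS["current shared-CPU VM"]
next_price, _ = OPTIONS["next size, dedicated vCPUs"]

# Additional cost per year when moving to the larger instance.
extra_per_year = (next_price - current_price) * 12

for name, (price, ram_gb) in OPTIONS.items():
    print(f"{name}: {price} EUR/mo, {eur_per_gb(price, ram_gb)} EUR per GB")
print(f"extra cost per year: {extra_per_year} EUR")
```

The larger instance doubles the RAM but roughly triples the base price, which is why the alternatives listed above become worth considering at this point.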

rawaa123 commented 3 days ago

Thank you for the options. Which option would you suggest, given the current state and the growing need for resources as CourseMapper keeps scaling up? In the next release we might need to host another instance of neo4j for features from the MOOCentral project that are being combined into CourseMapper.

ralf-berger commented 1 day ago

What's the timeframe for adding additional services? What about the other apps (elas, rima, openlap, …), will there be any changes in the foreseeable future? Is there a limited budget? Are you trying to introduce redundancy (which would require stateless services), or would it be okay to run everything on a single physical host machine that might fail?

rawaa123 commented 1 day ago

@ralf-berger

> What about the other apps (elas, rima, openlap, …), will there be any changes in the foreseeable future?

There will be updates in all projects, but CourseMapper will have the most intensive ones.

> Is there a limited budget?

No, we don't have a hard limit, within reason.

> invest in a dedicated server (> 50 €/mo, longer contract runtime, no scaling at all, including storage),

In your previous message you mentioned getting a dedicated server, but with no scaling at all. What does this mean, please?

Currently, we cannot access Argo, edge, or live, and I think this is because of the lack of resources. If this is the case, then we need to proceed with upgrading the resources immediately.

image

image