memgraph / helm-charts

Helm charts for deploying Memgraph, an open-source in-memory graph database.
https://memgraph.github.io/helm-charts/
Apache License 2.0

[Bug]: MemGraph Volume Error for replicas > 1 #11

Closed ShashwatPK closed 4 months ago

ShashwatPK commented 1 year ago

Contact Details

shashwatpathak100@gmail.com

What happened?

When the replicas value is set to greater than 1, the number of pods increases accordingly, but the number of PVCs stays the same, and each PVC uses the ReadWriteOnce (RWO) access mode. As a result, the scaled-up pods stay in the ContainerCreating state with the following events:

Events:
  Type     Reason              Age                  From                     Message
  ----     ------              ----                 ----                     -------
  Normal   Scheduled           9m52s                default-scheduler        Successfully assigned default/memgraph-1 to ip-10-182-45-106.ap-south-1.compute.internal
  Warning  FailedAttachVolume  9m52s                attachdetach-controller  Multi-Attach error for volume "pvc-e48cea1d-c7f7-4a02-8e1b-289d27742ec4" Volume is already used by pod(s) memgraph-0
  Warning  FailedAttachVolume  9m52s                attachdetach-controller  Multi-Attach error for volume "pvc-b81c3549-0ec9-4a33-b16f-bcdcef4e82d8" Volume is already used by pod(s) memgraph-0
  Warning  FailedMount         3m17s                kubelet                  Unable to attach or mount volumes: unmounted volumes=[memgraph-log-storage memgraph-lib-storage], unattached volumes=[memgraph-log-storage kube-api-access-ql8j4 memgraph-lib-storage]: timed out waiting for the condition
  Warning  FailedMount         63s (x3 over 7m49s)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[memgraph-lib-storage memgraph-log-storage], unattached volumes=[memgraph-lib-storage memgraph-log-storage kube-api-access-ql8j4]: timed out waiting for the condition

What is the solution here?
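
For readers hitting the same Multi-Attach error: the usual Kubernetes pattern for giving each replica its own volume is a StatefulSet with `volumeClaimTemplates`, so that every pod gets a dedicated ReadWriteOnce claim instead of all pods sharing one. The manifest below is a minimal sketch of that pattern, not the chart's actual template. The volume names come from the events above; the resource name, labels, image tag, mount paths, and storage sizes are assumptions chosen for illustration.

```yaml
# Hypothetical sketch: one PVC per pod via volumeClaimTemplates.
# Not taken from the Memgraph chart; names and sizes are illustrative.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: memgraph
spec:
  serviceName: memgraph          # assumes a headless Service with this name exists
  replicas: 2
  selector:
    matchLabels:
      app: memgraph
  template:
    metadata:
      labels:
        app: memgraph
    spec:
      containers:
        - name: memgraph
          image: memgraph/memgraph   # assumed image; pin a tag in practice
          ports:
            - containerPort: 7687    # Bolt port
          volumeMounts:
            - name: memgraph-lib-storage
              mountPath: /var/lib/memgraph   # assumed data directory
            - name: memgraph-log-storage
              mountPath: /var/log/memgraph   # assumed log directory
  # Each template below yields a separate claim per pod,
  # e.g. memgraph-lib-storage-memgraph-0 and memgraph-lib-storage-memgraph-1.
  volumeClaimTemplates:
    - metadata:
        name: memgraph-lib-storage
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi             # illustrative size
    - metadata:
        name: memgraph-log-storage
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi             # illustrative size
```

Note that this only resolves the volume conflict. Each pod would then be an independent Memgraph instance with its own data; replication between the instances would still have to be configured separately.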

Chart type

Standalone

Chart version

0.1.0

Environment

Amazon Web Services

Relevant log output

No response

antejavor commented 1 year ago

Hi @ShashwatPK, thanks for reporting this. This issue will be fixed in the upcoming release. I have added the issue to the next milestone.

Givemeurcookies commented 6 months ago

Any ETA on this? I see there's already a fix that's been stale for a while due to needing a review.

katarinasupe commented 6 months ago

Hi @Givemeurcookies, there haven't been a lot of users struggling with this, and that's the reason why we haven't put a lot of time into it. Ideally, we would merge the PR as soon as possible, but I remember something wasn't working for us in the initial test, so we concluded that the PR would need more effort than expected. Is this a blocker for you and your project? Are you using the Memgraph Helm chart?

Givemeurcookies commented 6 months ago

@katarinasupe Yes, we're using the Memgraph helm chart. It's not a blocker currently as we're only checking out Memgraph for a pilot, but we will eventually require a production ready HA environment if we are to use Memgraph. Being able to test a multi-instance Memgraph setup could allow us to iron out any issues before it becomes critical.

Do you remember what the issues were? If we get some spare time in the future, we might have a crack at it if it's not fixed by then.

antejavor commented 6 months ago

Hi @Givemeurcookies, thanks for the feedback. Initially, we had some issues getting this to start correctly; I guess the first version didn't work, so it was left hanging.

But the real problem is that we envisioned a separate replication Helm chart (supporting multiple replicas by default), whereas this chart deploys just a single Memgraph pod. With the replication chart, replication would be fully set up out of the box, with a proper ingress service for write/read workloads.

Hence, supporting multiple replicas in this chart didn't look like a priority, both because of the setup you would still need to do manually for HA and because of general priorities, as @katarinasupe said. But I agree that fixing this would make the chart much more adaptable.

We have also been changing and improving HA a lot over the last few months, so it didn't make sense to create the chart earlier while those changes were still coming.

Just for context, we host engineering-driven office hours. It would be great to find out what your requirements for HA and Kubernetes are; if you'd like, we could jump on a short call and discuss this.