Clarification on DuckDB and multiple instances

lalabuy948 / PhoenixAnalytics

📊 Plug and play analytics for Phoenix applications.

https://theindiestack.com/analytics

Apache License 2.0


Clarification on DuckDB and multiple instances #9

Closed phihos closed 1 month ago

phihos commented 2 months ago

Hi,

Your project looks very interesting, and I wonder whether I can use it in my Kubernetes-based multi-instance setup. After reading the docs, though, I am a bit confused. Quoting from the README:

For whom this library

  [X] Single instance Phoenix app
  [X] Multiple instances of Phoenix app without auto scaling group
  [ ] Multiple instances of Phoenix app with auto scaling group

There is a plan to build a separate backend to be powered by ClickHouse in order to track requests across multiple nodes in orchestrated scenarios.

My naive approach would be to create a shared network volume, attach it to all of my app containers, and save the DuckDB file on it. If DuckDB is comparable to SQLite, then a lock should prevent concurrent writers from breaking the database. With that approach even auto-scaling instances should work. What am I missing here?

lalabuy948 commented 2 months ago

Hi @phihos,

For now your scenario is not supported: if you give the same file to all instances, my current implementation would write X multiplied by the number of nodes. DuckDB also doesn't support concurrent writes; it is optimised for big single writes.

What is supported at the moment is a setup with X servers and a load balancer on top of them, without a dynamic/autoscaling group.

For your scenario I have some discoveries noted here. For auto scaling you would ideally need a shared file, true, and that's what I will try to achieve with S3/Parquet.

I hope to get some free time soon to cover your scenario.

phihos commented 2 months ago

Thank you very much for the explanation. Currently I do have a fixed replica count so my setup should be covered right now. How do I configure this correctly? Does every instance have its own DuckDB file?

lalabuy948 commented 2 months ago

A fixed replica count is supported; each instance is going to have its own DuckDB file. Just provide the DuckDB path in the configuration and, I guess, add a volume to your containers for that path.

Let me know if you have any difficulties!
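
The "one DuckDB file per instance" layout described above can be sketched as follows. This is only an illustration under stated assumptions, not PhoenixAnalytics code: the helper `duckdb_path_for_instance`, the `analytics-<name>.duckdb` naming scheme, and the `/data/analytics` mount point are all hypothetical. It also assumes each replica has a stable host name (as pods in a Kubernetes StatefulSet do), so that a restarted replica finds its own file again.

```python
import socket
from pathlib import Path


def duckdb_path_for_instance(base_dir: str, instance_name: str) -> Path:
    """Return a DuckDB file path that is unique to one app instance.

    Hypothetical helper: the naming scheme is an assumption for illustration.
    """
    if not instance_name:
        raise ValueError("instance_name must not be empty")
    # Replace anything that is not alphanumeric, '-' or '_' so the
    # instance name is safe to embed in a file name.
    safe_name = "".join(
        c if c.isalnum() or c in "-_" else "_" for c in instance_name
    )
    return Path(base_dir) / f"analytics-{safe_name}.duckdb"


if __name__ == "__main__":
    # /data/analytics stands in for wherever the volume is mounted (assumed).
    # The host name is used as the instance identifier.
    print(duckdb_path_for_instance("/data/analytics", socket.gethostname()))
```

The resulting path could then be handed to the application as its DuckDB path, for example through an environment variable read in the configuration. Because every replica derives a different file name, no two instances ever write to the same DuckDB file, even if they share one volume.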

lalabuy948 commented 1 month ago

Hi @phihos, I added support for a Postgres backend in the 0.2.0 release; let me know if you face any difficulties. For now I'm going to close this issue.