citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.39k stars 661 forks

Transition to citusdata #598

Closed rimusz closed 8 years ago

rimusz commented 8 years ago

hey there,

How easy would it be to transition from postgres to citusdata, e.g. for https://www.odoo.com?

mtuncer commented 8 years ago

Hey @rimusz

Citus is not an entirely different product; it is an extension that runs on top of your existing PostgreSQL.

Since you are already a PostgreSQL user, transitioning should be much smoother. Deciding how to use Citus mostly depends on what type of queries you want to run. Then you decide on a column to distribute the data on and a distribution method (hash or append).

If you could send an email to our public forum with this information, we would be happy to assist you with the transition.
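To make the idea of a distribution column and hash distribution more concrete, here is a minimal Python sketch. It is only an illustration: the shard count, the `shard_for` and `distribute` helpers, and the use of CRC32 are assumptions made for this example, and they are not how Citus hashes or places rows internally.

```python
import zlib
from collections import defaultdict

SHARD_COUNT = 4  # hypothetical value, chosen only for this example


def shard_for(value, shard_count=SHARD_COUNT):
    """Map a distribution-column value to a shard index.

    CRC32 is a stand-in hash for illustration, not the hash Citus uses.
    """
    digest = zlib.crc32(str(value).encode("utf-8"))
    return digest % shard_count


def distribute(rows, column, shard_count=SHARD_COUNT):
    """Group rows into shards by hashing the chosen distribution column."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row[column], shard_count)].append(row)
    return dict(shards)


# Hypothetical rows; "customer_id" plays the role of the distribution column.
rows = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 2, "amount": 25},
    {"customer_id": 1, "amount": 5},
]
shards = distribute(rows, "customer_id")
```

The point of the sketch is that every row with the same distribution-column value ends up in the same shard, which is why the choice of column depends so much on the queries you plan to run.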

ozgune commented 8 years ago

@rimusz -- I just wanted to add another note here.

We think of using Citus across two dimensions: (a) your workload -- are you looking to create analytical dashboards, power your website's shopping cart application, etc., and (b) how you are planning to scale your database -- what is the data model you had in mind and what tooling were you looking to use? Do you have a single table or dozens of tables?

Could you provide us a bit more info across these dimensions?

@begriffs from our team is working on adding more clarification to our documentation on this. We're also tracking his work in https://github.com/citusdata/citus_docs/issues/34

rimusz commented 8 years ago

@mtuncer @ozgune I'm just wondering whether citusdata can be an easy swap for postgres for apps that are not custom-made, e.g. https://www.odoo.com?

The docs are not very clear on that. Do I need to update every table, or just make the database a distributed one?

Thanks

ozgune commented 8 years ago

@rimusz -- you'll probably need to think about your data model and the tables you'd like to distribute. Could you elaborate a bit more on your use-case? For example, are you looking to serve a website or run real-time analytics? How many tables do you have in your database (large / small)?

We're making improvements to our website, and I'm copy/pasting the related section. Does this help at all?


When to use Citus

Citus horizontally scales PostgreSQL across multiple machines using sharding and replication. It works well when you have a large data set and when you want to get answers from that data in human real-time – typically in less than a second.

Example use cases:

For concrete examples check out our customer use cases. Typical Citus workloads are operational, with aggregate queries and no long-lived transactions.

Considerations for Use

Although Citus extends PostgreSQL with distributed functionality, it is not a drop-in replacement that scales out all workloads. A performant Citus cluster involves thinking about the data model, tooling, and choice of SQL features used.

Data models that have fewer tables (<10) work much better than those that have hundreds of tables. This is a property of distributed systems: the more tables, the more distributed dependencies. Still, compared with NoSQL databases Citus does not require aggressive denormalization.

Citus supports most PostgreSQL tools. Still, users may need to take additional steps when using tools that require distributed execution. For example, pg_dump and pg_restore currently don’t take distributed backups, but there are scripted ways to make these tools work on individual PostgreSQL nodes.

PostgreSQL provides thousands of features, and Citus doesn’t yet scale them all. A good way to think about feature coverage is the following: if your workload aligns with use-cases noted in the “When to use Citus” section and you happen to run into an unsupported query, then there’s usually a good workaround.

When Citus is Inappropriate

Workloads which require a large (non-aggregated) flow of information between nodes generally do not work as well. For instance:

These constraints come from the fact that we operate across many nodes (as compared to a single node database), giving you easy horizontal scaling as well as high availability.
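As a rough illustration of why a large non-aggregated flow between nodes is costly, the Python sketch below compares how many rows a coordinating node would receive when each shard pre-aggregates its data versus when the shards ship raw rows. This is a simplified model under stated assumptions: the shard contents and both function names are hypothetical, and the code does not represent the actual Citus planner or executor.

```python
def average_with_partial_aggregates(shards):
    """Each shard sends one (sum, count) row; the coordinator combines them."""
    partials = [(sum(values), len(values)) for values in shards]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    average = total / count if count else None
    return average, len(partials)  # rows sent = one per shard


def average_with_raw_rows(shards):
    """Each shard sends every row; the coordinator aggregates everything."""
    all_values = [v for values in shards for v in values]
    average = sum(all_values) / len(all_values) if all_values else None
    return average, len(all_values)  # rows sent = every row in every shard


# Hypothetical data: three shards holding a numeric column.
shards = [[10, 20, 30], [40, 50], [60, 70, 80, 90]]

avg_partial, sent_partial = average_with_partial_aggregates(shards)
avg_raw, sent_raw = average_with_raw_rows(shards)
```

Both approaches compute the same average, but the first moves one row per shard while the second moves every row, and that gap grows with the size of the data set.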

rimusz commented 8 years ago

@ozgune what I'm asking is how easy it would be to migrate off-the-shelf apps without having to change their code; odoo.com is a very good example of such an app using postgres. Each new update usually brings code changes and some database schema changes, and altering the db/tables every time is not really ideal.

ozgune commented 8 years ago

@rimusz -- I don't know enough about odoo.com to conclusively answer your question. My guess is that you'd currently need to make some changes to the underlying data model, particularly if it uses 10+ tables. That said, it's hard to say without looking deeper.

saicitus commented 8 years ago

@rimusz: I am closing this GitHub issue. Feel free to post any questions about how Citus could handle your use case on our public channels: the Citus Slack channel and the citus-users Google group.

cecofreeman commented 2 years ago

Hello, I am wondering whether anybody has managed to migrate Odoo to citusdata. Any help is appreciated! Thank you in advance.