nicoburns / blessed-rs

A community guide to the Rust ecosystem
https://blessed.rs
1.33k stars 77 forks source link

Fix diesel entry + add diesel-async #58

Closed weiznich closed 1 year ago

weiznich commented 1 year ago

This PR fixes several serve issues with the diesel entry. I consider the old entry as factually wrong and misleading, especially the unclear language around performance compared to other database entries. For tokio-postgres the wording of performance compared to sqlx is much more concrete than the diesel version, even if this sentence is likely based on the same numbers. Also the entry sounds like being sync-only is a large problem, which to repeat me again is not true (or at least not more an issue than being async only!) Please correct this factual wrong statements as soon as possible.

In addition it adds diesel-oci as potential option for connecting to oracle databases. Finally it removes the Recommend attribute for Sea-ORM as given the limited extensibility, much worser performance and pre 1.0 state of the crate seems to be misleading. (Please do not understand that as critic at Sea-ORM. It's just that both diesel and sea-orm are designed for different use cases and you cannot recommend one or the other without knowing much about the exact use-case.

Fix #56 Fix #57

djc commented 1 year ago

Given the focus on performance in your PR, do you have references to benchmarks to back these claims up? I don't think version numbers pre-1.0 are a very useful indicator of quality in the crates.io ecosystem, so I'm not sure that's a very good reason.

Personally it seems sane to me to recommend async libraries for things that are I/O-bound, which seems like it definitely applies to database connections. Do you disagree, and if so, why?

weiznich commented 1 year ago

Given the focus on performance in your PR, do you have references to benchmarks to back these claims up?

Performance numbers: https://github.com/diesel-rs/metrics/ For some of the benchmarks the difference between sea-orm/sqlx and diesel is roughly factor 10, which is not something I would consider as insignificant. You can find the benchmarks generating these numbers here: https://github.com/diesel-rs/diesel/tree/master/diesel_bench I think the techempower benchmarks might contain some relevant numbers as well, but they are harder to judge as implementations there differ by more than the used database connection/ORM implementation.

I don't think version numbers pre-1.0 are a very useful indicator of quality in the crates.io ecosystem, so I'm not sure that's a very good reason.

I agree that normally version numbers are not a useful indicator. In this specific case it's different in my opinion. Diesel does provide long standing stability guarantees, while sea-orm releases new releases quite frequently new semver incompatible versions containing real breaking changes. (Checkout the release history of both projects, for details). I think that's at least something that should be mentioned.

Personally it seems sane to me to recommend async libraries for things that are I/O-bound, which seems like it definitely applies to database connections. Do you disagree, and if so, why?

Database connections are not I/O bound. Assuming we talk about web servers with thousands of requests here the bottleneck won't be the communication with the database, but the number of available database connections. There are normally only a few tens database connections available. You must assume that each connection can only handle one request at the same time. So most of the requests will wait on getting a connection at all, not on some database operation finishing. For this limited number of connection its totally fine to just execute these queries in a thread pool. It's much more important to use an async connection pool. You likely will not be able to measure a difference in performance there. In addition some database systems like sqlite do not even perform I/O in some cases. If you disagree with this statement: Please provide some numbers. (I've asked that question quite often, but so far noone did show up with database specific numbers showing some significant advantage of async connections).

That all written: There are valid reasons prefer sync code over async code as well. Using an async implementation normally pulls in quite large dependencies, which might be problematic depending on the use case. In addition it can introduce additional overhead due to the handling of the async code. Finally some people do consider the async ecosystem as not stable yet or just prefer to write sync code as this simplifies things in some cases. Remember: Not everything is a some sort of server handling a large number of requests. You can also use a database library in a command line tool or in similar applications. There it's helpful to just block on some database call and wait for the results. (That should not be read as this page shouldn't mentioned the fact that something "only" provides a sync interface. I think it would be useful to include this kind of information for both sync and async only crates.

nicoburns commented 1 year ago

I've merged #62 which built upon this PR with a few tweaks.

weiznich commented 1 year ago

Thanks :+1: