ptpb / pb

pb is a formerly-lightweight pastebin and url shortener

replace sql #99

Closed buhman closed 9 years ago

buhman commented 9 years ago

Though we proved last night that all of our current procedures other than paste_get_stats have something approaching constant-time complexity, I think the complexity of our current code is rather ridiculous.


Here are the problems so far:

We did this because it made more sense than breaking indexing, or making special metadata columns that ultimately identify what kind of paste this is, then having to twiddle metadata bits in queries.

There are multiple reasons for this, mainly: 1) SQL injection is impossible, because no SQL is ever executed from the application; 2) as a result, we also get the benefit of parsing SQL only once (on schema load), which makes queries faster.


Here's why we got rid of the previous ORM (sqlalchemy):

Enough said.


However, as you'll notice, we've only actually fixed the 'slow' part; clumsy is back, only in a different form. The way to fix the clumsiness (and the problem in its entirety), in my opinion, is to replace SQL entirely.

Mongo in particular fits our data model very well. First read the terminology comparison. In the first few seconds we learn:

If you're not convinced, here's how MySQL doesn't help us whatsoever:

Given all of these facts, I'd like to replace our use of MySQL with MongoDB entirely.

buhman commented 9 years ago

One thing will break as a result of doing this: base66 paste IDs will be replaced with yet-another-scheme.
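For context, a base66 ID of this kind can be sketched as an integer encoded over the 66 URL-unreserved characters from RFC 3986. This is only an illustration of the general technique: the alphabet order and the function names `encode_base66` / `decode_base66` are assumptions, not pb's actual code.

```python
import string

# Assumed alphabet: the 66 "unreserved" URL characters from RFC 3986
# (letters, digits, and "-", ".", "_", "~"). The ordering is hypothetical.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-._~"
BASE = len(ALPHABET)


def encode_base66(n: int) -> str:
    """Encode a non-negative integer as a base66 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    # Digits were produced least-significant first.
    return "".join(reversed(digits))


def decode_base66(s: str) -> int:
    """Decode a base66 string back to the integer it represents."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

An encoding like this presupposes a compact integer key to encode, which is why changing how pastes are keyed would also change the ID scheme.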

buhman commented 9 years ago

http://docs.mongodb.org/manual/core/index-sparse/#sparse-index-on-a-collection-cannot-return-complete-results
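The caveat behind that link is that a sparse index only holds entries for documents that actually contain the indexed field, so a query forced through the index silently skips everything else. Below is a small pure-Python model of that behaviour; it does not use the MongoDB API, and the `label` field and helper names are hypothetical.

```python
# Toy collection: only some pastes carry the optional "label" field.
pastes = [
    {"_id": 1, "label": "vanity"},
    {"_id": 2},
    {"_id": 3, "label": "abc"},
]


def build_sparse_index(docs, field):
    """Return sorted (value, _id) entries, only for docs that have `field`."""
    return sorted((d[field], d["_id"]) for d in docs if field in d)


def scan_via_index(docs, index):
    """Walk the index in order and fetch each referenced document."""
    by_id = {d["_id"]: d for d in docs}
    return [by_id[_id] for _, _id in index]


index = build_sparse_index(pastes, "label")
result = scan_via_index(pastes, index)

# The document without "label" never appears in the index-driven scan.
missing = [d["_id"] for d in pastes if d not in result]
```

Here `missing` contains the paste with `_id` 2: sorting or counting through the sparse index alone would under-report the collection.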

jdppettit commented 9 years ago

Let's do it