Perl-Evozon / PearlBee

116 stars 44 forks source link

Run on postgres #58

Closed cromedome closed 8 years ago

cromedome commented 8 years ago

Patched to allow PearlBee to run on Postgres.

drforr commented 8 years ago

I'm making similar changes to drforr/blogs-perl-org, but taking a different tack. First, I'd love to hear a better name for the table than 'user', or at least something that's singular and not a reserved word :) I think for the moment that it's easier to add 'quote_names: 1' to your DBIC configuration than resort to turning a database table into a configuration variable.

You can check out https://github.com/drforr/blogs-perl-org/blob/master/db_patches/create_tables_Pg.sql for how the Postgres schema looks for the new blogs.perl.org.

Also please feel free to cherry-pick some changes from Result::View::Count::* in drforr/blogs-perl-org that makes some "SUM( status = 'pending' )" changes necessary to make the app run under both Pg and mySQL. A sample change is below.

Finally, if we're going to change anytihing about authentication I'd probably simply drop the salt column entirely, and use the builtin crypt() function running SHA-512. It's plenty strong for our uses, and not compromised like MD5 has been.

diff --git a/lib/PearlBee/Model/Schema/Result/View/Count/StatusComment.pm b/lib/ index f4bbe17..b85276f 100644 --- a/lib/PearlBee/Model/Schema/Result/View/Count/StatusComment.pm +++ b/lib/PearlBee/Model/Schema/Result/View/Count/StatusComment.pm @@ -13,10 +13,10 @@ PACKAGE->result_source_instance->is_virtual(1); PACKAGE->result_source_instance->view_definition( q[ SELECT

cromedome commented 8 years ago

You can steal this function from my script:

-- Override sum() to make some of our views work as intended
-- See http://stackoverflow.com/questions/20281125/why-aggregate-functions-in-postgresql-do-not-work-with-boolean-data-type/20283487#20283487
create or replace function bool_add (bigint, boolean)
    returns bigint as
$body$
    select $1 + case $2 when true then 1 else 0 end;
$body$ language sql;

create aggregate sum(boolean) (
    sfunc=bool_add,
    stype=int8,
    initcond='0'
);

And those views will run without modification :)

Thanks for the info, I will look and get back to you!

Ovid commented 8 years ago

@drforr While I suspect this horse has already bolted, I've learned the hard way that there is a simple solution to the singular/plural table name debate, focused on practicality rather than theoretical issues: always use plural because that's much less likely to class with reserved words in the database.

@cromedome and I have also discovered on a project we're working on that quote_names is one of the highest overhead items in our code when we run Devel::NYTProf.

As an aside: Well-known database god Joe Celko argues for plural table names.

cromedome commented 8 years ago

I didn't want to use a config variable either, but given what having quoting enabled on our own project was doing to performance, it was the lesser of two evils.

If you aren't too concerned about annoying users that are upgrading, you could change the table name to pbusers, and never worry about a conflict again :)

SHA-512 or bcrypt would be plenty sufficient for passwords.

How far along into this work are you? It seems as if we have many of the same goals here. Perhaps we can combine our efforts and both be done a bit sooner?

drforr commented 8 years ago

The performance loss caused by quoting is a good thing to know. I'm well aware of Celko, but I'll have to read up on his arguments for plural table names. I've been reading stuff like $user{'jgoff'} long enough that plural names for data structures seem verboten. I'm not all that far along with the changes, and actually most of the dev work is going on with the blogs-perl-org fork and occasionally getting backported.

Ovid commented 8 years ago

@drforr In Celko's book "SQL for Smarties: Advanced SQL Programming", chapter 1, on naming conventions, he writes:

  1. The table name must be unique in the schema, and the column names must be unique within a table. SQL can handle a table and a column with the same name, but it is a good practice to name tables differently from their columns. (See items 4 and 6 in this list.)
  2. The names in SQL can consist of letters, underscores, and digits. Vendors commonly allow other printing characters, but it is a good idea to avoid using anything except letters, underscores, and digits. Special characters are not portable and will not sort the same way in different products.
  3. Standard SQL allows you to use spaces, reserved words, and special characters in a name if you enclose them in double quotation marks, but this should be avoided as much as possible.
  4. The use of collective, class, or plural names for tables helps you think of them as sets. For example, do not name a table “Employee” unless there really is only one employee; use something like “Employees” or (better) “Personnel,” for the table name.
  5. Use the same name for the same attribute everywhere in the schema. That is, do not name a column in one table “sex” and a column in another table “gender” when they refer to the same property. You should have a data dictionary that enforces this on your developers.
  6. Use singular attribute names for columns and other scalar schema objects.

In particular, note item 3: the use of the word user as a table name can cause all end of headaches (as it has for me), thus meaning that users is preferable.

Then item 4 is the meat of the issue: it's important to get developers to start thinking of data returned as being sets of data. Celko also has a book about SQL programming style, but I don't have it, so I can't cite it.