SchemaPlus / schema_validations

Automatically creates validations basing on the database schema.
Other
173 stars 34 forks source link

Extending validations with JavaScript functions (Postgres only) #15

Open robababa opened 9 years ago

robababa commented 9 years ago

First, apologies for the length of this post.

My idea is to share the same set of validations at the client, application server and database server tiers of a web application. This gem does some of that, but I'd like to extend it when the database is Postgres. Given these available features:

(1) We can add the PL/V8 (JavaScript) language to the database. (2) We can write functions in PL/V8 that take input and return true or false. (3) These functions can then be used as column constraints or table constraints. (4) These functions can also be used as validations for DOMAIN datatypes.

Let's do this:

(1) Pull these validation functions from Postgres into Ruby and use them for Ruby validations with https://github.com/sstephenson/execjs. (2) Create a Rake task that pulls these database functions into a generated JavaScript .js file that the client tier can use to validate user input with the same rules. The developer would run this task during each database migration. Perhaps even insert this task into the normal migration task?

We would then have the same validation rules in all three tiers of our web applications - client, app server and database server, without any validation traffic between layers except at startup. (Each layer has its own identical copy of the validation rules, originally sourced by the database.)

We could take full advantage of the Postgres DOMAIN datatype feature, assuming our domain constraints are written in JavaScript functions. For example:

CREATE OR REPLACE FUNCTION valid_past_date(value_in date)
  RETURNS boolean
  LANGUAGE plv8
  STABLE
  AS $_$
    // nulls are OK - enforce not null constraints at the table level
    if (value_in == null) { return true; }

     // it is a "past" date if it is on or before today
    return (value_in <= Date.now());
  $_$;

CREATE DOMAIN past_date AS date
  CONSTRAINT ck_past_date CHECK (valid_past_date(VALUE));

The client tier could use valid_past_date() from the generated .js file to validate input on a "past_date" field before passing it on to the app server and database.

I am interested in writing something like this. Could it be an enhancement to this gem? Or a new gem in the SchemaPlus family? Or a separate gem entirely? Your feedback is appreciated!

ronen commented 9 years ago

@robababa

My idea is to share the same set of validations at the client, application server and database server tiers of a web application.

Very laudable goal! You're right, very much in the spirit of schema_validations, and very appropriate for modern thick-client apps.

And apologies for the length of my reply, this sparks so many thoughts! :)

I don't know if you've been following the discussions of refactoring schema_plus for v2.0 -- see SchemaPlus/schema_plus#168, SchemaPlus/schema_plus#197. I'm in the midst of the refactor, once it's done creating the above should be easier than it would be otherwise.

As per the discussions in those issues and my thoughts above, I'd want to try to do this as modularly as possible, creating independent single-purpose "feature gems", rather than one large monstrosity to try to do everything. I.e. possibly something like:

  1. Enhance schema_validations to issue JavaScript
  2. Create a schema_plus_check_constraints feature gem
  3. Create a schema_plus_pg_domains feature gem
  4. Create schema_plus_pg_javascript_constraints feature gem that builds on all the others.

Step 1 could be done any time. For steps 2-4, I'd say hold off until I finish the schema_plus 2.0 refactor at least to beta release (I'm trying to get it done as fast as possible -- my day jobs and my sleeping are suffering but I want to get it off my plate ASAP).

BTW I should say unfortunately I probably won't have time to invest in any of it myself. Once the refactor is done I'm going to have to let schema_plus take a back seat for a while and limit my activity to crucial bug fixes. Though always happy to discuss and kibbitz :)

BTW2 I'm happy to add you to the SchemaPlus organizaiton so you can work on these without needing my intervention.

BTW3 Any comments/suggestions over in that refactoring thread are always welcome!

robababa commented 9 years ago

@ronen: Thank you for your comments! The CHECK constraint feature would be very nice to have, but I think it needs some strong limits to make it workable. For example, in Postgres, say I want to create a column constraint that requires the date to have an odd-numbered day. I could do this:

create table t1 (d1 date check (mod(extract(day from d1)::integer, 2) = 1));

This constraint relies on the mod() and extract() functions, the :: casting operator, the integer built-in datatype, and = as a comparison operator instead of ==. Many (all?) of those Postgres SQL features don't exist in Ruby or JavaScript, so the constraint would have to be rewritten from the ground up.

Even worse, I could implement the same odd-numbered day constraint like this (in Postgres):

CREATE OR REPLACE FUNCTION plpgsql_odd_day(value_in date)
RETURNS boolean
LANGUAGE plpgsql
STABLE
AS $_$
  begin
    if (mod(extract(day from value_in)::integer, 2) == 1)
    then
      return true;
    else
      return false;
    end if;
  end;
$_$;

create table t2 (d2 date check (plpgsql_odd_day(d2)));

Here, my check constraint relies on a plpgsql function, so if I want to bring all possible CHECK constraints up to the Ruby (and front-end JavaScript) levels, then I have to re-implement the PL/pgsql language, as well as any other procedural languages that Postgres might implement.

There's an even worse possibility, because a CHECK constraint might call a plpgsql function that interacts with the database. It would be impossible to implement that kind of CHECK constraint in the Ruby or client tiers without visiting the database, which is what we want schemavalidations (or a new schema* gem) to prevent in the first place.

So an implementation of CHECK constraints needs to be, well, constrained in some way. That's why I had thought of JavaScript (PL/V8 and PL/coffee for coffeescript) functions as a good "fence" around this feature, although those languages can interact with the database as well, such as by using the plv8.execute() function: https://code.google.com/p/plv8js/wiki/PLV8.

Perhaps the gem should have a configuration option for a function name prefix, which identifies the functions in the database that the gem should pull into Ruby and client JavaScript. For instance, if the default prefix was "splus" (for SchemaPlus), then the gem would process this function:

CREATE OR REPLACE FUNCTION splus_some_function(value_in date)
...

but not this one:

CREATE OR REPLACE FUNCTION other_function(value_in date)
...

The user could then tell the gem which constraints should be copied into Ruby and client JavaScript, and which constraints to leave alone.

Thoughts?

ronen commented 9 years ago

For the basic CHECK constraint support I wasn't thinking of attempting to evaluate the the constraint in ruby. Just providing enhancements to ActiveRecord's migration DSL to at least be able to define, add, and remove them as arbitrary strings; and the dumper would need to dump them back out. E.g. as per the example in SchemaPlus/schema_plus#92, a gem schema_plus_check_constraints would support:

create_table :things do |t|
    t.integer :x,                  check: "x < 100"          # column constraint
    t.integer :y,                  check: { expression: "y < 100", name: "constraint_name" }
    t.check "x + y > 100"
    t.check "x + y < 150", name: "constraint_name"
 end

This gem wouldn't attempt to parse or evaluate the expressions, so wouldn't be particularly useful for validations (well, maybe1). But still, it provides a way to get constraints into the database. So for your examples, the table definition could include:

   t.datetime :d1,  check: "mod(extract(day from d1)::integer, 2) = 1"
   t.datetime :d2,  check: "plpgsql_odd_day(d2)"

With just this support of CHECK, for d2 you'd be on your own to define functions outside of rails, or if you want it in your migration it'd be by explicit calls to connection.execute with raw SQL.

[The basic support for CHECK could include some standard common expressions that it has hardwired, such as check: { less_than: 100} but that's orthogonal to this discussions.]

Next thing to consider would be basic support for defining functions in migrations, i.e. a new gem schema_plus_functions that would provide syntax such as2:

create_function :plpsql_odd_day, force: true do |f|
   f.language: 'plpgsql'
   f.returns: :boolean
   f.parameter :date, :value_in
   f.body <<-ENDFUNC
      begin
        if (mod(extract(day from value_in)::integer, 2) == 1)
        then
          return true;
        else
          return false;
        end if;
      end;
    ENDFUNC
end

which would issue the CREATE OR REPLACE FUNCTION statement, and the dumper would be able to dump it back as ruby. (And under the hood it would support methods migration.add_function, migration.remove_function, which can be called programatically.) Here too it's considering the body of the function as an arbitrary string and would make no attempt to evaluate it.

Given schema_plus_check_constraints and schema_plus_functions in place (and maybe schema_plus_domain_types), and given support in schema_validations to emit validations as JavaScript...

...then I think we'd be in a position to have a higher-level gem that would know about JavaScript and would make it easy to define functions, check constraints, and perform validations all in JavaScript, which could be evaluated in the db, ruby, and/or client.

But I would definitely think about it in terms of those separate smaller steps, rather than one gem that tries to do it all end-to-end.


1 In an email exchange, @lowjoel and I discussed catching UniqueRecordErrors and returning them as validation failures at save time, rather raising InvalidStatement errors. Possibly something similar could be done with check constraints. We havent' looked into whether/how it would work.

2 Would need to put more thought into what good DSL syntax would be... that's just an off-the-cuff example

robababa commented 9 years ago

@ronen: Thanks for the follow-up! I understand that you would first like to add check constraints to ActiveRecord schema definition and migration DSL, which would definitely be a good thing for its own gem, schema_plus_check_constraints.

It's a little bit surprising that AR doesn't do that already, but I've come to Ruby from the database side of IT, so that's where my focus is.

BTW, I'd be glad to join the SchemaPlus team and contribute what I can. It might not be much, but I'll do what I can. I'm not sure how that works on GitHub, so let me know.

ronen commented 9 years ago

It's a little bit surprising that AR doesn't do that already, but I've come to Ruby from the database side of IT, so that's where my focus is.

Yes, surprisingly AR didn't even support foreign key constraints until last month! That was the primary motivation for creating schema_plus years ago (based on earlier open source code that added foreign key constraint support to AR).

As for what to do -- until the schema_plus 2.0 refactor is done, I'd avoid trying to write any of the new gems.

But I do think that a key first step -- which could be done right now -- as per above would be to enhance schema_validations to be able to issue (hardwired) JavaScript for the builtin constraints that it currently supports. That would require figuring out issues like how best to integrate with the asset pipeline, what the JavaScript validation API would be, what sort of configuration options would be needed, etc.

If that was something you wanted to do, you could do by just forking schema_validations in your own github account and working on it. Once you've got it to some reasonable state we could then merge it back in to the SchemaPlus repo and figure out the organizational things.

jhirbour commented 9 years ago

There's been a lot action in the Opal community lately. I know it's be a pretty heavy dependency, but if you end up having to provide code that generates JS validations, Opal might be a good way to go so you can write it in Ruby.