SchemaPlus / schema_validations

Automatically creates validations basing on the database schema.
Other
174 stars 34 forks source link

Case insensitive uniqueness validation does not work for MySQL #38

Open dmeranda opened 8 years ago

dmeranda commented 8 years ago

I'm not sure if this issue belongs here or in schema_plus_columns, or both.

Using schema_validations 2.1.1, Rails 4.2.6, and with MySQL 5.7.

Unlike PostgreSQL, in MySQL it is the columns themselves that determine if they are case-sensitive or insensitive, and is not a property of any index. Fortunately the standard Rails mysql connection adapter already defines a case_sensitive? method on the AR Column class which works correctly. However the auto-generated validation methods by this gem seem to treat all columns as being case-sensitive.

Say you have a MySQL table defined like the following: I'm showing the raw MySQL schema so you can see the COLLATE qualifiers; where field1 is case-insensitive (*_ci) and field2 is case-sensitive (*_bin):

CREATE TABLE `examples` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `field1` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  `field2` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `index_field1` (`field1`),
  UNIQUE KEY `index_field2` (`field2`)
) ...

Now in rails you can introspect the properties of these two columns. Notice that case_sensitive? returns the expected value — (I don't know what this doesn't conflict with schema_plus_column's similarly named method, which would fail because MySQL's Index class doesn't support case_sensitive?)

irb> Example.columns_hash['field1'].case_sensitive?
=> false
irb> Example.columns_hash['field1'].unique?
=> true
irb> Example.columns_hash['field2'].case_sensitive?
=> true
irb> Example.columns_hash['field2'].unique?
=> true

But now if you create a couple records only differing in case the validation checks don't catch the case-insensitive match and pass the insert on to MySQL, which raises a uniqueness constraint violation.

irb> a=Example.new field1: 'hello', field2: 'world'
=> #<Example id: nil, field1: "hello", field2: "world">
irb> a.save

irb> b=Example.new field1: 'HELLO', field2: 'WORLD'
=> #<Example id: nil, field1: "HELLO", field2: "WORLD">
irb> b.valid?
=> true
irb> b.save
   (0.5ms)  BEGIN
  Example Exists (1.0ms)  SELECT  1 AS one FROM `examples` WHERE `examples`.`field1` = BINARY 'HELLO' LIMIT 1
  Example Exists (1.0ms)  SELECT  1 AS one FROM `examples` WHERE `examples`.`field2` = 'WORLD' LIMIT 1
  SQL (4.1ms)  INSERT INTO `examples` (`field1`, `field2`) VALUES ('HELLO', 'WORLD')
   (47.7ms)  ROLLBACK
ActiveRecord::RecordNotUnique: Mysql2::Error: Duplicate entry 'HELLO' for key 'index_field1': INSERT INTO `examples` (`field1`, `field2`) VALUES ('HELLO', 'WORLD')
...

The debug output of the auto-generated validators is:

[schema_validations] Example.validates_length_of :field1, :allow_nil=>true, :maximum=>255
[schema_validations] Example.validates_presence_of :field1
[schema_validations] Example.validates_uniqueness_of :field1, :allow_nil=>true, :if=>#<Proc:0x007f1c410c7998@/home/xxxx/bundle-root/ruby/2.3.0/gems/schema_validations-2.1.1/lib/schema_validations/active_record/validations.rb:173>
[schema_validations] Example.validates_length_of :field2, :allow_nil=>true, :maximum=>255
[schema_validations] Example.validates_presence_of :field2
[schema_validations] Example.validates_uniqueness_of :field2, :allow_nil=>true, :if=>#<Proc:0x007f1c410b8f88@/home/xxxx/bundle-root/ruby/2.3.0/gems/schema_validations-2.1.1/lib/schema_validations/active_record/validations.rb:173>
dmeranda commented 8 years ago

I have a potential fix which works for MySQL, and I can make it a pull request if you want, but I wasn't sure how it may affect PostgreSQL and others, given that schema_validations doesn't seem to use schema_plus_columns' case_sensitive? method in a direct manner, perhaps because of how it has a scope in the mix.

The fix is to has_case_insensitive_index? in validations.rb to first see if the column itself supports a case_sensitive? test before then looking at any indexes. If the column is case-insensitive, in MySQL anyway, it doesn't matter what any indexes do, the column is always case-insensitive.

        def has_case_insensitive_index?(column, scope)
+         if column.respond_to?(:case_sensitive?) && ! column.case_sensitive?
+            return true
+         end
          indexed_columns = (scope + [column.name]).map(&:to_sym).sort
          index = column.indexes.select { |i| i.unique && i.columns.map(&:to_sym).sort == indexed_columns }.first

          index && index.respond_to?(:case_sensitive?) && !index.case_sensitive?
        end
ronen commented 8 years ago

@dmeranda thanks for the detailed info. I think you're dead on. I don't think your change would break anything, but of course the way to find out is to push it and see if the test suite fails.

My only additional thoughts are:

dmeranda commented 8 years ago

Yes, your points are reasonable. I'm actually investigating schema_plus_columns now to see how or why it is working for me, and if it perhaps warrants a patch too.

I can start working on a pull request.

ronen commented 8 years ago

cool, thanks for looking into it

dmeranda commented 8 years ago

Regarding schema_plus_columns, it is working beacause it is using schema_monkey to patch the ActiveRecord::ConnectionAdapters::Column class. However when using MySQL, Rails actually defines and uses a set of MySQL-specific subclasses. In this case ActiveRecord::ConnectionAdapters::AbstractMysqlAdapter::Column.

So when you get an instance of Column and you call case_sensitive? on it, you are calling Rails' version of the method in the mysql-subclass. And since it does not call super (because the superclass in standard Rails does not even have such a method), then you never actually call schema_plus_column's version of the method; i.e., it remains hidden.

So it just happens to work, accidentally, at least under Rails 4. Though the documentation for schema_plus_columns could probably be updated to say that the method case_sensitive? is already provided by Rails for MySQL and it is Rails' version you'll get and not schema_plus_column's. And you only need schema_plus_pg_indexes when using PostgreSQL (obviously).

By the way, if I temporarily rename the case_sensitive? method so it is not hidden and then call it, it will definitely raise an error with MySQL because the IndexDefinition does not support a case_sensitive? method. It doesn't do a respond_to? check (though it probably should fail noisily anyway).