jackc / surus

PostgreSQL extensions for ActiveRecord
MIT License
395 stars 35 forks source link

Code styling cleanup + Refactor Serializer #20

Closed leaexplores closed 8 years ago

leaexplores commented 8 years ago

Hi JackC,

In this PR i've made the following improvements :

  1. Add Rubocop to the project and make it 100% fully compliant to the ruby style guide.
  2. Refactor a few methods in lib/surus/hstore/serializer.rb to make it easier to read.
  3. All the string usage only that doesn't involve interpolation are now ' instead of ".

Tests should now be running on both jruby main versions + rubocop run.

Thanks for looking into it!

Review on Reviewable

leaexplores commented 8 years ago

It may take a few commits in there though before we get everything passing everywhere with retro-compatibility

leaexplores commented 8 years ago

Thinking about it, I'll open a new PR from Jruby compat later on as I have to patch Hstore support on the active record jdbc adapter before.

I've rebased to only keep the refactors + code styling fixes.

jackc commented 8 years ago

Thanks for sending this PR, I'm sure this took quite a bit of time.

Can this be separated in to multiple pull requests? There are 700+ changed lines and several logical changes. I tried to review the change, and I found several items I want to ask about, but its easy to miss things in a large commit.

Here are a few of the things I noted:

  1. 'database.yml' changed to run as the postgres user. No default will work for everyone out of the box, but I don't see how that is better than just using the logged in user. Under the normal Ubuntu PostgreSQL installation I would have to sudo -u postgres bundle exec rake to get that to work.
  2. bench/syncronous_commit.rb:43 is a regression. It used to print out the number and types of records. But now it uses single quotes so it doesn't do the interpolation. Then, since text is no longer used it appears that the block argument was renamed to _text to indicate it is not used. Not sure if it was an intentional change or if it was Rubocop run a muck... If the latter, it makes me nervous to merge in it's automatic rewriting of code; it may have made such a mistake in a more critical area.
  3. The refactor of Surus::HStore::Serializer#stringify is more concise, but it left me with some confusion. Why treat String special on line 87 instead of merging in with the general case on 88. And on line 88, why check value.is_a?(type) and value.class == type? Isn't is_a a subset of class == type?
  4. There's an awful lot of code churn. I'm wary of too much refactoring just for refactoring sake. I've seen a lot of times where cleaning up some working code introduced bugs. I'm a fan of standardized formatting; I think gofmt is one of the really smart things about Go. But I'm not sure if I totally trust Rubocop, at the very least I'd like to see the changes it makes in an isolated commit rather than intermingled with human-made changes.
  5. I don't get the extraction of spec/json/serializer.rb. It's only a test so strongly binding to oj shouldn't matter. But then the abstracted interface binds tightly to oj anyway. Oh, your comment mentioned wanting to get JRuby compatibility in the future. Maybe this is preparation for that? I'm guessing oj isn't available on JRuby...
  6. There is a new empty file in this commit. lib/surus/json/serializer.rb

There are some good changes in this PR, but it's hard to get a good understanding of them when the commit is so large and touches so many areas.

Thanks again.

leaexplores commented 8 years ago

Hi JackC,

Thanks for passing by, Let's go point by point so we can get this through and make some improvements.

  1. This is indeed a nit ! I'm removing it !
  2. Yes by default of the ruby style guide, unused variables should be prefixed with something. I didn't mean to trim them on the go as I wasn't sure if it was left for re-enabling the feature later on. (https://github.com/bbatsov/ruby-style-guide#underscore-unused-vars)

If we want to allow the unused variables without triggering warning, we could disable this check in the rubocop.yml and restore the non prefixed variables.

  1. No when you use a kind_of? and .is_a? are synonyms.

If you declare a FixNum and compare it with Integer. Both will repond true. See

2.2.2 :003 > qq = -12345
 => -12345
2.2.2 :004 > qq.is_a? Fixnum
 => true
2.2.2 :005 > qq.is_a? Integer
 => true

What we want to do here is have a strict type check and else use YAML to serialize it.

  1. Rubocop pretty much only lints the code. It doesn't reformat it. I'm not aware that a popular tool that uses the ruby style guide for auto formatting exists yet. It would be a really cool project though! Let's go through it slowly and could use a tool like reviewable to review the code easily.
  2. It is some preparation for JRuby compat. Do you want me to revert this change so we only keep the code styling and the small refactors I made in there ?
  3. Indeed a nit ! I'm removing it !
leaexplores commented 8 years ago

I've opened a review on reviewable.

You should be able to access it from here https://reviewable.io/reviews/jackc/surus/20


Review status: 0 of 37 files reviewed at latest revision, all discussions resolved.


Comments from the review on Reviewable.io

jackc commented 8 years ago

Let me start by thanking you again for your investment of time into Surus. However, my review did find a number of bugs and other changes that make me uncomfortable. So let me apologize in advance if some of this comes across as negative, but I hope that my suggestions improve both your experience of working on Surus and participating in OSS in general.

This PR really should be in multiple commits, and ideally multiple PRs. There are too many logical changes bundled into one commit. e.g.

That's 10 changes and the list is incomplete. Right now there is no easy way to merge one change and ignore another.

This PR is just too big and covers too many things. To be honest, I was initially tempted just to immediately reject it because it is too difficult to review. As it is, I have spent several hours reviewing it.

On the whole, I'm very wary of making changes strictly for stylistic reasons. It introduces risk of regressions for no immediate reward. This is not merely a theoretical concern -- I found 3 regressions caused by incorrectly converting double quotes to single quotes (I noted these in the reviewable). What if I didn't find them all?

I'm also wary of refactoring that is not directly driven by either maintentance concerns or new feature requirements. For example, some of the logic in lib/surus/json/query.rb is rather subtle. I'd rather not cause any code churn without good reason. Also, I find that I often prefer one larger function vs. multiple smaller functions that only have one caller (see https://news.ycombinator.com/item?id=8374345 for some discussion of this -- both the original post by John Carmack and the Hacker News comments are thought-provoking.)

As far as some general suggestions for PRs, it is often a good idea to open an issue for discussion before making changes. Especially is this true for stylistic and refactoring changes.

For example, there are a number of changes that replaced an if statement with guard clauses. But I find this:

def f
  if foo
    bar
  else
    baz
  end
end

more clear than this:

def f
  return bar if foo
  baz
end

Obviously, this is personal preference, but that makes checking before making changes even more valuable.

To summarize, I'd like to pull in the JSON changes necessary for JRuby (see comments in reviewable for what those changes are) and subsequent JRuby fixes, but I don't want to pull in the bulk style changes, and I'd need to be persuaded that each individual refactoring change justifies the code churn.

jackc commented 8 years ago

Reviewed 37 of 37 files at r1. Review status: all files reviewed at latest revision, 19 unresolved discussions.


.travis.yml, line 21 [r1] (raw file): What will rubocop do here? Is it strictly advisory or will it break the build if some criteria is not met?


_bench/hstore_find.rb, line 89 [r1] (raw file):_ What is the benefit of extracting this?


_bench/synchronous_commit.rb, line 32 [r1] (raw file):_ Not sure how I feel about this style. Unless an editor does auto-formatting, a change that alters the first line requires manually re-indenting all the subsequent lines.


_bench/synchronous_commit.rb, line 43 [r1] (raw file):_ This should be double quotes to do the interpolation. The above _text would also go back to text.


_lib/generators/surus/hstore/install_generator.rb, line 12 [r1] (raw file):_ I like the format change. I'm not sure about the guard clause style change. To me guard clauses are usually for defending against bad inputs, or on occasion to avoid excessive nesting. But this really is a simple if then else.


_lib/generators/surus/hstore/templates/install_hstore.rb, line 1 [r1] (raw file):_ I suppose this is a bit more elegant, but the existing code would likely never need to change. I suppose it's fine, but it makes me nervous to incur the risk of changes without concrete benefits.


lib/surus/hstore/serializer.rb, line 88 [r1] (raw file): I might be totally missing something, but I still can't think of a case where value.class == type would be true and value.is_a?(type) would be false. obj.is-a? will check the entire inheritance path of obj -- by definition obj.class is in that inheritance path.


lib/surus/hstore/serializer.rb, line 107 [r1] (raw file): To be honest, I prefer the case vs. a bunch of guard clauses.


_lib/surus/json/query.rb, line 26 [r1] (raw file):_ Unless scope is used elsewhere, I prefer it inline. See http://number-none.com/blow/john_carmack_on_inlined_code.html for some reasoning behind this.

Also, even if it is extracted to a method, I find the original if statement clearer.


lib/surus/json/query.rb, line 39 [r1] (raw file): You're probably tired of hearing it by now, but I think if / then / else is preferable to guard clauses when it really is just a simple branch.


lib/surus/json/query.rb, line 98 [r1] (raw file): There's a lot of tricky logic in the above code. I'm very wary of changing it simply for style. (Carmack's argument applies here as well).


spec/database.yml, line 4 [r1] (raw file): As previously discussed, let's nix this change.


_spec/hstore/scope_spec.rb, line 40 [r1] (raw file):_ I'm not sure what benefit extracting this method brings.


_spec/hstore/scope_spec.rb, line 68 [r1] (raw file):_ Same as above, don't see benefit of this extraction.


_spec/hstore/serializer_spec.rb, line 15 [r1] (raw file):_ "'" seems easier to understand than '\''


_spec/hstore/serializer_spec.rb, line 31 [r1] (raw file):_ The above changes break the tests in the worst way - the tests still pass, but they don't test anything. The string interpolations are necessary to produce the test values.


_spec/hstore/serializer_spec.rb, line 60 [r1] (raw file):_ Another place where overeager converting of double quotes to single quotes causes a regression...


_spec/json/json_spec.rb, line 2 [r1] (raw file):_ This serializer extraction makes sense if we will need to use separate JSON parsers for MRI and JRuby. But I'm not stuck on the need to use Oj. (I think it just happened to be the popular parser at the the time I wrote this). It might be simpler just to use a JSON parser that works with MRI and JRuby (such as https://rubygems.org/gems/json_pure) and avoid the abstraction altogether. Failing that, perhaps use a well tested JSON abstraction such as https://github.com/intridea/multi_json.


spec/json/serializer.rb, line 1 [r1] (raw file): This implementation looks fine, though as I mentioned in the previous file, I'm not sure it is needed.


Comments from the review on Reviewable.io

leaexplores commented 8 years ago

Review status: all files reviewed at latest revision, 19 unresolved discussions, some commit checks failed.


.travis.yml, line 21 [r1] (raw file): Yes, if the code styling standards are not respected. The travis CI build will fail.


_bench/hstore_find.rb, line 89 [r1] (raw file):_ Freezing strings allows to get improved performance on ruby runtime over 2.1. (See https://bugs.ruby-lang.org/issues/9159 and https://bugs.ruby-lang.org/issues/9171)

Some apps may get up to 30% speed boost when we use a lot of strings.


_bench/synchronous_commit.rb, line 43 [r1] (raw file):_ Yeup indeed.


lib/surus/hstore/serializer.rb, line 107 [r1] (raw file): Cool stuff, i'll rollback this one.


_spec/json/json_spec.rb, line 2 [r1] (raw file):_ MultiJSON would be pretty much future proof on all the ruby runtime while keeping a nice runtime speed (like leave Oj for the MRI Implementation).


Comments from the review on Reviewable.io

leaexplores commented 8 years ago

Hi JackC,

From the reviews, I'll break most of the changes into small PR and refer to this one as a tracking issue.

We'll be able to close it when i've moved most of the interesting improvements.

Code styling wise, i'll generate a basic ignore files for the code styling improvements and we could tweak it from there with your code styling preferences. So we don't enforce what feels too aggresive (IE Guard clauses VS if else ).

Thanks a lot again for sharing your knowledge with me.

leaexplores commented 8 years ago

Most of the improvements were added in the various PR, i'm closing this one too!