citusdata / pg_shard

ATTENTION: pg_shard is superseded by Citus, its more powerful replacement
https://github.com/citusdata/citus
GNU Lesser General Public License v3.0
1.06k stars 63 forks source link

Feature issue#16 #73

Closed onderkalaci closed 9 years ago

onderkalaci commented 9 years ago

This pull request aims to prevent tables to be distributed with columns that cannot be used as distribution column.

fixes #16

Code Review Tasks

coveralls commented 9 years ago

Coverage Status

Coverage increased (+0.0%) to 93.21% when pulling 20b17d685940c888f95564b5b0a3dfd15c217a6a on feature-issue#16 into 036e3a8969ce19d862b71680134e54d1134ef8ec on develop.

jasonmp85 commented 9 years ago

Hey @onderkalaci — I know I said I'd email @sumedhpathak and @ozgune about breaking the error checking out into its own function, but what I hadn't realized is that master_create_distributed_table is all error checking! The GitHub view cut off right after the call to InsertPartitionRow, so I didn't see that we return immediately after.

So there is no argument to support breaking error handling into its own function here: it makes sense that master_create_distributed_table is all error handling code since the logic is so simple at this point (a single function call).

Which just leaves my point about the inequality operator checks. I'll see whether there are any other stylistic things that need addressing and add a checklist at the top of this review. After those changes I think we're good to merge.

onderkalaci commented 9 years ago

Hey @jasonmp85, @sumedhpathak

Below I summerize what is included in the updated pull request:

jasonmp85 commented 9 years ago

Ok, so after staring at this for a while, I think I know what I'd like to see happen…

In some code you haven't modified, there's a section like:

partitionColumnId = get_attnum(distributedTableId, partitionColumnName);
if (partitionColumnId == InvalidAttrNumber)
{
    ereport(ERROR, (errmsg("could not find column: %s", partitionColumnName)));
}

This error checking is duplicated by ColumnNameToColumn, which you call later. I think the above section should be removed entirely and replaced by the call to ColumnNameToColumn (step one).

Then, GetSupportRoutine should be renamed SupportFunctionForColumn and it should be modified to accept that Var * (step two).

If the type does not have a default operator class, SupportFunctionForColumn should call ereport itself with a suitable error message (step three).

I think this could help code flow quite a bit while cutting down on verbosity and duplication.

jasonmp85 commented 9 years ago

Ok @onderkalaci — I've left my checklist at the top of this PR. I think the reorganization is enough that I'll want to check this out once more before a merge, but assuming it's mostly in shape by morning (MDT) I'll merge it myself if it looks good.

onderkalaci commented 9 years ago

@jasonmp85 Thanks for the review. I think that now code looks better than previous implementation. But I couldn't understand why coverage decreased even I added tests. Is it a problem?

jasonmp85 commented 9 years ago

@onderkalaci By default, Coveralls will complain if coverage decreases. We discussed the coverage of your patch (and how it couldn't be complete until we support range partitioning) and I expected coverage to decrease slightly.

If you notice, the merge button is still present, and nothing can ever stop any of us from merging from the command line. So it's just a slight inconvenience designed to make sure you are aware you lowered coverage.

In this case, it's expected, so no biggie.

jasonmp85 commented 9 years ago

I've tested this on the latest CitusDB, PostgreSQL 9.3, and PostgreSQL 9.4 on OS X. Travis'll take care of Linux. After it passes I'm gonna merge.