Closed onderkalaci closed 9 years ago
@jasonmp85 I also believe that the error message should be much more clear.
I created this pull request so that we may chat about the solution that I suggest.
Handling #9 on the planner is a reasonable solution, right? Could you please check my comments on the commit?
Another solution that may solve the issue might be to check the length of queryShardList
in the planner phase. However, the solution that I sent makes code more readable I think.
I chatted with @sumedhpathak about this and we thought this pull request could have drastically fewer lines of code if we put the check for an empty shard in an existing function rather than adding new functions.
The natural place would be LoadShardIntervalList
, except that master_create_worker_shards
has logic that expects it to return NIL
(i.e. we can't make it raise an error instead of returning NIL
because master_create_worker_shards
expects the NIL
when checking whether shards already exist for a table).
That leaves DistributedQueryShardList
. Can you just change that function to check the return value of LookupShardIntervalList
and error if it's NIL
(also update the comment on the function to reflect this new behavior)?
That change would be far fewer lines of code (probably no more than five?) and would accomplish the same end result. You should also be sure to add a test for the new behavior (add a query in queries.sql
after the distribution call has been made but before the shards have been created).
Two things:
DistributedQueryShardList
. Can you do that?With this implementation, we couldn't specify the name of the relation for which shards are not created in the ereport (Well, sure we can specify but code gets longer and become complicated). Thus, I had to use "the distributed table" phrase instead of its name in the ereport. Also, I think we are sure that there is only a single table, because we error out in the "ErrorIfQueryNotSupported" function if there are more than one relations involved in the query.
I was envisioning that the check would actually live in DistributedQueryShardList
rather than after the call to it. Right now that method always returns a list, which can be the empty list (NIL
). I'm proposing we change it to always return a non-empty list or error.
Getting the relation name would also be easy because within that method the distributedTableId
is in scope. It would just need a block like this:
List *prunedShardList = PruneShardList(distributedTableId, restrictClauseList, shardIntervalList);
if (prunedShardList == NIL)
{
char *relname = get_rel_name(distributedTableId);
ereport(ERROR, (errmsg("... %s ...", relname)));
}
return prunedShardList;
As for the parts of the error message:
errcode
— we should use ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE
errmsg
— should read could not find any shards for queryerrdetail
— should read No shards exist for distributed table "%s".errhint
— Don't need the quotation marks around the method name: _Run master_create_workershards to create shards and try again.My suggestions for the above are based on my reading of the style guide. I think we should avoid using quotes unless they contain dynamic strings that could contain space-separated words. If we know ahead of time that a function name looks like a function name, the quotes are "unnecessary" in the words of the style guide. Additionally, we should say could not when the user can take an action to fix something and cannot if it will remain impossible forever.
Alright, I'm going to add some checklist items to get this wrapped up. I'm being strict about error messages, but we've been lax about doing them properly, saying we'd clean them up later. So I just want to make sure we are being better about them going forward.
After the checklist items are complete I can probably merge this Thursday night or Friday (US time).
Added checklist up top! Push up some changes to address those things and we'll have a :shipit:!
Below are my comments on the changes:
Two small issues. Fix them and :shipit:. You can merge it yourself with the Merge pull request button or do it in the git
CLI.
Before shipping:
relName
to relationName
Before this fix, if you try to execute any query on tables which are distributed with
master_create_distributed_table
but no shards are created yet for the table (i.e.master_create_worker_shards
not called), you get unclear error message. This fix catches that case implicitly and more meaningful message is shown.fixes #9
Review tasks:
ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE
)NIL
check intoDistributedQueryShardList
itselfDistributedQueryShardList
's function comment to reflect its new "return non-empty list or error" behavior