citusdata / pg_shard

ATTENTION: pg_shard is superseded by Citus, its more powerful replacement
https://github.com/citusdata/citus
GNU Lesser General Public License v3.0
1.06k stars 63 forks source link

Use master's port if worker port config absent #135

Closed jasonmp85 closed 9 years ago

jasonmp85 commented 9 years ago

CitusDB allows lines in pg_worker_list.conf to omit the port number, whereas pg_shard required it: confusing (or annoying) users of both.

This changes the sscanf format string to parse both the node name and node port as strings. If no node port is specified (or if that token starts with # i.e. begins a line comment), the default port is used; otherwise, we parse the node port string into an integer in precisely the same fashion used by CitusDB.

The reviewer of this request should consult the CitusDB functions LoadWorkerNodeList and ParseWorkerNodeLine in worker_node_manager.c to verify that the pg_shard implementation faithfully emulates CitusDB's implementation. Note that CitusDB uses functions from hba.c that are inaccessible to pg_shard, which is why the implementations differ.

I've removed the port number from the pg_worker_list.conf generated by our test launcher in order to exercise the new behavior.

onderkalaci commented 9 years ago

Hey @jasonmp85,

I added some comments to the code. But, there are some more points that I should mention.

There are some other points, that are not relevant with this change. I write them here for only documentation purposes and awareness.

jasonmp85 commented 9 years ago
  • Our aim is to emulate CitusDB's implementation.
  • With this change, CitusDB and pg_shard behaves almost the same. I observed a different behavior only on the following example: localhost 9700#test
  • In the above example, there is no white-space between the port number and # character. CitusDB can handle that, but, pg_shard errors out. I couldn't decide where is the best place to tackle this in the code.

Bah! Good catch. What do you think about setting the first # character to '\0'? Then that part of the string is ignored.

  • If the same host-port in the worker list, CitusDB uses only one of them. But, pg_shard uses both lines and behaves as if there are two different workers.

That's probably going to remain the same for the time being, but I'll ask @sumedhpathak and @ozgune for a quick opinion on it. If I fix it, it will be in a separate code review.

  • Since both CitusDB and pg_shard uses strtoul function, both read the following lines with no errors (but, different port number thatn 9700), since strtoul can parse negative and hex numbers.

Actually, I wanted to use pg_atoi and check for a negative return value to reject them. But that's incompatible with CitusDB. It's crazy that CitusDB passes 0 as the base, and I have to think that's some sort of oversight, but again I'll ask the local historians why that is the way it is. As far as negative numbers go, stroul is insane. It's basically just (unsigned long) strtol, so passing -5 gets you some insanely huge number and confuses the user. pg_atoi uses strtol(…, 10) and has the same error handling as us, so all we'd need to do is check for a negative value if we wish. I'll ask people about this, too.

jasonmp85 commented 9 years ago

Good catch. What do you think about setting the first # character to '\0'? Then that part of the string is ignored.

Based on the fact that this proposal would modify the input string (so comments wouldn't appear when the error message spits out the offending line), I'm going to go another direction. I think I can do this directly with sscanf.

jasonmp85 commented 9 years ago

OK, so what I went with was using sscanf to accept a series of characters for the hostname and port so long as they are not #, tab, or space. This looks pretty familiar if you know your regexes: [^# \t]. Quoth the manual:

[     Matches a nonempty sequence of characters from the specified set of
      accepted characters; the next pointer must be a pointer to char, and
      there must be enough room for all the characters in the string, plus a
      terminating NUL character.  The usual skip of leading white space is
      suppressed.  The string is to be made up of characters in (or not in)
      a particular set; the set is defined by the characters between the
      open bracket [ character and a close bracket ] character.  The set
      excludes those characters if the first character after the open
      bracket is a circumflex ^.  To include a close bracket in set, [...]

Since this mode doesn't skip leading whitespace, I changed the argument to sscanf from workerNodeLine (the whole line) to linePointer (which has advanced to the first non-whitespace character).

So this fixes the # character not offset by spaces problem and gets us pretty close to parity with CitusDB's parsing (which, being based on hba.c is too complicated to replicate here).

jasonmp85 commented 9 years ago

I decided to fix the remaining parse bugs. Now we reject negative numbers and numbers expressed in anything except decimal. In addition, I discovered fgets leaves trailing newlines in its result, so we need to remove them (similar to what PostgreSQL does in hba.c).

onderkalaci commented 9 years ago

Hey @jasonmp85,

I think the changes are OK now. I only added two minor comments. Feel free to ignore those and merge the change if you think they are irrelevant. But, if you change the code, let me test once more before the merge, since I already have plenty of corner case tests.

jasonmp85 commented 9 years ago

if you change the code, let me test once more before the merge, since I already have plenty of corner case tests

I removed the *nodePortString != '#' check but will merge regardless, since you're right that it's clear from inspection that nodePortString can never begin with a #. The sscanf format we provide uses exclusion to ensure # is never in that field, so if the field was assigned, it cannot include (or start with) a #. Going to merge after the CI tests pass.