citusdata / pg_shard

ATTENTION: pg_shard is superseded by Citus, its more powerful replacement
https://github.com/citusdata/citus
GNU Lesser General Public License v3.0
1.06k stars 63 forks source link

Add comprehensive unit/integration tests #34

Closed jasonmp85 closed 9 years ago

jasonmp85 commented 9 years ago

My methodology here is to expose each file's extern functions through some sort of SQL-visible function such that it becomes possible to test each piece of pg_shard's functionality in a fairly modular fashion.

Where this was burdensome, (planner/executor), end-to-end integration tests were added instead.

Certain SQL features (WINDOW functions, etc.) still lack tests, but pg_shard's coverage is vastly improved by these changes.

We continue to use $PGPORT to interpolate port numbers into tests (just a few sed commands), but have added a launcher wrapper to move any pg_worker_list file aside during test in order to substitue a known file to exercise everything we need during the tests.

I think long-term, we will want to look at something like cmocka to help write actual unit tests. Integration tests should be performed in an actual test-only cluster built by our build logic, as opposed to the usuall make installcheck practice of building against the user's DB. But we'll consider that when and if we outgrow this.

Resolves #23.

jasonmp85 commented 9 years ago

Before/after coverage statistics courtesy of lcov:

Before

Coverage in develop before this patch

After

Coverage in develop after this patch

sumedhpathak commented 9 years ago

We continue to use $PGPORT to interpolate port numbers into tests (just a few sed commands), but have added a launcher wrapper to move any pg_worker_list file aside during test in order to substitue a known file to exercise everything we need during the tests.

So does this still require us to start a cluster manually? Or do we just need a master node running? By move, you mean an existing pg_worker_list.conf will be moved aside, and replaced with this?

jasonmp85 commented 9 years ago

There must be a cluster running, as is the practice with make installcheck. I mean it moves the literal file aside (to a .bak backup file) and replaces it with a modified version containing known values.

sumedhpathak commented 9 years ago

Regarding the lcov report, how much extra coverage is being added by the unit tests exposed via new functions?

jasonmp85 commented 9 years ago

I'm unclear on what you mean by "how much coverage is being added…"

The coverage numbers do not count the lines of C code in the test functions: look at the file names and you'll see only the "operative" files are being included in the lcov report. I told it to only count the top directory (so it ignores the test/* files), and then removed the ruleutils file from the report (because we cannot reasonably test all its paths using pg_shard, which doesn't even support all the features it is capable of deparsing).

So in short: the coverage screenshots I posted count only the "main" files we've written for pg_shard, excepting the small tweaks to ruleutils for the reasons stated above.

jasonmp85 commented 9 years ago

When I'm at my most pedantic, I like to record coverage for a file X only during the unit tests for file X. In that way, calls to X's functions from the tests for file Y do not contribute toward X's coverage. This means coverage only "counts" if it was accumulated during unit tests for that file, not from calls that were made as side effects of other functions' unit tests. Trivially, this avoids the case where a project writes a few end-to-end tests and then says "oh hey, we have 100% coverage!" even though they're never really asserting anything meaningful about the intermediate layers of their architecture.

That's pretty annoying to implement, though, and true unit tests would require cmocka (or equivalent, as noted previously), so for now I've gone with the "as isolated as possible" approach and added in a few end-to-end tests for good measure.

My overall goal was for any added C code to be trivial and for any SQL to be not much more than a bunch of copy-pasteable stuff. I don't usually apply the same standards of maintainability to test code because it's just a scaffold to exercise the real stuff. If anything looks too complicated for inclusion, let me know.

jasonmp85 commented 9 years ago

Discussed in a Hangout. I've tested this on Ubuntu and OS X against PostgreSQL 9.3, 9.4, and CitusDB. Looks good, got a :shipit: from @sumedhpathak.

jasonmp85 commented 9 years ago

My short list of things we haven't tested: