citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.5k stars 666 forks source link

Write regression test guidelines #1395

Open marcocitus opened 7 years ago

marcocitus commented 7 years ago

It's hard to remember all the rules that make for good regression tests. We should add a document to the wiki that succinctly describes the rules to follow.

Some basic ideas:

If applicable, test whether a new feature:

(don't need to have all of these all the time, but just good to remember pitfalls)

Do not:

(unless absolutely necessary, e.g. EXPLAIN always shows shard IDs)

anarazel commented 7 years ago

Don't use the same tables across different tests

That one I actually quite strongly disagree with - we shouldn't duplicate tables all the time. That slows things down, duplicates code, duplicates data loading (copy from file in the worst cases), etc. There's cases where that's necessary (e.g. when screwing with the table definition), but we really don't need 10 different lineitem_hash copies.

Points I'd make additionally:

lithp commented 7 years ago

It's probably worth adding reasons for the guidelines as well.


Don't use the same tables across different tests

That one I actually quite strongly disagree with - we shouldn't duplicate tables all the time. That slows things down, duplicates code, duplicates data loading (copy from file in the worst cases), etc. There's cases where that's necessary (e.g. when screwing with the table definition), but we really don't need 10 different lineitem_hash copies.

Of course we have no need for multiple iterations of setting up big tables like lineitem or supplier but I'm not sure Marco was suggesting that. There are lots of small tables which are created just for the purpose of a few checks and the benefits of having their definitions right next to the tests is well-worth the cost of a few extra create_distributed_table calls.