Tools for making our tests easier to run. They automate setting up a cluster with Azure/CloudFormation and install a script that automates setting up Citus and everything required for testing it.
You can find more information about every step below in the other sections. The following list of commands shows how to get started quickly; please see the other sections to understand the details and to solve any problems you face.
Install az cli on your local machine to continue (see the official install instructions), then log the CLI in to your account:
az login
# List your subscriptions
az account list
# Pick the correct one from the list and run
az account set --subscription {uuid-of-correct-subscription}
If your subscriptions list doesn't contain Azure SQL DB Project Orcas - CitusData, contact someone who is authorized to add it.
You should use ssh-agent to add your ssh keys, which will be used to upload results to the release-test-results repository. Note that your keys are kept only in memory, so this step is secure.
# start ssh agent
eval `ssh-agent -s`
# Add your Github ssh key to upload results to release-test-results repo
ssh-add
You should set up your VPN to be able to connect to the Azure VM-s if your tests are not running on GHA. As of now, this consists of running routes.ps1 (Windows only; if you are developing on a Mac you should probably ping someone from the team for help). The script requires python to be installed.
In the azuredeploy.parameters.json file, you will see the parameters that you can change. For example, if you want to change the number of workers, change the numberOfWorkers parameter. You can change the machine types of the coordinator and the workers separately from the parameters file. By default, memory-intensive VMs (E-series) are used for the workers, while CPU-intensive VMs (D-series) are used for the coordinator.
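As a sketch of editing that file without extra tooling, the snippet below rewrites numberOfWorkers in a throwaway copy using python's stdlib json module. The machine-type parameter names shown here are hypothetical placeholders, not the file's real keys; check the actual azuredeploy.parameters.json for the real schema.

```shell
# Create a throwaway copy with a shape similar to the real parameters file.
# workersMachineType/coordinatorMachineType are illustrative names only.
cat > /tmp/azuredeploy.parameters.json <<'EOF'
{
  "parameters": {
    "numberOfWorkers": { "value": 2 },
    "workersMachineType": { "value": "Standard_E4s_v3" },
    "coordinatorMachineType": { "value": "Standard_D8s_v3" }
  }
}
EOF

# Edit the value with stdlib json (avoids a jq dependency).
python3 - <<'EOF'
import json

path = "/tmp/azuredeploy.parameters.json"
with open(path) as f:
    params = json.load(f)

# Bump the worker count from 2 to 4.
params["parameters"]["numberOfWorkers"]["value"] = 4

with open(path, "w") as f:
    json.dump(params, f, indent=2)
EOF
```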
After you run the tests, you can find the results in the results folder; the results folder will contain the name of the config used for the test.
The data will be stored on the attached disk, whose size can be configured in the parameters.
If you don't specify the region, a random region among eastus, westus2 and southcentralus will be chosen. This is to use resources uniformly across different regions.
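The random choice can be sketched locally as below; this is a hypothetical stand-in for whatever the deployment scripts actually do, shown only to make the behavior concrete.

```shell
# Pick one of the three regions pseudo-randomly (POSIX sh, no bash arrays).
REGIONS="eastus westus2 southcentralus"
N=$(( $(date +%s) % 3 + 1 ))
AZURE_REGION=$(echo "$REGIONS" | cut -d' ' -f"$N")
export AZURE_REGION
echo "$AZURE_REGION"
```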
Port 3456 is used for ssh; you can connect to any node via port 3456. If you use a different port, you will hit the security rules.
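Concretely, the connection looks like the command constructed below; the IP is a placeholder, and pguser is the login name used later in this document.

```shell
# Build the ssh invocation for a cluster node (port 3456, not the default 22).
PUBLIC_IP=203.0.113.10   # placeholder address
SSH_CMD="ssh -p 3456 pguser@${PUBLIC_IP}"
echo "$SSH_CMD"
```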
You will need to follow these steps to create a cluster and connect to it, on your local machine:
# in the session that you will use to ssh, set the resource group name
export RESOURCE_GROUP_NAME=give_your_name_citus_test_automation_r_g
# if you want to configure the region
# export AZURE_REGION=eastus2
# Go to the azure directory to have access to the scripts
cd azure
# open and modify the instance types/discs as you wish
less azuredeploy.parameters.json
# Quickly start a cluster with the defaults. This will create a resource group and use it for the cluster.
./create-cluster.sh
# connect to the coordinator
./connect.sh
After you are done with testing, you can run the following to delete the cluster and the relevant resource group:
# Delete the formation
# It's a good practice to check deletion status from the azure console
./delete-resource-group.sh
Depending on the tests you trigger here, you can block up to 3 job slots in GHA for around 3 hours. Choose the time you run the tests wisely so as not to block development.
If you want, you can trigger a job which runs pgbench, scale, tpch and extension tests. There is a separate job for each test, and you can run any combination of them. To trigger a job, create a branch whose name has one of the following prefixes:
pgbench/: the pgbench job will be triggered.
scale/: the scale job will be triggered.
tpch/: the tpch job will be triggered.
extension/: the extension job will be triggered.
all_performance_test/: all jobs will be triggered.
You should push your branch to GitHub so that the GHA job gets triggered.
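The prefix-to-job mapping above can be sketched as a small shell function; exactly which jobs all_performance_test/ covers is inferred from the text, so treat that branch of the function as an assumption.

```shell
# Map a branch name to the job(s) its prefix would trigger.
job_for_branch() {
  case "$1" in
    pgbench/*)              echo "pgbench" ;;
    scale/*)                echo "scale" ;;
    tpch/*)                 echo "tpch" ;;
    extension/*)            echo "extension" ;;
    all_performance_test/*) echo "pgbench scale tpch" ;;  # assumed job set
    *)                      echo "none" ;;
  esac
}

job_for_branch "pgbench/my-experiment"    # -> pgbench
job_for_branch "feature/unrelated-work"   # -> none
```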
Each job uses a specific resource group name, so there will be at most 3 resource groups for these jobs. If such a resource group already exists, you should make sure that it is not being used by a currently running test. If it is not, you can delete the resource group from the portal; you can find it by searching for the prefix citusbot. Under normal circumstances the resource group will already be deleted at the end of the test, even if it fails.
You can find your test results in https://github.com/citusdata/release-test-results under the periodic_job_results folder. Test results are pushed to a branch named in the format ${rg_name}/${month_day_year_uniqueID}.
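A sketch of that branch naming scheme follows; the resource group name and the uniqueID generation (using $$, the shell PID) are illustrative assumptions, not the tool's actual method.

```shell
# Construct a branch name of the form ${rg_name}/${month_day_year_uniqueID}.
rg_name="citusbot_pgbench_test_resource_group"   # assumed example name
branch_name="${rg_name}/$(date +%m_%d_%Y)_$$"
echo "$branch_name"
```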
By default the tests run against release-9.2 and the latest released version. If you want to test a custom branch, change the config files of the relevant tests to use your custom branch name:
postgres_citus_versions: [('12.1', 'your-custom-branch-name-in-citus'), ('12.1', 'release-9.1')]
Note: While you can run multiple tests by adding more elements to the array above, the results of the tests after the first might be inflated due to cache hits (this depends on the tests being run and the type of disks used by the VM-s). For the fairest possible comparisons, consider running the tests separately.
You can change all the settings in these files, the config files for tests are located at:
By default, the following configs are run for each test type:
pgbench_default.ini
and pgbench_default_without_transaction.ini
scale_test.ini
tpch_default.ini
extension_default.ini
If you don't want to use the default cluster settings (instance types etc.), you can change them in https://github.com/citusdata/test-automation/blob/master/azure/azuredeploy.parameters.json.
If you want to change how long each test runs, you can change the duration with the -T parameter. https://github.com/citusdata/test-automation/blob/master/fabfile/pgbench_confs/pgbench_default.ini#L33
pgbench_command: pgbench -c 32 -j 16 -T <test time in seconds> -P 10 -r
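For instance, to make each run last 10 minutes, only the -T value in that config line changes:

```ini
pgbench_command: pgbench -c 32 -j 16 -T 600 -P 10 -r
```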
Important: Push your branch to the github repo even though the HammerDB tests are run from your local machine. The initializer script used to set up the Azure VM-s will pull your branch from github, not from your local machine.
Hammerdb tests are run from a driver node, which is in the same virtual network as the cluster.
You can customize the hammerdb cluster in the hammerdb folder using hammerdb/azuredeploy.parameters.json.
Note that this is the configuration for the cluster, which is separate from the benchmark configurations (fabfile/hammerdb_confs/).
You can add as many configs as you want to the fabfile/hammerdb_confs folder, and the automation tool will run the benchmark for each config. So if you want to compare two branches, you can create two identical config files with two different branches. (Note that you can also use git refs instead of branch names.)
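For example, a two-branch comparison could look like the pair of files below. The key names here are hypothetical; copy an existing file in fabfile/hammerdb_confs for the real schema, and keep everything except the ref identical between the two files.

```ini
; fabfile/hammerdb_confs/master.ini (illustrative key names only)
[DEFAULT]
pg_version: 14.5
citus_ref: master

; fabfile/hammerdb_confs/my-feature.ini: identical except for the ref
[DEFAULT]
pg_version: 14.5
citus_ref: my-feature-branch
```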
Even though the script will vacuum the tables in each iteration to get more accurate results, the disk cache is likely to inflate the results of the tests running after the first config file, so for the most unbiased results test the setups separately (repeat this procedure once per config).
The result logs will contain the config file so that it is easy to know which config was used for a run.
After adding the configs, fabfile/hammerdb_confs could look like:
In order to run hammerdb benchmark:
eval `ssh-agent -s`
ssh-add
export RESOURCE_GROUP_NAME=<your resource group name>
export GIT_USERNAME=<Your github username>
export GIT_TOKEN=<Your github token with repo, write:packages and read:packages permissions> # You can create a github token from https://github.com/settings/tokens
cd hammerdb
# YOU SHOULD CREATE A NEW BRANCH AND CHANGE THE SETTINGS/CONFIGURATIONS IN THE NEW BRANCH
# AND PUSH THE BRANCH SO THAT WHEN THE TOOL CLONES THE REPOSITORY
# IT CAN DOWNLOAD YOUR BRANCH.
vim fabfile/hammerdb_confs/<branch_name>.ini # verify that your custom config file is correct
./create-run.sh
# you will be given a command to connect to the driver node and what
# to run afterwards.
After running ./create-run.sh you do not have to stay connected to the driver node; it will take care of the rest for you.
The cluster deployment is flaky and sometimes fails. This is somewhat rare, so it is not a big problem; in that case, simply delete the previous resource group and try again. You can do that with:
# running from the same shell where you called create-run.sh to start the test
../azure/delete-resource-group.sh
./create-run.sh
If the failure is persistent, some policy might have changed on Azure, so either consider debugging the issue or open an issue in test-automation.
The cluster will be deleted if everything goes okay, but you should check that it is deleted to be on the safe side. (If it is not, you can delete it with azure/delete-resource-group.sh or from the portal.)
In order to see the progress of the tests, on the driver node:
./connect-driver.sh
screen -r
You can see the screen logs in ~/screenlog.0.
You will see the results in a branch hammerdb_date_id in https://github.com/citusdata/release-test-results.
You won't get any notification for the results, so you will need to check manually.
What files are pushed to github:
hammerdb/build.tcl creates and fills the hammerdb tpcc tables. You should have at least a 1:5 ratio for vuuser:warehouse_count, otherwise build.tcl might get stuck.
hammerdb/run.tcl runs the tpcc benchmark. You can configure things such as test duration here.
Note that running a benchmark with a single config file with 250 vuusers and 1000 warehouses could take around 2-3 hours (the whole process).
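The 1:5 vuuser:warehouse_count rule above can be sketched as a small pre-flight check; this helper is not part of the repo, just an illustration of the constraint.

```shell
# Guard for the "at least 1:5 vuuser:warehouse_count" rule.
check_ratio() {
  vuusers=$1
  warehouses=$2
  if [ "$warehouses" -ge $(( vuusers * 5 )) ]; then
    echo "ok"
  else
    echo "too few warehouses: need at least $(( vuusers * 5 ))"
  fi
}

check_ratio 100 1000   # -> ok
check_ratio 250 1000   # -> too few warehouses: need at least 1250
```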
If you want to run only the tpcc benchmark or only the analytical queries, change the is_tpcc and is_ch variables in create-run.sh. For example, to run only tpcc benchmarks, set is_tpcc to true and is_ch to false (alternatively, see the IS_CH and IS_TPCC environment variables). When you are only running the analytical queries, you can also specify how long they should run by changing the DEFAULT_CH_RUNTIME_IN_SECS variable in build-and-run.sh. By default they run for 3600 seconds.
You can change the thread count and initial sleep time for the analytical queries in build-and-run.sh with the CH_THREAD_COUNT and RAMPUP_TIME variables respectively.
If you want to run hammerdb 4.0, change hammerdb_version to 4.0 in create-run.sh.
By default a random region will be used; if you want, you can specify the region with the AZURE_REGION environment variable prior to running create-run.sh, e.g. export AZURE_REGION=westus2.
Currently, only testing compatibility with jdbc is automated.
To run it, create a branch called jdbc/{whatever-you-want} and push it to origin.
The citus branch and jdbc version can be configured from the JDBC Config.
For more details, read the JDBC README.
On the coordinator node:
# Setup your test cluster with PostgreSQL 12.1 and Citus master branch
fab use.postgres 12.1 use.citus master setup.basic-testing
# Let's change some conf values
fab pg.set-config max_wal_size "'5GB'"
fab pg.set-config max_connections 1000
# And restart the cluster
fab pg.restart
If you want to add the coordinator to the cluster, you can run:
fab add.coordinator-to-metadata
If you want the coordinator to have shards, you can run:
fab add.shards-on-coordinator
On the coordinator node:
# This will run default pgBench tests with PG=12.1 and Citus 9.2 and 8.3 release branches
# and it will log results to pgbench_results_{timemark}.csv file
# Yes, that's all :) You can change settings in fabfile/pgbench_confs/pgbench_default.ini
fab run.pgbench-tests
# It's possible to provide another configuration file for tests
# Such as with this, we run the same set of default pgBench tests without transactions
fab run.pgbench-tests --config-file=pgbench_default_without_transaction.ini
On the coordinator node:
# This will run scale tests with PG=12.1 and Citus 9.2 and 8.3 release branches
# and it will log results to pgbench_results_{timemark}.csv file
# You can change settings in files under the fabfile/pgbench_confs/ directory
fab run.pgbench-tests --config-file=scale_test.ini
fab run.pgbench-tests --config-file=scale_test_no_index.ini
fab run.pgbench-tests --config-file=scale_test_prepared.ini
fab run.pgbench-tests --config-file=scale_test_reference.ini
fab run.pgbench-tests --config-file=scale_test_foreign.ini
fab run.pgbench-tests --config-file=scale_test_100_columns.ini
You can execute a PG extension's regression tests with any combination of other extensions created in the database. The purpose of these tests is to figure out whether any test fails due to having those extensions together. Currently we only support extensions whose tests can be run by pg_regress; we do not support extensions whose tests are run by other tools (e.g. TAP tests).
Here is the schema for main section:
[main]
postgres_versions: [<string>] specifies Postgres versions for which the test should be repeated
extensions: [<string>] specifies the extensions for which we give information
test_count: <integer> specifies the total number of test scenarios that use any of the extensions defined below
[main]
postgres_versions: ['14.5']
extensions: ['citus', 'hll', 'topn', 'tdigest', 'auto_explain']
test_count: 4
Here is the schema for an extension definition:
[<string>] specifies the extension name (that should be the same name with the extension name used in 'create extension <extension_name>;')
contrib: <bool> specifies if the extension exists in contrib folder under Postgres (we do not install if it is a contrib extension because it is bundled with Postgres)
preload: <bool> specifies if we should add the extension into shared_preload_libraries
create: <bool> specifies if we should create extension in database (for example: 'create extension auto_explain;' causes error because it does not add any object)
configure: <bool> specifies if the installation step has a configure step (i.e. ./configure)
repo_url: <string> specifies repo url for non-contrib extension
git_ref: <string> specifies repo branch name for non-contrib extension
relative_test_path: <string> specifies relative directory in which pg_regress will run the tests
conf_string: <string> specifies optional postgres.conf options
post_create_hook: <string> specifies optional method name to be called after the extension is created. You should implement the hook in fabfile/extension_hooks.py.
[tdigest]
contrib: False
preload: False
create: True
configure: False
repo_url: https://github.com/tvondra/tdigest.git
git_ref: v1.4.0
relative_test_path: .
Here is the schema for a test case:
[test-<integer>] specifies the test name
ext_to_test: <string> specifies the extension to be tested
dep_order: <string> specifies the shared_preload_libraries string order
test_command: <string> specifies the test command
conf_string: <string> specifies the postgres configurations to be used in the test
[test-4]
ext_to_test: citus
dep_order: citus,auto_explain
test_command: make check-vanilla
conf_string: '''
auto_explain.log_min_duration=0
auto_explain.log_analyze=1
auto_explain.log_buffers=1
auto_explain.log_nested_statements=1
'''
On the coordinator node:
# This will run default extension tests with PG=14.5
# Yes, that's all :) You can change settings in fabfile/extension_confs/extension_default.ini
fab run.extension-tests
# It's possible to provide another configuration file for tests
fab run.extension-tests --config-file=[other_config.ini]
Note: You should export EXTENSION_TEST=1 before running create-cluster.sh if you plan to run extension tests.
On the coordinator node:
# Use pgbench_cloud.ini config file with connection string of your Hyperscale (Citus) cluster
# Don't forget to escape `=` at the end of your connection string
fab run.pgbench-tests --config-file=pgbench_cloud.ini --connectionURI='postgres://citus:HJ3iS98CGTOBkwMgXM-RZQ@c.fs4qawhjftbgo7c4f7x3x7ifdpe.db.citusdata.com:5432/citus?sslmode\=require'
Important Note: conf_string is optional both for extension and test case definitions.
On the coordinator node:
# This will run TPC-H tests with PG=12.1 and Citus 9.2 and 8.3 release branches
# and it will log results to their own files on the home directory. You can use diff to
# compare results.
# You can change settings in files under the fabfile/tpch_confs/ directory
fab run.tpch-automate
# If you want to run only Q1 with scale factor=1 against community master,
# you can use this config file. Feel free to edit conf file
fab run.tpch-automate --config-file=tpch_q1.ini
On the coordinator node:
# Provide your tpch config file or go with the default file
# Don't forget to escape `=` at the end of your connection string
fab run.tpch-automate --config-file=tpch_q1.ini --connectionURI='postgres://citus:dwVg70yBfkZ6hO1WXFyq1Q@c.fhhwxh5watzbizj3folblgbnpbu.db.citusdata.com:5432/citus?sslmode\=require'
TL;DR
# set the appropriate az account subscription
az account set --subscription <subscriptionId>
# setup the ssh-agent and pass your credentials to it so the azure VM-s
# will be setup to allow ssh connection requests with your public key
eval `ssh-agent -s`
ssh-add
# 1 # start valgrind test
# create valgrind instance to run
export RESOURCE_GROUP_NAME='your-valgrind-test-rg-name-here'
export VALGRIND_TEST=1
cd azure
./create-cluster.sh
# connect to coordinator
./connect.sh
# run fab command in coordinator in a detachable session
#
# Note that you can use any valid schedule name for regression, isolation or failure tests here
tmux new -d "fab use.postgres 15.2 use.citus release-11.1 run.valgrind multi_1_schedule"
# simply exit from coordinator after detaching
# 2 # finalize valgrind test
# reconnect to coordinator after 9.5 hours (if you preferred default coordinator configuration)
export RESOURCE_GROUP_NAME='your-valgrind-test-rg-name-here'
eval `ssh-agent -s`
ssh-add
cd azure
./connect.sh
# you can first check if valgrind test is finished by attaching to tmux session
tmux a
# then you should detach from the session before moving forward
Ctrl+b d
# run push results script
cd test-automation/azure
./push-results.sh <branch name you prefer to push results>
# simply exit from coordinator after pushing the results
# delete resource group finally
cd azure
./delete-resource-group.sh
DETAILS:
To create a valgrind instance, follow the steps in Setup Steps For Each Test, but do the following before executing create-cluster.sh:
eval `ssh-agent -s`
ssh-add
export VALGRIND_TEST=1
Setting VALGRIND_TEST makes the numberOfWorkers setting irrelevant. This is because we will already be using our regression test structure, which creates a local cluster itself. Also, since we install valgrind only on the coordinator, having worker nodes would break the PostgreSQL build: it would require valgrind on the workers and error out even though we do not need it there.
Also, create-cluster.sh uses the first public key it finds in the ssh-agent to set up ssh authentication for the Azure VM-s, so if the ssh-agent is not up or doesn't have your credentials, you won't be able to ssh into the VM-s.
On the coordinator node:
# an example usage: Use PostgreSQL 15.2 and run enterprise failure tests with valgrind support on citus/release-11.1
fab use.postgres 15.2 use.citus release-11.1 run.valgrind enterprise_failure_schedule
However, as valgrind tests take a long time to complete, we recommend running them in a detached session:
# Note that you can use any valid schedule name for regression, isolation or failure tests here
tmux new -d "fab use.postgres 15.2 use.citus release-11.1 run.valgrind multi_1_schedule"
After the tests are finished (this takes up to 9 hours with the default coordinator size), re-connect to the coordinator. Results can be found under the $HOME/results directory.
To push the results to the release-test-results repository, run the command below on the coordinator node:
sh $HOME/test-automation/azure/push-results.sh <branch_name_to_push>
Finally, delete your resource group.
Use fab --list to see all the tasks you can run! These are just a few examples.
Once you have a cluster you can use many different variations of the fab command to install Citus:
fab --list will return a list of the tasks you can run.
fab setup.basic-testing will create a vanilla cluster with postgres and citus. Once this has run you can simply run psql to connect to it.
fab use.citus v7.1.1 setup.basic-testing will do the same, but use the tag v7.1.1 when installing Citus. You can give it any git ref; it defaults to master.
fab use.postgres 10.1 setup.basic-testing lets you choose your postgres version.
fab use.citus release-9.2 setup.citus will install postgres and the release-9.2 branch of the citus repo.
When you run a command like fab use.citus v7.1.1 setup.basic-testing you are running two different tasks: use.citus with a v7.1.1 argument, and setup.basic-testing. Those tasks are always executed from left to right, and running them is usually equivalent to running them as separate commands. For example:
# this command:
fab setup.basic-testing add.tpch
# has exactly the same effect as this series of commands:
fab setup.basic-testing
fab add.tpch
An exception is the use namespace: tasks such as use.citus and use.postgres only have an effect on the current command:
# this works:
fab use.citus v7.1.1 setup.basic-testing
# this does not work:
fab use.citus v7.1.1 # tells fabric to install v7.1.1, but only works during this command
fab setup.basic-testing # will install the master branch of citus
use tasks must come before setup tasks:
# this does not work!
# since the `setup` task is run before the `use` task the `use` task will have no effect
fab setup.basic-testing use.citus v.7.1.1
Finally, there are tasks, such as the ones in the add namespace, which assume a cluster is already installed and running. They must be run after a setup task!
use Tasks
These tasks configure the tasks you run after them. When run alone they have no effect. Some examples:
fab use.citus v7.1.1 setup.basic-testing
fab use.citus release-9.2 setup.citus
fab use.debug-mode use.postgres 10.1 use.citus v7.1.1 setup.basic-testing
use.debug-mode passes the following flags to postgres' configure: --enable-debug --enable-cassert CFLAGS="-ggdb -Og -g3 -fno-omit-frame-pointer"
use.asserts passes --enable-cassert; it's a subset of use.debug-mode.
add Tasks
It is possible to add extra extensions and features to a Citus cluster:
fab add.tpch --scale-factor=1 --partition-type=hash
will generate and copy tpch tables.
The default scale factor is 10. The default partition type is reference for nation, region and supplier, and hash for the remaining tables. If you set the partition type to 'hash' or 'append', all the tables will be created with that partition type.
fab add.session_analytics will build and install the session_analytics package (see the instructions above for information on how to check out this private repo).
For a complete list, run fab --list.
As described above, you can run these at the same time as you run setup tasks:
fab use.citus v7.1.1 setup.citus add.shard_rebalancer does what you'd expect.
pg Tasks
These tasks run commands which involve the current postgres instance.
fab pg.stop will stop postgres on all nodes.
fab pg.restart will restart postgres on all nodes.
fab pg.start: guess what this does :)
fab pg.read-config [parameter] will run SHOW [parameter] on all nodes. For example:
fab pg.read-config max_prepared_transactions
If you want to use a literal comma in a command, you must escape it (this applies to all fab tasks):
fab pg.set-config shared_preload_libraries 'citus\,cstore_fdw'
Using pg.set-config it's possible to get yourself into trouble. pg.set-config uses ALTER SYSTEM, so if you've broken your postgres instance so badly that it won't boot, you won't be able to use pg.set-config to fix it.
To reset to a clean configuration run this command:
fab -- rm pg-latest/data/postgresql.auto.conf
run Tasks
In order to run pgbench and tpch tests automatically, you can use run.pgbench-tests or run.tpch-automate. If you want to use the default configuration files, running the commands without any parameters is enough.
To change the configuration file for pgbench tests, prepare a configuration file similar to fabfile/pgbench_confs/pgbench_config.ini.
To change the configuration file for tpch tests, prepare a configuration file similar to fabfile/tpch_confs/tpch_default.ini.
By default your fab commands configure the entire cluster; however, you can target individual machines.
fab -H 10.0.1.240 pg.start will start pg on that specific node.
You can also run arbitrary commands by adding them after --.
fab -H 10.0.1.240 -- 'echo "max_prepared_transactions=0" >> pg-latest/data/postgresql.conf' will modify the postgresql.conf file on the specified worker.
fab -- 'cd citus && git checkout master && make install' will switch the branch of Citus you're using. (This runs on all nodes.)
pg-latest
Some kinds of tests (such as TPC-H) are easier to perform if you create multiple simultaneous installations of Citus and are able to switch between them. The fabric scripts allow this by maintaining a symlink called pg-latest.
Most tasks which interact with a postgres installation (such as add.cstore
or pg.stop
)
simply use the installation in pg-latest
. Tasks such as setup.basic-testing
which
install postgres will overwrite whatever is currently in pg-latest
.
You can change where pg-latest
points by running fab set-pg-latest some-absolute-path
. For
example: fab set-pg-latest $HOME/postgres-installation
. Using multiple
installations is a matter of changing your prefix whenever you want to act upon or create
a different installation.
Here's an example:
fab set-pg-latest $HOME/pg-960-citus-600
fab use.postgres 9.6.0 use.citus v6.0.0 setup.basic-testing
fab set-pg-latest $HOME/pg-961-citus-601
fab use.postgres 9.6.1 use.citus v6.0.1 setup.basic-testing
# you now have 2 installations of Citus!
fab pg.stop # stop the existing Citus instance
fab set-pg-latest $HOME/pg-960-citus-600 # switch to using the new instance
fab pg.start # start the new instance
# now you've switched back to the first installation
# the above can be abbreviated by writing the following:
fab pg.stop set-pg-latest $HOME/pg-960-citus-600 pg.start
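The mechanism behind set-pg-latest can be sketched locally: pg-latest is just a symlink that the fab tasks follow, so switching installations amounts to re-pointing the link. The /tmp paths below are throwaway stand-ins for real installation directories.

```shell
# Two fake "installations" and a pg-latest style link.
mkdir -p /tmp/pg-demo/pg-960-citus-600 /tmp/pg-demo/pg-961-citus-601
ln -sfn /tmp/pg-demo/pg-960-citus-600 /tmp/pg-demo/pg-latest
readlink /tmp/pg-demo/pg-latest    # -> /tmp/pg-demo/pg-960-citus-600

# "fab set-pg-latest <path>" amounts to re-pointing the link:
ln -sfn /tmp/pg-demo/pg-961-citus-601 /tmp/pg-demo/pg-latest
readlink /tmp/pg-demo/pg-latest    # -> /tmp/pg-demo/pg-961-citus-601
```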
Currently, test automation has a lot of dependencies, such as fabfile, azure and more. In general failures are temporary, though they may last as long as a few days (if the problem is on an Azure service). In that case there is nothing we can do, but sometimes there are other problems that we can fix, and it is useful to try some of the following steps:
Even if the creation of a cluster fails, you can still see the logs and what caused the problem:
ssh pguser@<public_ip>
# switch to root, since pguser doesn't have access to the logs
sudo su root
# the logs are under:
cd /var/lib/waagent/custom-script/download/0
# check stderr or stdout to see what went unexpectedly
Updating az cli is also usually a good option; follow the installation instructions in https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-linux to update your local az cli installation.
If you suspect that a particular az foo bar command doesn't work as expected, you can also add --debug to have a closer look.
If you're consistently getting connection timeout errors (255) when trying to connect to a VM, consider setting the AZURE_REGION environment variable to eastus. This error is likely due to connection policy issues. As of now, setting up your VPN properly should fix this issue.
While running on Azure VM-s there might be deployment errors (go to your resource group overview in the portal). This might be caused by changing network security policies in Azure. The error message of the deployment failure should show the conflicting policies. You can then go to the azuredeploy.json file for your test and try to change the priority of the custom policies (search for "priority" in the file) until there are no conflicts.