BritishGeologicalSurvey / etlhelper

ETL Helper is a Python ETL library to simplify data transfer into and out of databases.
https://britishgeologicalsurvey.github.io/etlhelper/
GNU Lesser General Public License v3.0
99 stars 25 forks source link

Run PostgreSQL integration tests in CI #177

Closed volcan01010 closed 10 months ago

volcan01010 commented 10 months ago

Summary

As an ETL Helper developer, I want the PostgreSQL test suite to run in the CI pipelines so that external contributions can easily be validated.

Description

The test suite for ETL Helper includes integration tests for SQLite, PostgreSQL, MS SQL Server and Oracle. The MS SQL Server and Oracle tests require a licenced database to run against, so they are run internally at BGS. However, the PostgreSQL and SQLite tests do not. The majority of the integration tests are written against PostgreSQL and these cover the application logic. Tests for other databases confirm the database drivers and differences in configuration.

If we can run the PostgreSQL tests in the GitHub CI, that would allow us to check code at the level of a merge request, including code submissions from other contributors. This would greatly reduce the chance of a merge request passing the GitHub CI but failing internally.

This GitHub manual explains how to set up PostgreSQL to run in a Docker container in GitHub actions: https://docs.github.com/en/actions/using-containerized-services/creating-postgresql-service-containers

The PostgreSQL container version, database name and other credentials that are required for the tests for our internal runs are defined in our internal .gitlab-ci.yml and CI environment variables.

Acceptance criteria

CI test suite includes the following test folders:

leorudczenko commented 10 months ago

I believe I've managed to get the workflow to work with a postgres image, here is the latest test run:

https://github.com/BritishGeologicalSurvey/etlhelper/actions/runs/5965566004/job/16183282865

Many of the tests are hitting an ERROR because the environment variables are not set in the GitHub workflow: etlhelper.exceptions.ETLHelperConnectionError: Password environment variable (TEST_PG_PASSWORD) is not set

I do not have permissions to modify these, @volcan01010 can you update these appropriately?

Moreover, since we are including sqlite3 tests, this error has come up again too: etlhelper.exceptions.ETLHelperConnectionError: Could not import sqlite3 module required for SQLite connections. Check Python configuration - this should be part of Standard Library.

This stackoverflow post is the closest thing I have found to our sqlite3 problem, but it isn't what we need I don't think: https://stackoverflow.com/questions/72165451/pytest-is-failing-on-github-actions-but-succeeds-locally

Some answers to this post may be more relevant: https://stackoverflow.com/questions/19530974/how-can-i-add-the-sqlite3-module-to-python

volcan01010 commented 10 months ago

For the SQLite issue, we may need to just skip the SQLite tests for now. The PostgreSQL ones cover more of the logic anyway.

For the PostgreSQL, we can hard-code the environment variables into the CI config YAML. It is only a disposable test database so we don't need to keep them secret. The values are:

env:
  POSTGRES_DB: etlhelper
  POSTGRES_USER: etlhelper_user
  POSTGRES_PASSWORD: etlhelper_pw
  POSTGRES_HOST: postgis

For the PostGIS container, we should specify this version: postgis/postgis:15-3.4

leorudczenko commented 10 months ago

Well we've progressed from one issue to another! Now most of the tests are passing, but many are failing due to this:

etlhelper.exceptions.ETLHelperConnectionError: Error connecting to DbParams(host='localhost', port='5432', dbname='etlhelper', user='etlhelper_user', dbtype='PG') via dbapi: connection to server at "localhost" (::1), port 5432 failed: Connection refused
    Is the server running on that host and accepting TCP/IP connections?

You can see the workflow run here: https://github.com/BritishGeologicalSurvey/etlhelper/actions/runs/5977922960/job/16218951792

volcan01010 commented 10 months ago

Closing this as has been merged into for_v1.