dtcenter / METdataio

https://metdataio.readthedocs.io/en/latest/index.html
Apache License 2.0

Create additional tests to METdataio to increase code coverage #318

Closed bikegeek closed 2 hours ago

bikegeek commented 1 month ago

To be assigned to John Sharples once he has accepted the invitation to join this repository.

Describe the Task

METdataio, specifically the METdbLoad modules, requires additional tests to increase its current code coverage. Add appropriate tests, with particular focus on database loading.

Time Estimate

Estimate the amount of work required here. Issues should represent approximately 1 to 3 days of work.

Sub-Issues

Consider breaking the task down into sub-issues.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

Labels

Milestone and Projects

Define Related Issue(s)

Consider the impact to the other METplus components.

Task Checklist

See the METplus Workflow for details.

bikegeek commented 1 month ago

The additional tests contribute towards this issue: https://github.com/dtcenter/METplus-Internal/issues/50

John-Sharples commented 3 weeks ago

Thanks for creating this ticket for me, @bikegeek.

I've spent some more time reading through the METdbLoad code and propose the following approach for testing.

No database testing

Initially I'd like to write true "unit tests". These are tests that don't require a database and just check function behaviour. Since run_sql.py provides a convenient abstraction layer for all database reads/writes, we can create a mock of RunSql and use this to test all other modules.

Pros:

  1. Tests can run anywhere, without needing MySQL
  2. Avoids the overhead of setting up and managing a test database
  3. Quickest way to increase test coverage

Cons:

  1. Can't write meaningful tests for run_sql.py
  2. Can't test database interactions
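As a rough sketch of what the mock-based approach could look like, the example below uses `unittest.mock.MagicMock` in place of RunSql. The function `load_stat_rows`, the method name `write_rows`, and the table name `example_table` are hypothetical and only illustrate the pattern; they are not taken from the actual METdbLoad code, so the real method names on RunSql would need to be substituted.

```python
from unittest.mock import MagicMock


def load_stat_rows(sql_runner, table, rows):
    """Hypothetical stand-in for a METdbLoad function that writes via RunSql.

    All database access goes through sql_runner, so a mock can replace it.
    """
    if not rows:
        # Nothing to load, so the database layer should not be touched.
        return 0
    sql_runner.write_rows(table, rows)  # hypothetical method name
    return len(rows)


def test_rows_are_written_once():
    mock_sql = MagicMock(name="RunSql")
    rows = [{"fcst_lead": 6}, {"fcst_lead": 12}]

    count = load_stat_rows(mock_sql, "example_table", rows)

    assert count == 2
    # The mock records calls, so we can check what would have hit the database.
    mock_sql.write_rows.assert_called_once_with("example_table", rows)


def test_empty_input_skips_database():
    mock_sql = MagicMock(name="RunSql")

    count = load_stat_rows(mock_sql, "example_table", [])

    assert count == 0
    mock_sql.write_rows.assert_not_called()
```

Tests written this way check that the calling code passes the right data to the database layer, but they say nothing about whether the SQL itself is correct, which matches the cons listed above.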

Testing with a database

If we decide we want something more robust, we can have the tests operate on a real test database. This would be akin to "integration tests". To do this we'd need MySQL running in the test environment, and we'd need to write test fixtures that manage the database state for each test.

Pros:

  1. Tests are closer to real world functionality of METdbLoad
  2. Tests database-specific interactions (e.g. local-file configuration, database version)

Cons:

  1. Requires MySQL to be running in the test environment
  2. Requires test infrastructure to set up and manage the test database
  3. Greater dev effort required

The above approaches are not mutually exclusive. We can start with the no-database approach and transition to using a database later on. It's likely we could come up with an implementation that supports either approach, depending on whether MySQL is available. For example, the test infrastructure could use a real database when one is available and otherwise fall back to the mock RunSql.
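One possible shape for that fallback is sketched here. The helper names `mysql_available` and `make_sql_runner` are hypothetical, and the `connect` and `real_factory` callables are placeholders for whatever the project would actually use to open a MySQL connection and build a real RunSql object. In a pytest setup this logic would probably live inside a fixture.

```python
from unittest.mock import MagicMock


def mysql_available(connect):
    """Return True if the supplied connect() callable succeeds."""
    try:
        conn = connect()
    except Exception:
        # Any failure to connect is treated as "no database available".
        return False
    conn.close()
    return True


def make_sql_runner(connect, real_factory):
    """Return (runner, kind): a real runner if MySQL is reachable, else a mock.

    connect and real_factory are assumptions for this sketch, not METdbLoad APIs.
    """
    if mysql_available(connect):
        return real_factory(), "real"
    return MagicMock(name="RunSql"), "mock"


def failing_connect():
    """Simulates a test environment without MySQL."""
    raise ConnectionError("MySQL is not running")


runner, kind = make_sql_runner(failing_connect, real_factory=object)
print(kind)
```

Returning the kind alongside the runner would let individual tests skip database-only assertions when they are running against the mock.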

Let me know if you're happy with this approach or have any questions.