Integration tests for advanced_search

garethsime commented 2 months ago

I wanted to set a precedent for how to write search tests, so here are two tests that:

Test that we can get an empty result
Test that we can do a very simple title search

These aren't really "unit" tests in that they do a full test of everything from the controllers, to the models, to the views. I might try writing some more granular tests in the future, but, for now, something is better than nothing.

Another problem with these tests is that we don't reset the database between tests or anything, so it could pollute peoples local databases. (It doesn't bother me, I reset my db all the time, but others might not.)

redrun45 commented 2 months ago

The failed check would appear to be from a temporary error in the CI environment (failed to download a file in "TASK [acf_free : Install ACF free]"), rather than your new test. The test runs just fine over here!

If it's not much more work, it might be nice to clear these test projects once they've been found in the search. I tend to keep my database intact for a while, making projects that demonstrate the changes between the branches I'm working with. With that said, I don't think these generated projects are likely to get in my way. 😄

notartom commented 2 months ago

While I agree with @redrun45's point about not cleaning up after ourselves, just the fact that these tests are being proposed is awesome, and I'm tempted to merge them as is because something is better than nothing, and we can always add the cleanup in a later PR. But yeah, leaking state isn't great. @garethsime as the author, do you have a preference?

garethsime commented 2 months ago

I have three competing options in my head.

Option - Don't worry about it, just pollute the database

This is the easiest to get started with, but it can cause weird behaviour for people testing stuff locally. We don't enforce a lot of business rules at the model level, so it's easy to create entities that only work for your tests and completely explode when you try to do anything else with them (e.g. browse the UI).

This option also takes a little more care when writing the tests - you need to randomise IDs on every run or you risk having conflicts bleeding between tests. For example, in the first version of these tests, I set $unique_string to a fixed string, which worked great initially, but failed after the 25th run because the pagination changed 🙂

I tend to keep my database intact for a while, making projects that demonstrate the changes between the branches I'm working with.

Yeah, and I would guess you're not the only one, so I don't want to break that workflow too much. These scenarios could be rewritten as tests, but I respect that it can be super inconvenient depending on what changes you're making. (E.g. I find it hard to write tests that check certain HTML elements have the right structure or whatever. I guess you could write a temporary test class to set up the data and then rely on a manual verification for actually checking on each branch? Dunno.)

Maybe we can prefix everything with test- so it's easy to delete manually at least.

Option - Try clean up after every test

This one is tricky too, you have to make sure you delete everything you added, and if tests fail then clean-up jobs often end up in an uncertain state, so I don't see it as being a very reliable way to write things. One upside is that it's easy to look at a single file and see what is and isn't being cleaned up.

Option - Keep a separate database

We could have a global setup in the tests that sets up a separate librivox_catalog_test database that has the same structure as the librivox_catalog database. After each test, we TRUNCATE every table without mercy.

It should be pretty doable to run one config for development and one for testing, but the tricky bit will be getting the structure to be the same between the two. I was thinking I might be able to do some kind of mysqldump --no-data librivox_catalog and then apply that straight to librivox_catalog_test, but it'll take some toying around to get it right I suspect.

What to choose?

Keeping a separate database is my favourite option, but I won't have time for it this weekend. I don't know how urgently people want these example tests, but we've done without them for, well, ever, so I'm not in a huge rush personally.

notartom commented 2 months ago

I believe the last option is the way it's normally done.

LV is slightly weird in that there's no easy way to set up with a valid but empty database. Our code doesn't have "DB init" functionality, it just assumes that a database is there. I'm not even sure what such a database would look like? I guess everything can be empty except for one admin user?

If we had such a thing (either as a manually-created .sql dump of an empty database, or as some PHP code that creates it for us), we could have a database fixture in our tests that uses something like SQLite to set up and tear down a database instance for every test.

garethsime commented 3 weeks ago

I've had a wee go here: https://github.com/garethsime/librivox-catalog/tree/advanced-search-high-level-tests-with-database-reset

It works by setting up another database schema called librivox_catalog_test based on the schema from librivox_catalog, but it involves running shell commands and stuff that I don't know will exist in the test image/other devs machines, so I think it needs some reworking. Progress, though.

garethsime commented 3 weeks ago

I had a go using SQLite instead this morning. In terms of setup, it's very easy to get started with -- you just point your database config at the SQLite file and everything works.

The niggly bit is that SQLite3 and MariaDB aren't compatible in some places. This makes things harder in two ways:

Devs will have to write queries that both databases support, which can lead to ugly/inefficient outcomes
If someone writes something SQLite-specific, then the tests would still pass even though it wouldn't actually work (and vice versa)

Concretely, the syntax that didn't work when I was testing the search was:

TRUNCATE TABLE - We truncate the search table before repopulating it. It can be replaced with a DELETE FROM search_table, but it'd be slower than a TRUNCATE in production. (Maybe that's fine? It's on a cron job anyway.)
CONCAT - String concatenation uses || in SQLite3. MariaDB can be configured to use pipes, but we'd have to check which other queries might be using it in case they're relying on the old behavior.
ORDER BY FIELD(...) - SQLite3 doesn't support the FIELD function, though we should be able to replace it using a CASE statement
" vs. ' - MariaDB supports using double-quotes for string literals. SQLite tries to be compatible with this, with the caveat that if the literal is actually a valid identifier, then it'll use that instead. E.g. selecting "title" as blade was supposed to select the literal string title as the blade column, but in SQLite3 it treats it as an identifier and fills in the value from the title column.

All up, I'm not convinced that using SQLite for this is going to be a good thing.

I've pushed the branch here, so have a look and see what you make of it if you're still interested :slightly_smiling_face: https://github.com/garethsime/librivox-catalog/pull/new/advanced-search-high-level-tests-with-sqlite (I haven't tidied it up at all, but it's a proof-of-concept.)

LibriVox / librivox-catalog