UUDigitalHumanitieslab / I-analyzer

The great textmining tool that obviates all others
https://ianalyzer.hum.uu.nl
MIT License

Backend testing: allow postgres database access in elasticsearch indexing fixtures #1533

Closed by lukavdplas 2 months ago

lukavdplas commented 3 months ago

Currently, indexing operations access Elasticsearch but never the SQL database. However, if we add database-only corpora, that obviously needs to change.

This causes issues for unit tests. The way pytest-django works, you can't access the SQL database in a session-scoped test fixture, but our unit tests rely on session-scoped fixtures to create Elasticsearch indices for test corpora.
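For reference, pytest-django documents one escape hatch for this: a session-scoped fixture can request `django_db_setup` and `django_db_blocker` and wrap its database work in `django_db_blocker.unblock()`. Below is a minimal sketch of what that could look like, not a worked-out proposal. The names `run_unblocked`, `index_test_corpora` and `indexed_corpora` are hypothetical placeholders and not existing code in this repository.

```python
import pytest


def run_unblocked(db_blocker, operation):
    """Run `operation` while pytest-django's database blocker is lifted."""
    with db_blocker.unblock():
        return operation()


def index_test_corpora():
    """Hypothetical placeholder for indexing code that also queries the SQL database."""
    return ['test-corpus']


@pytest.fixture(scope='session')
def indexed_corpora(django_db_setup, django_db_blocker):
    # django_db_setup makes sure the test database exists before we touch it;
    # django_db_blocker lets a session-scoped fixture access it explicitly.
    indexed = run_unblocked(django_db_blocker, index_test_corpora)
    yield indexed
    # Assumption: any rows written above need explicit cleanup here, because
    # they are created outside the per-test transaction.
```

One caveat with this approach: anything the fixture writes to the database happens outside the transaction that wraps each test, so it is not rolled back automatically and can leak between tests unless the fixture cleans up after itself.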

One reason why indexing fixtures are session-scoped is that otherwise, the test time really builds up. (On my machine, I think it goes from 2 minutes to 4-5 minutes.) This is partly because the operations take time, and partly because Elasticsearch is near-real-time: newly indexed documents are not searchable immediately, so you need a "buffer period" after populating an index.
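As an aside on the "buffer period": one way to keep it short is to poll until the index reports the expected number of documents, rather than sleeping for a fixed time. The sketch below shows the idea with a hypothetical helper, `wait_for_document_count`; it takes any callable that returns the current count, so it makes no assumptions about how our fixtures talk to Elasticsearch.

```python
import time


def wait_for_document_count(count_documents, expected, timeout=10.0, interval=0.1):
    """
    Poll `count_documents()` until it returns at least `expected`.

    Returns the final count, or raises TimeoutError if the count is still
    too low once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        count = count_documents()
        if count >= expected:
            return count
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f'expected {expected} documents, found {count} after {timeout}s'
            )
        time.sleep(interval)
```

With the Python Elasticsearch client, the callable could be something like `lambda: client.count(index=name)['count']`, where `client` and `name` are assumed to come from the fixture. Calling `client.indices.refresh(index=name)` right after populating the index may remove most of the waiting in the first place.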

We should find a solution for indexing fixtures that allows them to access the database.

Proposed solution:

Alternatives: