surfedushare / harvester

ETL pipeline and search engine for Edusources and Publinova
MIT License

Add integration tests with Celery background tasks #283

Open fako opened 11 months ago

fako commented 11 months ago

This old Celery test package describes the problem succinctly.

"Writing (integration) tests that depend on Celery tasks is problematic. When you manually run a Celery worker together with your tests, it runs in a separate process and there's no clean way to address objects targeted by Celery from your tests. When you use a separate test database (as with Django for example), you'll have to duplicate configuration code so your Celery worker accesses the same database."

Unfortunately, that 9-year-old package only works with Celery 3.x and we're currently on 5.x. Celery 4.x introduced a major settings refactor that is going to be a problem for the old package.

There is a modern package, but its Django integration is limited: it starts a worker connected to the wrong database, and we won't be able to patch or mock parts of the application. The package is also still very much under development, with the maintainer indicating he has been "taking his time" for three years now. That wouldn't be a problem if the package worked as advertised, but parts of it do not.

Instead of writing true integration tests, we'll be writing more unit tests that handle edge cases.
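To illustrate the unit-test approach, here is a minimal sketch that assumes the logic a Celery task would run is kept in a plain function, so it can be tested without a worker, broker, or second database connection. All names (`harvest_source`, `fetch_records`, `SourceUnavailable`) are hypothetical and not taken from the harvester codebase.

```python
from unittest import mock


class SourceUnavailable(Exception):
    """Hypothetical error for a harvest source that cannot be reached."""


def fetch_records(source):
    """Hypothetical network call; tests replace it with a mock."""
    raise NotImplementedError("replaced by a mock in tests")


def harvest_source(source, fetch=None):
    """Plain function holding the logic a Celery task would wrap (assumption)."""
    if fetch is None:
        fetch = fetch_records
    try:
        records = fetch(source)
    except SourceUnavailable:
        # Edge case: report a retry instead of letting the error escape.
        return {"source": source, "status": "retry", "count": 0}
    # Edge case: skip records that have no usable title.
    valid = [record for record in records if record.get("title")]
    return {"source": source, "status": "done", "count": len(valid)}


def test_skips_records_without_title():
    fetch = mock.Mock(return_value=[{"title": "a"}, {"title": ""}, {}])
    result = harvest_source("edusources", fetch=fetch)
    assert result == {"source": "edusources", "status": "done", "count": 1}
    fetch.assert_called_once_with("edusources")


def test_unreachable_source_requests_retry():
    fetch = mock.Mock(side_effect=SourceUnavailable())
    result = harvest_source("publinova", fetch=fetch)
    assert result == {"source": "publinova", "status": "retry", "count": 0}
```

The trade-off is the one described above: these tests exercise edge cases in the task logic, but they do not verify that a real worker picks up the task or talks to the right database.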

Nusnus commented 10 months ago

Hello @fako,

I came across your post here just now, and it seems you’ve gotten the wrong impression. Allow me to please clarify some details.

The current Celery testing infrastructure, a.k.a. v0.0.0, is a legacy codebase that wasn’t maintained and is very limited in the testing capabilities it provides. The new modern package, as you mentioned, is being developed to allow testing complicated environments using Docker containers, where the plugin takes care of configuring everything and the test case focuses only on the test scenario. The new version, v1.0.0, will be accompanied by a new layer of smoke tests in the celery repo itself, which will act both as actual tests and as an introduction to the new library.

Regarding the development, <v1.0.0 is only a bridge project for the codebase that sits inside the celery repo itself. The new version, which was started in March 2023, is approaching its beta phase and was developed as a new project with less than one year of development, per the GitHub commit log (screenshot attached).

P.S. I’ve got my eyes on Django 👀

fako commented 6 months ago

There is now a new release for pytest-celery that we should try to use for our integration testing.