Closed: Trinaa closed this 2 weeks ago
@Trinaa Thanks for exploring this!
I tried it out locally on my 10-core machine w/ various configurations; they all trailed the sequential mode by about 20–30%, w/ some additional warnings and even failures (results below). My understanding is that, given Merino's current unit-test workload, the overhead of running tests in parallel still outweighs the gain.
I'd recommend putting this on hold for now; we can revisit it once the unit test suite grows large enough for parallelism to pay off.
WDYT?
Sequential: ~10 secs w/o warnings
Parallel w/ 1 worker: ~15 secs w/ 2 warnings
Parallel w/ 2 workers: ~12 secs w/ 1 failure & 3 warnings
Parallel w/ 5 workers: ~12 secs w/ 6 warnings
Parallel w/ 10 workers: ~14 secs w/ 11 warnings
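(For reference, a rough sketch of how these configurations can be reproduced locally, assuming pytest-xdist is the plugin providing the -n worker option; the exact command behind the numbers above isn't shown, so treat this as an illustration rather than the actual benchmark script.)

import subprocess
import time

# Compare a sequential run against several pytest-xdist worker counts.
# "tests/unit" matches the path seen in the traceback further down.
for label, extra_args in [
    ("sequential", []),
    ("1 worker", ["-n", "1"]),
    ("2 workers", ["-n", "2"]),
    ("5 workers", ["-n", "5"]),
    ("10 workers", ["-n", "10"]),
]:
    start = time.perf_counter()
    subprocess.run(["pytest", "tests/unit", *extra_args], check=False)
    print(f"{label}: {time.perf_counter() - start:.1f}s")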
@ncloudioj That's interesting. The results differ greatly between machines; the auto worker setting doesn't seem to be super optimized.
On CI, the performance right now seems equivalent.
I have another concern, and another motivation for going parallel, at least on CI: I found isolation issues with the tests. Keeping parallel execution gives us feedback on this problem and keeps us from building up technical debt. I'm concerned that you saw failures; are there perhaps more isolation issues?
> I'm concerned that you saw failures; are there perhaps more isolation issues?
Yes, I believe it's an isolation issue w/ logging (see details below). I'm more interested in running the integration tests in parallel, if we can sort out the isolation issues, since they take more resources and run longer than the unit tests. Though, offhand, I'm not sure how big the isolation hurdle is for the integration tests. I'd like to pick your brain on that :)
@pytest.mark.asyncio
async def test_query_error(caplog: LogCaptureFixture, keywords: dict[SupportedAddon, set[str]]):
"""Test that provider can handle query error."""
provider = AddonsProvider(
backend=AmoErrorBackend(),
keywords=keywords,
name="addons",
score=0.3,
min_chars=4,
)
await provider.initialize()
req = SuggestionRequest(query="dictionary", geolocation=Location())
suggestions = await provider.query(req)
assert suggestions == []
> assert len(caplog.messages) == 1
E assert 4 == 1
E + where 4 = len(["Task was destroyed but it is pending!\ntask: <T....
tests/unit/providers/amo/test_provider.py:167: AssertionError
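For what it's worth, one common way to harden caplog-based assertions against unrelated records (such as asyncio's "Task was destroyed but it is pending!") is to filter by logger name rather than counting every captured message. A self-contained sketch of the technique follows; it is not the actual Merino fix, and the logger names are made up for illustration.

import logging


def test_counts_only_provider_logs(caplog):
    """Toy example: assert only on records from the logger under test,
    so unrelated records captured during the same test don't change the count."""
    provider_logger = logging.getLogger("merino.providers.amo")  # hypothetical name
    with caplog.at_level(logging.WARNING):
        provider_logger.warning("Addons provider query failed")
        # Simulate an unrelated record like the one in the failure above.
        logging.getLogger("asyncio").error("Task was destroyed but it is pending!")
    provider_records = [r for r in caplog.records if r.name.startswith("merino.providers")]
    assert len(provider_records) == 1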
Okay @ncloudioj, I've removed the parallel execution for now, but I'd still like to push through the unit test isolation fixes. Sound good?
References
JIRA: DISCO-3049
Description
- test_favicon_downloader_handles_exception: fixed by setting the logging level
- test_normalize_query: fixed by correcting the parametrize syntax
- session_id_context: now reset between tests (see the sketch after this list)
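As a rough illustration of resetting a context variable between tests (the real session_id_context lives in Merino's codebase; it is redefined locally here so the sketch stands on its own, and the fixture name is made up):

import contextvars

import pytest

# Stand-in for Merino's session_id_context, defined locally for illustration.
session_id_context: contextvars.ContextVar[str | None] = contextvars.ContextVar(
    "session_id_context", default=None
)


@pytest.fixture(autouse=True)
def reset_session_id_context():
    """Reset the context variable around each test so a value set by one
    test cannot leak into the next, regardless of execution order."""
    token = session_id_context.set(None)
    yield
    session_id_context.reset(token)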
PR Review Checklist
Put an x in the boxes that apply:
- [ ] The PR title includes the JIRA reference in the format [DISCO-####] and has the same title (if applicable)
- [ ] [load test: (abort|warn)] keywords are applied (if applicable)