mozilla-services / merino-py

Web Service for Firefox Suggest
Mozilla Public License 2.0

[DISCO-3049] Fix isolation issues in unit tests #680

Closed · Trinaa closed this 2 weeks ago

Trinaa commented 3 weeks ago

References

JIRA: DISCO-3049

Description


ncloudioj commented 3 weeks ago

@Trinaa Thanks for exploring this!

I tried it out locally on my 10-core machine with various configurations; they all trailed the sequential mode by about 20–30%, with some additional warnings and even failures. My understanding is that the overhead of running tests in parallel still outweighs the gain, given Merino's current unit-test workload.

I'd recommend putting this on hold for now; we can revisit it once the unit test suite grows large enough to materialize the benefit.

WDYT?

Sequential: ~10 secs w/o warnings

Parallel w/ 1 worker: ~15 secs w/ 2 warnings
Parallel w/ 2 workers: ~12 secs w/ 1 failure & 3 warnings
Parallel w/ 5 workers: ~12 secs w/ 6 warnings
Parallel w/ 10 workers: ~14 secs w/ 11 warnings
Trinaa commented 3 weeks ago

> @Trinaa Thanks for exploring this! [...] I'd recommend putting this on hold for now [...] WDYT?

@ncloudioj That's interesting. The results differ greatly between machines; the `auto` worker setting doesn't seem to be super optimized. On CI, the performance right now looks equivalent.

I have another concern, and a motivation for staying parallel at least on CI: I found isolation issues in the tests. By keeping parallel execution we get feedback on this problem and don't build up technical debt. I'm concerned that you saw failures; are there more isolation issues, perhaps?
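
A typical shape of such an isolation issue, as a minimal hypothetical sketch (the module-level cache and fixture below are illustrative, not Merino code): state that one test mutates and a later test implicitly depends on, which only surfaces when execution order changes.

```python
import pytest

# Hypothetical module-level state: any test that mutates this leaks
# into whichever test runs next on the same worker.
_SUGGESTION_CACHE: dict[str, list[str]] = {}

def lookup(query: str) -> list[str]:
    """Return cached suggestions for a query (illustrative only)."""
    return _SUGGESTION_CACHE.get(query, [])

@pytest.fixture(autouse=True)
def reset_cache():
    """Give every test a clean cache regardless of execution order."""
    _SUGGESTION_CACHE.clear()
    yield
    _SUGGESTION_CACHE.clear()
```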

ncloudioj commented 3 weeks ago

> I'm concerned that you saw failures; are there more isolation issues, perhaps?

Yes, I believe it's an isolation issue with logging (see details below). I'm more interested in running the integration tests in parallel if we can sort out the isolation issues, since integration tests take more resources and run longer than the unit tests. Though, offhand, I'm not sure how bad the isolation hurdle is for them. Would like to pick your brain on that :)

    @pytest.mark.asyncio
    async def test_query_error(caplog: LogCaptureFixture, keywords: dict[SupportedAddon, set[str]]):
        """Test that provider can handle query error."""
        provider = AddonsProvider(
            backend=AmoErrorBackend(),
            keywords=keywords,
            name="addons",
            score=0.3,
            min_chars=4,
        )
        await provider.initialize()

        req = SuggestionRequest(query="dictionary", geolocation=Location())
        suggestions = await provider.query(req)
        assert suggestions == []

>       assert len(caplog.messages) == 1
E       assert 4 == 1
E        +  where 4 = len(["Task was destroyed but it is pending!\ntask: <T....

tests/unit/providers/amo/test_provider.py:167: AssertionError
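
One hedged way to make that assertion robust, assuming the noise comes from asyncio's own logger (the "Task was destroyed but it is pending!" warning is emitted under the `asyncio` logger): count only the records the test actually cares about.

```python
# Sketch, not the fix adopted in this PR: ignore records emitted by
# asyncio's logger so leaked task-destruction warnings from earlier
# tests don't inflate the count.
records = [r for r in caplog.records if r.name != "asyncio"]
assert len(records) == 1
```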
Trinaa commented 3 weeks ago

Okay @ncloudioj, I removed the parallel execution for now, but I'd still like to push through the unit test isolation fixes. Sound okay?
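
For the isolation fixes themselves, one possible direction is an autouse fixture that cancels any tasks a test leaves pending, so they are not destroyed mid-flight and do not spill warnings into later tests. A minimal sketch, assuming pytest-asyncio; this is illustrative, not necessarily the approach taken in this PR:

```python
import asyncio

import pytest_asyncio

@pytest_asyncio.fixture(autouse=True)
async def cancel_leftover_tasks():
    """Cancel tasks a test leaves pending so they cannot leak
    'Task was destroyed but it is pending!' warnings into other tests."""
    yield
    current = asyncio.current_task()
    leftovers = [t for t in asyncio.all_tasks() if t is not current]
    for task in leftovers:
        task.cancel()
    # return_exceptions=True swallows the CancelledError each
    # cancelled task raises while we wait for them to settle.
    await asyncio.gather(*leftovers, return_exceptions=True)
```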