Closed: Trinaa closed this 2 weeks ago
@Trinaa Thanks for exploring this!
I tried it out locally on my 10-core machine w/ various configurations; they all trailed the sequential mode by about 20–30%, w/ some additional warnings and even failures (results below). My understanding is that, given Merino's current unit-test workload, the overhead of running tests in parallel still outweighs the gain.
I'd recommend putting this on hold for now; we can revisit it once the unit test suite grows large enough for parallelism to pay off.
WDYT?
Sequential: ~10 secs w/o warnings
Parallel w/ 1 worker: ~15 secs w/ 2 warnings
Parallel w/ 2 workers: ~12 secs w/ 1 failure & 3 warnings
Parallel w/ 5 workers: ~12 secs w/ 6 warnings
Parallel w/ 10 workers: ~14 secs w/ 11 warnings
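(For reference, a rough sketch of how these configurations can be reproduced locally, assuming pytest-xdist is the plugin providing the -n worker option; the exact command behind the numbers above isn't shown, so treat this as an illustration rather than the actual benchmark script.)

import subprocess
import time

# Compare a sequential run against several pytest-xdist worker counts.
# "tests/unit" matches the path seen in the traceback further down.
for label, extra_args in [
    ("sequential", []),
    ("1 worker", ["-n", "1"]),
    ("2 workers", ["-n", "2"]),
    ("5 workers", ["-n", "5"]),
    ("10 workers", ["-n", "10"]),
]:
    start = time.perf_counter()
    subprocess.run(["pytest", "tests/unit", *extra_args], check=False)
    print(f"{label}: {time.perf_counter() - start:.1f}s")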
@ncloudioj That's interesting. The results differ greatly between machines; the auto worker setting doesn't seem to be super optimized.
On CI, the performance right now seems equivalent.
I have another concern, and another motivation for going parallel, at least on CI: I found isolation issues with the tests. Keeping parallel execution gives us feedback on this problem and keeps us from building up technical debt. I'm concerned that you saw failures; are there perhaps more isolation issues?
> I'm concerned that you saw failures; are there perhaps more isolation issues?
Yes, I believe it's an isolation issue w/ logging (see details below). I'm more interested in running the integration tests in parallel, if we can sort out the isolation issues, since they take more resources and run longer than the unit tests. Though, offhand, I'm not sure how big the isolation hurdle is for the integration tests. I'd like to pick your brain on that :)
@pytest.mark.asyncio
async def test_query_error(caplog: LogCaptureFixture, keywords: dict[SupportedAddon, set[str]]):
"""Test that provider can handle query error."""
provider = AddonsProvider(
backend=AmoErrorBackend(),
keywords=keywords,
name="addons",
score=0.3,
min_chars=4,
)
await provider.initialize()
req = SuggestionRequest(query="dictionary", geolocation=Location())
suggestions = await provider.query(req)
assert suggestions == []
> assert len(caplog.messages) == 1
E assert 4 == 1
E + where 4 = len(["Task was destroyed but it is pending!\ntask: <T....
tests/unit/providers/amo/test_provider.py:167: AssertionError
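For what it's worth, one common way to harden caplog-based assertions against unrelated records (such as asyncio's "Task was destroyed but it is pending!") is to filter by logger name rather than counting every captured message. A self-contained sketch of the technique follows; it is not the actual Merino fix, and the logger names are made up for illustration.

import logging


def test_counts_only_provider_logs(caplog):
    """Toy example: assert only on records from the logger under test,
    so unrelated records captured during the same test don't change the count."""
    provider_logger = logging.getLogger("merino.providers.amo")  # hypothetical name
    with caplog.at_level(logging.WARNING):
        provider_logger.warning("Addons provider query failed")
        # Simulate an unrelated record like the one in the failure above.
        logging.getLogger("asyncio").error("Task was destroyed but it is pending!")
    provider_records = [r for r in caplog.records if r.name.startswith("merino.providers")]
    assert len(provider_records) == 1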
Okay @ncloudioj, I've removed the parallel execution for now, but I'd still like to push through the unit test isolation fixes. Sound good?
References
JIRA: DISCO-3049
Description
- test_favicon_downloader_handles_exception: fixed by setting the logging level
- test_normalize_query: fixed by correcting the parametrize syntax
- session_id_context: now reset between tests (see the sketch after this list)
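As a rough illustration of resetting a context variable between tests (the real session_id_context lives in Merino's codebase; it is redefined locally here so the sketch stands on its own, and the fixture name is made up):

import contextvars

import pytest

# Stand-in for Merino's session_id_context, defined locally for illustration.
session_id_context: contextvars.ContextVar[str | None] = contextvars.ContextVar(
    "session_id_context", default=None
)


@pytest.fixture(autouse=True)
def reset_session_id_context():
    """Reset the context variable around each test so a value set by one
    test cannot leak into the next, regardless of execution order."""
    token = session_id_context.set(None)
    yield
    session_id_context.reset(token)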
PR Review Checklist
Put an x in the boxes that apply:
- [ ] The PR title includes the JIRA reference in the format [DISCO-####] and has the same title (if applicable)
- [ ] [load test: (abort|warn)] keywords are applied (if applicable)