Closed noklam closed 3 weeks ago
We discussed this privately and concluded that it is exactly the same issue users have reported before. https://github.com/kedro-org/kedro/pull/4093/ attempts to fix it, but it only fixes the path where the user starts a KedroSession, i.e. `kedro run`. This is why it still breaks for the benchmark tests and for the unit tests that users write themselves.
We agreed that the temporary fix should go into `ThreadRunner`. In the longer term, it may go into `catalog` instead. In parallel, there is a need to list catalog entries by pattern, so this is something we need to consider for the Catalog & Runner re-design:
Based on the solution proposed for lazy loading (https://github.com/kedro-org/kedro/issues/3935#issuecomment-2433144639), we suggest moving the warm-up to the `AbstractRunner`, before we call `_run()`, and making this logic common to all runners: https://github.com/kedro-org/kedro/blob/a5d9bb40380c598bf7d03cb16623026892844ed4/kedro/runner/runner.py#L115
Later, we will replace this logic with a lazy-loading warm-up, which will also be common to all runners.
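To make the warm-up idea concrete, here is a minimal sketch that uses only the standard library. It is not Kedro code: `ToyCatalog`, `warm_up` and `run_threaded` are hypothetical names, and the toy catalog only imitates a catalog with a single dataset factory pattern. The point is the ordering: every dataset name is resolved sequentially before any worker thread starts, so the threads never have to register a dataset themselves.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyCatalog:
    """Hypothetical stand-in for a catalog with one dataset factory pattern."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._datasets = {}
        self.registrations = []  # records every registration, for inspection

    def resolve(self, name):
        """Return the dataset, registering it from the pattern on first access.

        The check and the registration are separate steps, so this method is
        deliberately not thread-safe.
        """
        if name not in self._datasets:
            if not name.startswith(self._prefix):
                raise KeyError(name)
            self.registrations.append(name)
            self._datasets[name] = {"name": name}
        return self._datasets[name]


def warm_up(catalog, dataset_names):
    """Resolve every dataset once, in the calling thread."""
    for name in sorted(set(dataset_names)):
        catalog.resolve(name)


def run_threaded(catalog, dataset_names, max_workers=4):
    """Warm up first, then let worker threads read the already-registered datasets."""
    warm_up(catalog, dataset_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(catalog.resolve, dataset_names))


names = [f"dummy_{i}" for i in range(8)]
catalog = ToyCatalog(prefix="dummy_")
results = run_threaded(catalog, names * 3)
```

After the warm-up, the threaded phase only reads from the catalog, which is why the same step could sit in a shared base class rather than in each runner.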
Description
Originated from https://github.com/kedro-org/kedro/pull/4210
Context
Upon investigation, I found that this error seems to be related to the dataset factory pattern only.
The current conclusion is that this error was not introduced recently. There seems to have been a partial fix previously, but it doesn't work for my test case.
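As an illustration of why only pattern-resolved datasets would be affected, the toy model below shows a check-then-register race. It is an assumption about the mechanism, not Kedro's actual catalog code: `LazyCatalog`, its `checkpoint` hook and `access_from_two_threads` are all hypothetical. The barrier forces the interleaving that would otherwise happen only occasionally, so the outcome is reproducible.

```python
import threading


class LazyCatalog:
    """Toy catalog that registers unknown datasets on first access."""

    def __init__(self, explicit, checkpoint=None):
        # Explicitly declared datasets are registered up front.
        self._datasets = {name: object() for name in explicit}
        # Hypothetical hook, called between the check and the registration.
        self._checkpoint = checkpoint
        self.registrations = []

    def get(self, name):
        if name not in self._datasets:            # check
            if self._checkpoint is not None:
                self._checkpoint()                # both threads pause here
            self.registrations.append(name)       # act
            self._datasets[name] = object()
        return self._datasets[name]


def access_from_two_threads(catalog, name):
    """Access one dataset from two threads; return how often it was registered."""
    threads = [threading.Thread(target=catalog.get, args=(name,)) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return catalog.registrations.count(name)


barrier = threading.Barrier(2, timeout=5)

# Resolved lazily: both threads pass the check before either one registers.
lazy = LazyCatalog(explicit=[], checkpoint=barrier.wait)
lazy_count = access_from_two_threads(lazy, "dummy_x")

# Pre-registered: the check fails for both threads, so nothing is registered.
eager = LazyCatalog(explicit=["dummy_x"], checkpoint=barrier.wait)
eager_count = access_from_two_threads(eager, "dummy_x")
```

In this model the lazily resolved dataset is registered twice, while the pre-registered one is never registered during the threaded phase.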
Related:
4007
Steps to Reproduce
Using a test similar to the one written in https://github.com/kedro-org/kedro/pull/4210 from `benchmark_runner.py` for `ThreadRunner`. This is the snippet that I use:
Run this multiple times to confirm that it fails (the failure is non-deterministic because of a race condition). Then uncomment the `dummy_x` dataset to pre-register it; now it always passes.
Expected Result
Actual Result
Your Environment
Kedro version used (`pip show kedro` or `kedro -V`):
Python version used (`python -V`):