Closed noklam closed 1 week ago
The exists() method is not an abstract method and so doesn't have to be implemented. The original SharedMemoryDataset
was wrapped around MemoryDataset
directly so I guess that's why the warning wasn't triggered before https://github.com/kedro-org/kedro/pull/3332/files#diff-cc4e916905ba553d257700ff851973772c0d77018d7673a2c4b22f309d8c3e46. IMO we can just ignore this.
@merelcht Haven't look into this in details, would the alternative be changing that WARNING message to INFO
instead?
Reported again recently: https://github.com/kedro-org/kedro/issues/4060
Updated the scope to double check where exists
is called and if it makes sense to implement this method on SharedMemoryDataset
.
Also related: https://github.com/kedro-org/kedro/issues/4078
Hi, I investigated this for a bit and I believe the issue is that SharedMemoryDataset
delegates most methods to the underlying MemoryDataset
via __getattr__
.
https://github.com/kedro-org/kedro/blob/56e56a783e4b79e96d09c20918a23bb3e11976d1/kedro/io/shared_memory_dataset.py#L33-L37
However, __getattr__
is only called for attributes that do not exist on the object and since SharedMemoryDataset
inherits AbstractDataset._exists
__getattr__
is never called.
We could implement _exists
on SharedMemoryDataset
, we would have to handle the case where self.shared_memory_dataset
is None
(return False
in this case?):
class SharedMemoryDataset(AbstractDataset):
...
def _exists(self) -> bool:
if not self.shared_memory_dataset:
return False
return self.shared_memory_dataset.exists()
I'm happy to submit a PR implementing this:)
Description
Context
Unclear, the
SharedMemoryDataset
is implemented by the core team and generating warning when run withParallelRunner
, is this false alarm or something we need to address? Cc @merelchtSteps to Reproduce
I was testing something else and see the warning with this command:
pytest tests/framework/session/test_session_extension_hooks.py::TestNodeHooks::test_on_node_error_hook_parallel_runner
Possible Solution:
https://github.com/kedro-org/kedro/blob/0fc8089b637a0679f71e2bddc91f0676fc2914a2/kedro/io/memory_dataset.py#L74C1-L75C40 We have a simple check in
MemoryDataset
, we could implement simliar logic forShareMemoryDataset
.Check whether it makes sense to have an
_exists()
method forSharedMemoryDataset
. If not: update the logging level to be ofDEBUG
.