Closed akruszewski closed 4 years ago
@akruszewski Actually, I think it is possible. Do you have an example pipeline/validation that you're using?
@tamsanh Unfortunately I can't share it, but the scenario for most nodes is:
Let me know if you need more info.
@akruszewski I just pushed a new version of the repo. Try doing a pip install -U kedro-great. You should get 0.2.2, which will support datasets that do not have a _filepath attribute.
@tamsanh With your change I'm still getting an error. This time:
Traceback (most recent call last):
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/bin/kedro", line 8, in <module>
sys.exit(main())
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/framework/cli/cli.py", line 633, in main
cli_collection()
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/mnt/c/dev/kedro-and-kubeflow/kedro_cli.py", line 230, in run
pipeline_name=pipeline,
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/framework/context/context.py", line 699, in run
raise error
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/framework/context/context.py", line 691, in run
run_result = runner.run(filtered_pipeline, catalog, run_id)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/runner/runner.py", line 101, in run
self._run(pipeline, catalog, run_id)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/runner/sequential_runner.py", line 90, in _run
run_node(node, catalog, self._is_async, run_id)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/runner/runner.py", line 213, in run_node
node = _run_node_sequential(node, catalog, run_id)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/runner/runner.py", line 245, in _run_node_sequential
run_id=run_id,
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/pluggy/hooks.py", line 286, in __call__
return self._hookexec(self, self.get_hookimpls(), kwargs)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/pluggy/manager.py", line 93, in _hookexec
return self._inner_hookexec(hook, methods, kwargs)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/pluggy/manager.py", line 87, in <lambda>
firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/pluggy/callers.py", line 208, in _multicall
return outcome.get_result()
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/pluggy/callers.py", line 80, in get_result
raise ex[1].with_traceback(ex[2])
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
res = hook_impl.function(*args)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro_great/kedro_great.py", line 88, in after_node_run
self._run_validation(catalog, outputs, run_id)
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro_great/kedro_great.py", line 103, in _run_validation
df = dataset.load()
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/io/core.py", line 213, in load
return self._load()
File "/home/kruszewa/miniconda3/envs/kedro-and-kubeflow-env/lib/python3.7/site-packages/kedro/io/memory_data_set.py", line 81, in _load
raise DataSetError("Data for MemoryDataSet has not been saved yet.")
kedro.io.core.DataSetError: Data for MemoryDataSet has not been saved yet.
My solution for that is to replace dataset.load() with dataset_value when the _filepath attribute is None. I'm still not very familiar with your plugin, so I'm not sure whether this would break anything. Anyway, here is the PR: https://github.com/tamsanh/kedro-great/pull/2
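To make the proposed fallback concrete, here is a minimal, self-contained sketch of the idea. It is not the plugin's actual code and not necessarily what the PR contains: the helper name resolve_validation_data and both stand-in dataset classes are hypothetical, written only to illustrate "use the value the node just produced when the dataset has no _filepath".

```python
from typing import Any


class FileBackedDataSet:
    """Hypothetical stand-in for a dataset persisted at a file path."""

    def __init__(self, filepath: str, stored: Any) -> None:
        self._filepath = filepath
        self._stored = stored

    def load(self) -> Any:
        # A real dataset would read from self._filepath here.
        return self._stored


class InMemoryDataSet:
    """Hypothetical stand-in for an in-memory dataset with nothing saved yet."""

    _filepath = None

    def load(self) -> Any:
        # Mirrors the failure in the traceback: loading before any save.
        raise RuntimeError("Data for MemoryDataSet has not been saved yet.")


def resolve_validation_data(dataset: Any, dataset_value: Any) -> Any:
    """Pick the data to validate (assumed helper, not part of kedro-great).

    If the dataset has no usable _filepath, return the value the node just
    produced instead of calling load(); otherwise load from the dataset.
    """
    if getattr(dataset, "_filepath", None) is None:
        return dataset_value
    return dataset.load()
```

With this shape, an in-memory output is validated from the node's return value and load() is never called on it, while file-backed datasets keep their existing behaviour.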
@tamsanh Do you have plans to support in-memory datasets?
Context
After setting up kedro-great for kedro (kedro great init) and running the pipeline (kedro run), I'm getting an error:
After a quick skim through the repo I figured out (correct me if I'm wrong) that only datasets which have a _filepath attribute are supported.