kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

`_cli_hook_manager.enable_tracing()` causing error on some runs #2630

Open melvinkokxw opened 1 year ago

melvinkokxw commented 1 year ago

Description

`kedro run` fails if an object used in the run doesn't have `__repr__` implemented properly. This seems to be caused by the addition of `_cli_hook_manager.enable_tracing()` in kedro==0.18.9: https://github.com/kedro-org/kedro/compare/0.18.8...0.18.9#diff-2547f8676f08bd22f65160f0f04f27ca730e0e3f49b8d9f44ec3c7edf387fc6eR23

The same pipeline works on 0.18.8, but breaks on 0.18.9.

Our hypothesis is that, when node-related hooks are used, `_cli_hook_manager.enable_tracing()` causes a string representation of the node inputs/outputs to be generated, and that this is what throws the error.

Context

We were running a pipeline with a custom scikit-learn transformer. The scikit-learn version is very old (0.24.2), which is likely the reason why `__repr__` wasn't working correctly.

Steps to Reproduce

  1. Create object that does not have __repr__ properly implemented
  2. Use object in node input/output
  3. Use hook that involves node (e.g. after_node_run)
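The steps above can be sketched without Kedro or scikit-learn. This is a minimal, standard-library-only illustration, not the actual pluggy code: `BrokenRepr`, `format_hook_kwargs` and `try_trace` are hypothetical names, and the formatting function only imitates the `%s` interpolation visible in the `_format_message` frame of the traceback below.

```python
class BrokenRepr:
    """Hypothetical object whose __repr__ reads an attribute that was never set."""

    def __repr__(self):
        # self.transformer does not exist, so this raises AttributeError
        return f"BrokenRepr(transformer={self.transformer!r})"


def format_hook_kwargs(hook_name, kwargs):
    """Simplified stand-in for a tracer that interpolates every hook argument with %s."""
    lines = ["%s\n" % hook_name]
    for name, value in kwargs.items():
        # %s calls str(value), which falls back to __repr__ for plain objects
        # and for the contents of containers such as dicts
        lines.append("    %s: %s\n" % (name, value))
    return "".join(lines)


def try_trace(hook_name, kwargs):
    """Format the hook call and report a failure instead of crashing the demo."""
    try:
        return format_hook_kwargs(hook_name, kwargs)
    except AttributeError as exc:
        return f"tracing failed: {exc}"


# A node input with a broken __repr__ is enough to make the formatting step fail.
print(try_trace("after_node_run", {"inputs": {"model": BrokenRepr()}}))
```

The hook implementation itself never runs in this sketch; the failure comes purely from turning the hook arguments into text.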

Expected Result

Pipeline should run successfully

Actual Result

An error is thrown

Error message:

```
Traceback (most recent call last)

/home/vsts/work/1/s/venv/bin/kedro:8 in

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/framework/cli/cli.py:211 in main
  ❱ 211   cli_collection()

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/click/core.py:1130 in __call__

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/framework/cli/cli.py:139 in main
  ❱ 139   super().main(

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/click/core.py:1055 in main

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/click/core.py:1657 in invoke

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/click/core.py:1404 in invoke

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/click/core.py:760 in invoke

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/framework/cli/project.py:459 in run
  ❱ 459   session.run(

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/framework/session/session.py:425 in run
  ❱ 425   run_result = runner.run(

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/runner/runner.py:92 in run
  ❱ 92    self._run(pipeline, catalog, hook_manager, session_id)

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/runner/sequential_runner.py:70 in _run
  ❱ 70    run_node(node, catalog, hook_manager, self._is_async, s

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/runner/runner.py:320 in run_node
  ❱ 320   node = _run_node_sequential(node, catalog, hook_manager, sessi

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/runner/runner.py:416 in _run_node_sequential
  ❱ 416   outputs = _call_node_run(

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/kedro/runner/runner.py:383 in _call_node_run
  ❱ 383   hook_manager.hook.after_node_run(

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/pluggy/hooks.py:286 in __call__
  ❱ 286   return self._hookexec(self, self.get_hookimpls(), kwargs)

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/pluggy/manager.py:93 in _hookexec
  ❱ 93    return self._inner_hookexec(hook, methods, kwargs)

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/pluggy/manager.py:334 in traced_hookexec
  ❱ 334   before(hook.name, hook_impls, kwargs)

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/pluggy/manager.py:352 in before
  ❱ 352   hooktrace(hook_name, kwargs)

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/pluggy/_tracing.py:59 in __call__
  ❱ 59    self.root._processmessage(self.tags, args)

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/pluggy/_tracing.py:34 in _processmessage
  ❱ 34    self._writer(self._format_message(tags, args))

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/pluggy/_tracing.py:28 in _format_message
  ❱ 28    lines.append("%s %s: %s\n" % (indent, name, value))

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/sklearn/base.py:260 in __repr__
  ❱ 260   repr_ = pp.pformat(self)

/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/pprint.py:153 in pformat
  ❱ 153   self._format(object, sio, 0, 0, {}, 0)

/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/pprint.py:170 in _format
  ❱ 170   rep = self._repr(object, context, level)

/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/pprint.py:404 in _repr
  ❱ 404   repr, readable, recursive = self.format(object, context.copy()

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/sklearn/utils/_pprint.py:180 in format
  ❱ 180   return _safe_repr(object, context, maxlevels, level,

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/sklearn/utils/_pprint.py:425 in _safe_repr
  ❱ 425   params = _changed_params(object)

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/sklearn/utils/_pprint.py:91 in _changed_params
  ❱ 91    params = estimator.get_params(deep=False)

/home/vsts/work/1/s/venv/lib/python3.8/site-packages/sklearn/base.py:195 in get_params
  ❱ 195   value = getattr(self, key)

AttributeError: 'SklearnTransform' object has no attribute 'transformer'
```

Your Environment

astrojuanlu commented 1 year ago

Hello @melvinkokxw, thanks for reporting the issue and sorry you had a bumpy upgrade.

As a workaround, are you able to add a .transformer attribute or otherwise fix your SklearnTransform.__str__ representation?
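A rough sketch of what such a workaround could look like, assuming the class can be edited. The class body here is hypothetical (the real `SklearnTransform` is not shown in this issue, and this stand-in does not inherit from scikit-learn's base classes); the point is only that the representation no longer depends on an attribute that may be missing.

```python
class SklearnTransform:
    """Hypothetical stand-in for the custom transformer named in the traceback."""

    def __init__(self, transformer=None):
        # Mimic the failure mode: the attribute only exists if a value was given.
        if transformer is not None:
            self.transformer = transformer

    def __repr__(self):
        # Never touch self.transformer without checking that it exists first.
        if hasattr(self, "transformer"):
            return f"{type(self).__name__}(transformer={self.transformer!r})"
        return f"{type(self).__name__}(<transformer not set>)"


# Both str() and %s formatting fall back to __repr__, so neither raises now.
print("%s" % SklearnTransform())
print(repr(SklearnTransform("scaler")))
```

Because `%s` formatting falls back to `__repr__` when `__str__` is not overridden, defining either method on the class is enough to keep the formatting step from reaching the failing code path.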

melvinkokxw commented 1 year ago

Hey Juan! No worries, we understand that such problems are bound to occur when trying to integrate with old libraries 🥲

That would be a viable workaround for us, thank you for the suggestion!

noklam commented 1 year ago

@astrojuanlu I didn't expect it to trace anything more than the hook call itself; it may be worth having another look 👀 Throwing an error in user code for a debug-level log seems like overkill.

astrojuanlu commented 1 year ago

Yeah I agree, it might trigger undesirable side effects. Maybe a custom subclass of TagTracer could do the trick:

https://github.com/pytest-dev/pluggy/blob/ace2dbb2ce28c106ea04a1fd0de401b79724c5e3/src/pluggy/_manager.py#L113

The error occurs before calling the tracing function (logger.debug in our case).
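To illustrate the ordering, here is a small standard-library-only sketch. It is not pluggy's `TagTracer` and not a proposed patch: `trace`, `format_message`, `safe_str` and `Unprintable` are hypothetical names. It assumes, as the `_processmessage` frame in the traceback suggests, that the message is built first and only then handed to the writer, and it shows one defensive way a custom tracer might format values.

```python
def safe_str(value):
    """Return str(value), or a placeholder if the object's __str__/__repr__ raises."""
    try:
        return str(value)
    except Exception as exc:
        return f"<unprintable {type(value).__name__}: {type(exc).__name__}>"


def format_message(hook_name, kwargs, to_text=str):
    """Build the trace message; to_text decides how each argument is rendered."""
    lines = [f"{hook_name}\n"]
    for name, value in kwargs.items():
        lines.append(f"    {name}: {to_text(value)}\n")
    return "".join(lines)


def trace(writer, hook_name, kwargs, to_text=str):
    """The message is formatted eagerly, before the writer is ever called."""
    writer(format_message(hook_name, kwargs, to_text))


class Unprintable:
    """Hypothetical object whose __repr__ always fails."""

    def __repr__(self):
        raise AttributeError("no attribute 'transformer'")


received = []
kwargs = {"node": "train", "inputs": Unprintable()}

# Plain str(): formatting raises, so the writer (received.append) is never reached.
try:
    trace(received.append, "after_node_run", kwargs)
except AttributeError:
    pass
after_unsafe = list(received)

# Defensive formatting: the writer receives a message with a placeholder instead.
trace(received.append, "after_node_run", kwargs, to_text=safe_str)
print(received[0])
```

In this sketch the writer's own behaviour is irrelevant to the failure: even a writer that discards everything would never be reached, because the exception is raised while the message is being built.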

noklam commented 1 year ago

@astrojuanlu Should we put this into backlog grooming? I think this tracing is too chatty, and it takes up more than 50% of my kedro run log.

I don't really want it showing the hooks DEBUG message when it's just telemetry.

astrojuanlu commented 1 year ago

Yes, let's prioritize this issue. I think we have to strike a balance and find a way to keep the ability to debug hook execution.

astrojuanlu commented 1 year ago

Reported this upstream by the way https://github.com/pytest-dev/pluggy/issues/424

astrojuanlu commented 11 months ago

The pluggy maintainers are open to fixing this upstream 🎉 in terms of estimation though, this is more a 3 than a 1