zenml-io / zenml

ZenML 🙏: The bridge between ML and Ops. https://zenml.io.
Apache License 2.0

[BUG]: Importing `annotations` from `__future__` breaks pipeline compilation #2477

Open jlopezpena opened 8 months ago

jlopezpena commented 8 months ago

System Information

ZENML_LOCAL_VERSION: 0.55.3
ZENML_SERVER_VERSION: 0.55.3
ZENML_SERVER_DATABASE: mysql
ZENML_SERVER_DEPLOYMENT_TYPE: other
ZENML_CONFIG_DIR: /root/.config/zenml
ZENML_LOCAL_STORE_DIR: /root/.config/zenml/local_stores
ZENML_SERVER_URL: https://zenml.wayflyer.team
ZENML_ACTIVE_REPOSITORY_ROOT: None
PYTHON_VERSION: 3.11.6
ENVIRONMENT: native
SYSTEM_INFO: {'os': 'linux', 'linux_distro': 'ubuntu', 'linux_distro_like': 'debian', 'linux_distro_version': '23.10'}
ACTIVE_WORKSPACE: default
ACTIVE_STACK: Wayflyer DS
ACTIVE_USER: javier.lopezpena
TELEMETRY_STATUS: disabled
ANALYTICS_CLIENT_ID: d7081d0d-4f0b-4066-a1fa-7215559f9954
ANALYTICS_USER_ID: fda958f2-f055-47c7-942c-9ff3977f610e
ANALYTICS_SERVER_ID: 15db87e7-3688-4478-8c1f-f7d634988a14
INTEGRATIONS: ['evidently', 'kaniko', 'pillow', 's3', 'scipy', 'sklearn', 'slack', 'xgboost']
PACKAGES: {'babel': '2.14.0', 'deprecated': '1.2.14', 'faker': '23.2.1', 'gitpython': '3.1.42', 'jinja2': '3.1.3', 'mako': '1.3.2', 'markupsafe': '2.1.5', 'pyjwt': '2.8.0', 
'pymysql': '1.0.3', 'pyyaml': '6.0.1', 'sqlalchemy': '1.4.41', 'sqlalchemy-utils': '0.38.3', 'absl-py': '2.1.0', 'accessible-pygments': '0.0.4', 'aiobotocore': '2.5.2', 
'aiohttp': '3.9.3', 'aioitertools': '0.11.0', 'aiosignal': '1.3.1', 'alabaster': '0.7.16', 'alembic': '1.8.1', 'anyio': '4.3.0', 'appdirs': '1.4.4', 'argparse': '1.4.0', 
'asn1crypto': '1.5.1', 'asttokens': '2.4.1', 'attrs': '23.2.0', 'aws-profile-manager': '0.7.3', 'azure-common': '1.1.28', 'azure-core': '1.30.0', 'azure-mgmt-core': '1.4.0', 
'azure-mgmt-resource': '23.0.1', 'bcrypt': '4.0.1', 'beautifulsoup4': '4.12.3', 'boto3': '1.26.76', 'botocore': '1.29.161', 'certifi': '2024.2.2', 'cffi': '1.16.0', 'chardet': 
'5.2.0', 'charset-normalizer': '3.3.2', 'clarabel': '0.6.0', 'click': '8.1.3', 'click-params': '0.3.0', 'cloudpickle': '2.2.1', 'colorama': '0.4.6', 'colorcet': '3.0.1', 'comm': 
'0.2.1', 'configparser': '6.0.1', 'contourpy': '1.2.0', 'coverage': '7.4.1', 'cryptography': '41.0.7', 'cvxpy': '1.4.2', 'cycler': '0.12.1', 'debugpy': '1.8.1', 'decorator': 
'5.1.1', 'deptry': '0.12.0', 'diff-cover': '8.0.3', 'distro': '1.9.0', 'docker': '6.1.3', 'docutils': '0.20.1', 'dynaconf': '3.2.4', 'ecos': '2.0.13', 'evidently': '0.4.16', 
'executing': '2.0.1', 'fastjsonschema': '2.19.1', 'feature-engine': '1.6.2', 'filelock': '3.13.1', 'fonttools': '4.49.0', 'frozenlist': '1.4.1', 'fsspec': '2023.4.0', 'furo': 
'2024.1.29', 'gitdb': '4.0.11', 'greenlet': '3.0.3', 'h11': '0.14.0', 'httpcore': '1.0.4', 'httplib2': '0.19.1', 'httpx': '0.27.0', 'idna': '3.6', 'imagesize': '1.4.1', 
'importlib-metadata': '6.11.0', 'iniconfig': '2.0.0', 'ipykernel': '6.29.2', 'ipython': '8.21.0', 'ipywidgets': '8.1.2', 'isodate': '0.6.1', 'iterative-telemetry': '0.0.8', 
'jedi': '0.19.1', 'jmespath': '1.0.1', 'joblib': '1.3.2', 'jsonschema': '4.21.1', 'jsonschema-specifications': '2023.12.1', 'jupyter-book': '1.0.0', 'jupyter-cache': '1.0.0', 
'jupyter-client': '8.6.0', 'jupyter-core': '5.7.1', 'jupyterlab-widgets': '3.0.10', 'jupytext': '1.16.1', 'kiwisolver': '1.4.5', 'latexcodec': '2.0.1', 'linkify-it-py': '2.0.3', 
'litestar': '2.6.1', 'livereload': '2.6.3', 'markdown-it-py': '3.0.0', 'matplotlib': '3.8.3', 'matplotlib-inline': '0.1.6', 'mdit-py-plugins': '0.4.0', 'mdurl': '0.1.2', 
'msgspec': '0.18.6', 'multidict': '6.0.5', 'mypy': '1.8.0', 'mypy-extensions': '1.0.0', 'myst-nb': '1.0.0', 'myst-parser': '2.0.0', 'nbclient': '0.9.0', 'nbformat': '5.9.2', 
'nest-asyncio': '1.6.0', 'nltk': '3.8.1', 'numpy': '1.26.4', 'opentelemetry-api': '1.22.0', 'optbinning': '0.19.0', 'orjson': '3.9.14', 'ortools': '9.8.3296', 'osqp': '0.6.4', 
'packaging': '23.2', 'pandas': '2.2.0', 'pandas-stubs': '2.2.0.240218', 'param': '2.0.2', 'parso': '0.8.3', 'passlib': '1.7.4', 'pathspec': '0.12.1', 'patsy': '0.5.6', 'pexpect':
'4.9.0', 'pillow': '10.2.0', 'platformdirs': '3.11.0', 'plotly': '5.19.0', 'pluggy': '1.4.0', 'polyfactory': '2.14.1', 'prompt-toolkit': '3.0.43', 'protobuf': '4.25.3', 'psutil':
'5.9.8', 'ptyprocess': '0.7.0', 'pure-eval': '0.2.2', 'pyopenssl': '23.3.0', 'pyarrow': '15.0.0', 'pybind11': '2.11.1', 'pybtex': '0.24.0', 'pybtex-docutils': '1.0.3', 
'pycparser': '2.21', 'pyct': '0.5.0', 'pydantic': '1.10.14', 'pydata-sphinx-theme': '0.15.2', 'pygments': '2.17.2', 'pyparsing': '2.4.7', 'pytest': '8.0.0', 'pytest-cov': 
'4.1.0', 'python-dateutil': '2.8.2', 'pytz': '2024.1', 'pyzmq': '25.1.2', 'qdldl': '0.1.7.post0', 'referencing': '0.33.0', 'regex': '2023.12.25', 'requests': '2.31.0', 'rich': 
'13.7.0', 'rich-click': '1.7.3', 'ropwr': '1.0.0', 'rpds-py': '0.18.0', 'ruff': '0.2.1', 's3fs': '2023.4.0', 's3transfer': '0.6.2', 'scikit-learn': '1.4.1.post1', 'scipy': 
'1.12.0', 'scs': '3.2.4.post1', 'setuptools': '69.1.0', 'six': '1.16.0', 'slack-sdk': '3.27.0', 'smmap': '5.0.1', 'sniffio': '1.3.1', 'snowballstemmer': '2.2.0', 
'snowflake-connector-python': '3.7.0', 'sortedcontainers': '2.4.0', 'soupsieve': '2.5', 'sphinx': '7.2.6', 'sphinx-autobuild': '2024.2.4', 'sphinx-basic-ng': '1.0.0b2', 
'sphinx-book-theme': '1.1.2', 'sphinx-comments': '0.0.3', 'sphinx-copybutton': '0.5.2', 'sphinx-design': '0.5.0', 'sphinx-external-toc': '1.0.1', 'sphinx-jupyterbook-latex': 
'1.0.0', 'sphinx-multitoc-numbering': '0.1.3', 'sphinx-thebe': '0.3.1', 'sphinx-togglebutton': '0.3.2', 'sphinxcontrib-applehelp': '1.0.8', 'sphinxcontrib-bibtex': '2.6.2', 
'sphinxcontrib-devhelp': '1.0.6', 'sphinxcontrib-htmlhelp': '2.0.5', 'sphinxcontrib-jsmath': '1.0.1', 'sphinxcontrib-qthelp': '1.0.7', 'sphinxcontrib-serializinghtml': '1.1.10', 
'sqlalchemy2-stubs': '0.0.2a38', 'sqlfluff': '2.3.5', 'sqlmodel': '0.0.8', 'stack-data': '0.6.3', 'statsmodels': '0.14.1', 'tabulate': '0.9.0', 'tblib': '3.0.0', 'tenacity': 
'8.2.3', 'threadpoolctl': '3.3.0', 'toml': '0.10.2', 'tomlkit': '0.12.3', 'tornado': '6.4', 'tqdm': '4.66.2', 'traitlets': '5.14.1', 'typer': '0.9.0', 'types-pytz': 
'2024.1.0.20240203', 'typing-extensions': '4.9.0', 'typing-inspect': '0.9.0', 'tzdata': '2024.1', 'uc-micro-py': '1.0.3', 'urllib3': '1.26.18', 'uvicorn': '0.27.1', 'validators':
'0.18.2', 'watchdog': '4.0.0', 'wcwidth': '0.2.13', 'websocket-client': '1.7.0', 'wf-aws': '1.13.4', 'wf-blackbox': '2.0.10', 'wf-cli': '1.27.6', 'wf-cli-plugin-aws': '1.5.7', 
'wf-cli-plugin-docs': '1.2.7', 'wf-cli-plugin-login': '1.0.5', 'wf-cli-plugin-py': '2.8.4', 'wf-codec': '1.27.1', 'wf-ds-tools': '0.9.3', 'wf-env': '1.1.0', 'wf-memo': '1.1.1', 
'wf-mlmodels-delinquency': '0.0.0', 'wheel': '0.42.0', 'widgetsnbextension': '4.0.10', 'wrapt': '1.16.0', 'xgboost': '2.0.3', 'yarl': '1.9.4', 'zenml': '0.55.3', 'zipp': 
'3.17.0'}

CURRENT STACK

Name: Wayflyer DS
ID: 3876302e-0de4-439d-81b2-de954dc56eed
User: javier.lopezpena / fda958f2-f055-47c7-942c-9ff3977f610e
Workspace: default / ad453c60-e306-48a2-bb48-102098f0a3a9

ORCHESTRATOR: default

Name: default
ID: c79ec6fb-89f7-4d2b-b642-19f4d070820d
Type: orchestrator
Flavor: local
Configuration: {}
Workspace: default / ad453c60-e306-48a2-bb48-102098f0a3a9

ARTIFACT_STORE: s3

Name: s3
ID: e372107f-0349-4b93-9b47-5645339bcfb8
Type: artifact_store
Flavor: s3
Configuration: {'authentication_secret': None, 'path': 's3://wf-zenml/artifacts', 'key': '********', 'secret': '********', 'token': '********', 'client_kwargs': None, 
'config_kwargs': None, 's3_additional_kwargs': None}
User: dave.hall / da3bf4ff-c7e1-4f85-9bd5-b3f779e03954
Workspace: default / ad453c60-e306-48a2-bb48-102098f0a3a9

ALERTER: DS Alerter

Name: DS Alerter
ID: e7c058d4-96c5-4caa-8115-a7a998ec86a5
Type: alerter
Flavor: slack
Configuration: {'slack_token': '********', 'default_slack_channel_id': 'C064J8NMV8V'}
User: javier.lopezpena / fda958f2-f055-47c7-942c-9ff3977f610e
Workspace: default / ad453c60-e306-48a2-bb48-102098f0a3a9

DATA_VALIDATOR: EvidentlyAI Data Validator

Name: EvidentlyAI Data Validator
ID: b0212d0f-b3f4-487a-8cb8-7e075f3b8138
Type: data_validator
Flavor: evidently
Configuration: {}
User: javier.lopezpena / fda958f2-f055-47c7-942c-9ff3977f610e
Workspace: default / ad453c60-e306-48a2-bb48-102098f0a3a9

What happened?

While developing a new pipeline, I added a helper function to refactor one of the steps. To type-hint that helper (it returns a pandas Series, so the type hint needed to be pd.Series[float]), I added the following line at the top of the file where the step is defined:

from __future__ import annotations

This caused my pipeline to stop working, failing with this very confusing error:

AttributeError: 'str' object has no attribute '__mro__'

Reproduction steps

  1. Add from __future__ import annotations to a file that defines pipeline steps. Running the pipeline then fails at pipeline compilation time.

(I was using Python 3.11.6.)
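The effect of the future import can be seen without ZenML. The sketch below is only an illustration: `make_value` is a hypothetical stand-in for a step function, and the source is run through `exec` so that the future import sits at the top of its own small module. It shows that the raw return annotation becomes the string "int", and that `typing.get_type_hints` can turn it back into the real class.

```python
import inspect
import typing

# Hypothetical module source; the function name is made up for illustration.
SOURCE = '''
from __future__ import annotations

def make_value() -> int:
    return 1
'''

namespace = {}
exec(SOURCE, namespace)
make_value = namespace["make_value"]

# With the future import, the annotation is stored as a string, not a class.
raw = inspect.signature(make_value).return_annotation
print(type(raw).__name__, repr(raw))

# get_type_hints evaluates the string in the function's globals.
resolved = typing.get_type_hints(make_value)["return"]
print(resolved)
```

Any code that reads annotations directly and expects classes receives strings instead, which would be consistent with the error reported here.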

Relevant log output

Initiating a new run for the pipeline: data_preparation.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /workspaces/mlmodels-delinquency/wf/mlmodels_delinquency/pipelines/data_preparation.py:24 in     │
│ <module>                                                                                         │
│                                                                                                  │
│   21                                                                                             │
│   22 if __name__ == "__main__":                                                                  │
│   23 │   sql_path = cast(Path, importlib.resources.files("wf.mlmodels_delinquency") / "sql")     │
│ ❱ 24 │   data_preparation(frequency=365, cutoff_date="2024-01-01", sql_path=sql_path.resolve(    │
│   25                                                                                             │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/new/pipelines/pipeline │
│ .py:1526 in __call__                                                                             │
│                                                                                                  │
│   1523 │   │   │   return self.entrypoint(*args, **kwargs)                                       │
│   1524 │   │                                                                                     │
│   1525 │   │   self.prepare(*args, **kwargs)                                                     │
│ ❱ 1526 │   │   return self._run(**self._run_args)                                                │
│   1527 │                                                                                         │
│   1528 │   def _call_entrypoint(self, *args: Any, **kwargs: Any) -> None:                        │
│   1529 │   │   """Calls the pipeline entrypoint function with the given arguments.               │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/new/pipelines/pipeline │
│ .py:625 in _run                                                                                  │
│                                                                                                  │
│    622 │   │   logger.info(f"Initiating a new run for the pipeline: `{self.name}`.")             │
│    623 │   │                                                                                     │
│    624 │   │   with track_handler(AnalyticsEvent.RUN_PIPELINE) as analytics_handler:             │
│ ❱  625 │   │   │   deployment, pipeline_spec, schedule, build = self._compile(                   │
│    626 │   │   │   │   config_path=config_path,                                                  │
│    627 │   │   │   │   run_name=run_name,                                                        │
│    628 │   │   │   │   enable_cache=enable_cache,                                                │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/new/pipelines/pipeline │
│ .py:1160 in _compile                                                                             │
│                                                                                                  │
│   1157 │   │   # Update with the values in code so they take precedence                          │
│   1158 │   │   run_config = pydantic_utils.update_model(run_config, update=update)               │
│   1159 │   │                                                                                     │
│ ❱ 1160 │   │   deployment, pipeline_spec = Compiler().compile(                                   │
│   1161 │   │   │   pipeline=self,                                                                │
│   1162 │   │   │   stack=Client().active_stack,                                                  │
│   1163 │   │   │   run_configuration=run_config,                                                 │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/config/compiler.py:112 │
│ in compile                                                                                       │
│                                                                                                  │
│   109 │   │   │   if ConfigurationLevel.STEP in settings.LEVEL                                   │
│   110 │   │   }                                                                                  │
│   111 │   │                                                                                      │
│ ❱ 112 │   │   steps = {                                                                          │
│   113 │   │   │   invocation_id: self._compile_step_invocation(                                  │
│   114 │   │   │   │   invocation=invocation,                                                     │
│   115 │   │   │   │   pipeline_settings=settings_to_passdown,                                    │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/config/compiler.py:113 │
│ in <dictcomp>                                                                                    │
│                                                                                                  │
│   110 │   │   }                                                                                  │
│   111 │   │                                                                                      │
│   112 │   │   steps = {                                                                          │
│ ❱ 113 │   │   │   invocation_id: self._compile_step_invocation(                                  │
│   114 │   │   │   │   invocation=invocation,                                                     │
│   115 │   │   │   │   pipeline_settings=settings_to_passdown,                                    │
│   116 │   │   │   │   pipeline_extra=pipeline.configuration.extra,                               │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/config/compiler.py:470 │
│ in _compile_step_invocation                                                                      │
│                                                                                                  │
│   467 │   │   parameters_to_ignore = (                                                           │
│   468 │   │   │   set(step_config.parameters) if step_config else set()                          │
│   469 │   │   )                                                                                  │
│ ❱ 470 │   │   complete_step_configuration = invocation.finalize(                                 │
│   471 │   │   │   parameters_to_ignore=parameters_to_ignore                                      │
│   472 │   │   )                                                                                  │
│   473 │   │   return Step(spec=step_spec, config=complete_step_configuration)                    │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/steps/step_invocation. │
│ py:160 in finalize                                                                               │
│                                                                                                  │
│   157 │   │   │   │   artifact.upload_by_value()                                                 │
│   158 │   │   │   external_artifacts[key] = artifact.config                                      │
│   159 │   │                                                                                      │
│ ❱ 160 │   │   return self.step._finalize_configuration(                                          │
│   161 │   │   │   input_artifacts=self.input_artifacts,                                          │
│   162 │   │   │   external_artifacts=external_artifacts,                                         │
│   163 │   │   │   model_artifacts_or_metadata=self.model_artifacts_or_metadata,                  │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/steps/base_step.py:114 │
│ 2 in _finalize_configuration                                                                     │
│                                                                                                  │
│   1139 │   │   │   │   materializer_sources = []                                                 │
│   1140 │   │   │   │                                                                             │
│   1141 │   │   │   │   for output_type in output_types:                                          │
│ ❱ 1142 │   │   │   │   │   materializer_class = materializer_registry[output_type]               │
│   1143 │   │   │   │   │   materializer_sources.append(                                          │
│   1144 │   │   │   │   │   │   source_utils.resolve(materializer_class)                          │
│   1145 │   │   │   │   │   )                                                                     │
│                                                                                                  │
│ /workspaces/mlmodels-delinquency/.venv/lib/python3.11/site-packages/zenml/materializers/material │
│ izer_registry.py:74 in __getitem__                                                               │
│                                                                                                  │
│    71 │   │   Returns:                                                                           │
│    72 │   │   │   `BaseMaterializer` subclass that was registered for this key.                  │
│    73 │   │   """                                                                                │
│ ❱  74 │   │   for class_ in key.__mro__:                                                         │
│    75 │   │   │   materializer = self.materializer_types.get(class_, None)                       │
│    76 │   │   │   if materializer:                                                               │
│    77 │   │   │   │   return materializer                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'str' object has no attribute '__mro__'
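The last frame of the traceback iterates over `key.__mro__`, which only exists on classes. The toy registry below is an assumption-laden sketch that copies just the loop visible in that frame. `ToyRegistry` and its contents are hypothetical, not ZenML's actual implementation. It reproduces the same kind of failure when the key is a string annotation rather than a class.

```python
class ToyRegistry:
    """Simplified stand-in for the lookup in the traceback (hypothetical)."""

    def __init__(self):
        # Map classes to a placeholder "materializer" name.
        self.materializer_types = {object: "default-materializer"}

    def __getitem__(self, key):
        # Walk the class hierarchy; this assumes key is a class.
        for class_ in key.__mro__:
            materializer = self.materializer_types.get(class_, None)
            if materializer:
                return materializer
        raise KeyError(key)


registry = ToyRegistry()

# A real class works: int's MRO ends in object, which is registered.
found = registry[int]
print(found)

# A stringified annotation has no __mro__, so the lookup raises.
message = ""
try:
    registry["int"]
except AttributeError as err:
    message = str(err)
print(message)
```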

jlopezpena commented 1 day ago

Sorry for necroing this, but Python 3.14 alpha 1 has been released, and one of its major new features is PEP 649 (deferred evaluation of annotations). This means the behaviour of from __future__ import annotations (stringified annotations) will become the default in Python 3.14. That will break ZenML's use of annotations in steps and pipelines, so this issue will be a blocker for Python 3.14 compatibility.

schustmi commented 1 day ago

@jlopezpena Thanks for the information, we'll definitely have to rework this once 3.14 is released. I don't think this will be a quick/easy fix though, so we probably won't get to it before that unfortunately.

jlopezpena commented 1 day ago

@schustmi Of course, no rush from my side (I am aware of the issue and know how to sidestep it). I just wanted to point out that it will become a blocker so that you can plan accordingly, since, as you said, this is likely to require some major rework of the pipeline building code.