Jeferson100 opened this issue 2 days ago
The following error occurred when running the command `poetry poe run-digital-data-etl` from the command line.
```
(LLM-Engineers-Handbook) PS C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook> poetry poe run-digital-data-etl
Poe => poetry run python -m tools.run --run-etl --no-cache --etl-config-filename digital_data_etl_maxime_labonne.yaml
2024-11-13 21:05:06.001 | INFO | llm_engineering.settings:load_settings:94 - Loading settings from the ZenML secret store.
Your ZenML client version (0.67.0) does not match the server version (0.68.1). This version mismatch might lead to errors or unexpected behavior. To disable this warning message, set the environment variable ZENML_DISABLE_CLIENT_SERVER_MISMATCH_WARNING=True
2024-11-13 21:05:08.831 | WARNING | llm_engineering.settings:load_settings:99 - Failed to load settings from the ZenML secret store. Defaulting to loading the settings from the '.env' file.
2024-11-13 21:05:08.929 | INFO | llm_engineering.infrastructure.db.mongo:__new__:20 - Connection to MongoDB with URI successful: mongodb://llm_engineering:llm_engineering@127.0.0.1:27017
PyTorch version 2.4.0 available.
2024-11-13 21:05:12.004 | INFO | llm_engineering.infrastructure.db.qdrant:__new__:29 - Connection to Qdrant DB with URI successful: localhost:6333
Chromedriver is already installed.
USER_AGENT environment variable not set, consider setting it to identify your requests.
sagemaker.config INFO - Not applying SDK defaults from location: C:\ProgramData\sagemaker\sagemaker\config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: C:\Users\jefer\AppData\Local\sagemaker\sagemaker\config.yaml
Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2
C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Initiating a new run for the pipeline: digital_data_etl.
Not including stack component settings with key orchestrator.sagemaker.

Traceback (most recent call last):
  in _run_module_as_main:198
  in _run_code:88

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\tools\run.py:200 in <module>
  ❱ 200     main()

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\click\core.py:1130 in __call__
  ❱ 1130    return self.main(*args, **kwargs)

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\click\core.py:1055 in main
  ❱ 1055    rv = self.invoke(ctx)

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\click\core.py:1404 in invoke
  ❱ 1404    return ctx.invoke(self.callback, **ctx.params)

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\click\core.py:760 in invoke
  ❱ 760     return __callback(*args, **kwargs)

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\tools\run.py:159 in main
  ❱ 159     digital_data_etl.with_options(**pipeline_args)(**run_args_etl)

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\new\pipelines\pipeline.py:1386 in __call__
  ❱ 1386    return self._run(**self._run_args)

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\new\pipelines\pipeline.py:748 in _run
  ❱ 748     deployment_model = Client().zen_store.create_deployment(

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\zen_stores\rest_zen_store.py:1544 in create_deployment
  ❱ 1544    return self._create_workspace_scoped_resource(

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\zen_stores\rest_zen_store.py:4362 in _create_workspace_scoped_resource
  ❱ 4362    return self._create_resource(

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\zen_stores\rest_zen_store.py:4341 in _create_resource
  ❱ 4341    return response_model.model_validate(response_body)

  C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\pydantic\main.py:568 in model_validate
  ❱ 568     return cls.__pydantic_validator__.validate_python(

ValidationError: 2 validation errors for PipelineDeploymentResponse
metadata.step_configurations.get_or_create_user.config.outputs.user.artifact_config
  Extra inputs are not permitted [type=extra_forbidden, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.8/v/extra_forbidden
metadata.step_configurations.crawl_links.config.outputs.crawled_links.artifact_config
  Extra inputs are not permitted [type=extra_forbidden, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.8/v/extra_forbidden
Error: Sequence aborted after failed subtask 'run-digital-data-etl-maxime'
```
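The version-mismatch warning near the top of the log ("Your ZenML client version (0.67.0) does not match the server version (0.68.1)") looks relevant, since the failure is Pydantic rejecting extra fields on the `PipelineDeploymentResponse` returned by the server. A minimal sketch to double-check both versions from inside the project environment; the `get_store_info()` call is an assumption about the ZenML client API and may differ between releases:

```python
# Minimal sketch to compare the local ZenML client version with the server it talks to.
# The `get_store_info()` call is an assumption about the client API and may differ
# between ZenML releases.
import zenml
from zenml.client import Client

print("client version:", zenml.__version__)  # 0.67.0 in the failing run above

try:
    # Assumed API: the active zen store exposes server metadata, including its version.
    print("server version:", Client().zen_store.get_store_info().version)
except AttributeError:
    print("server version: 0.68.1 (from the mismatch warning in the log)")
```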
Environment:
- Python: 3.11.8
- System: Windows
- OS version: 10.0.22631
- Release name: 10
- Architecture: AMD64
- Full version: Windows-10-10.0.22631-SP0
Installed packages:

```
Package             Version
------------------- ---------
alembic             1.8.1
annotated-types     0.7.0
asttokens           2.4.1
bcrypt              4.0.1
certifi             2024.8.30
charset-normalizer  3.4.0
click               8.1.3
cloudpickle         2.2.1
colorama            0.4.6
comm                0.2.1
debugpy             1.8.0
decorator           5.1.1
distro              1.9.0
docker              7.1.0
executing           2.0.1
gitdb               4.0.11
GitPython           3.1.43
greenlet            3.1.1
idna                3.10
ipykernel           6.29.0
ipython             8.20.0
ipywidgets          8.1.5
jedi                0.19.1
jupyter_client      8.6.0
jupyter_core        5.7.1
jupyterlab_widgets  3.0.13
Mako                1.3.6
markdown-it-py      3.0.0
MarkupSafe          3.0.2
matplotlib-inline   0.1.6
mdurl               0.1.2
mysqlclient         2.2.0
nest-asyncio        1.6.0
packaging           24.2
parso               0.8.3
passlib             1.7.4
pip                 24.0
platformdirs        4.1.0
prompt-toolkit      3.0.43
psutil              5.9.8
pure-eval           0.2.2
pydantic            2.8.2
pydantic_core       2.20.1
pydantic-settings   2.6.1
Pygments            2.17.2
PyMySQL             1.1.1
python-dateutil     2.8.2
python-dotenv       1.0.1
pywin32             306
PyYAML              6.0.2
pyzmq               25.1.2
requests            2.32.3
rich                13.9.4
setuptools          65.5.0
six                 1.16.0
smmap               5.0.1
SQLAlchemy          2.0.35
SQLAlchemy-Utils    0.41.2
sqlmodel            0.0.18
stack-data          0.6.3
tornado             6.4
traitlets           5.14.1
typing_extensions   4.12.2
urllib3             2.2.3
wcwidth             0.2.13
widgetsnbextension  4.0.13
zenml               0.68.1
```
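Note that this list shows `zenml 0.68.1`, while the runtime warning above reports a 0.67.0 client, so the list may have been taken from a different interpreter than the project's `.venv`. A small sketch to confirm which installation the pipeline actually imports; the script name is illustrative and not part of the repository:

```python
# check_zenml_env.py -- illustrative name, not part of the repository.
# Run with `poetry run python check_zenml_env.py` to see which interpreter and
# which ZenML installation the Poe task actually uses.
import sys
import zenml

print("interpreter:", sys.executable)    # should point into the project's .venv
print("zenml version:", zenml.__version__)
print("zenml location:", zenml.__file__)
```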