PacktPublishing / LLM-Engineers-Handbook

The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices
https://www.amazon.com/LLM-Engineers-Handbook-engineering-production/dp/1836200072/
MIT License

Error in poetry poe run-digital-data-etl #15

Open Jeferson100 opened 2 days ago

Jeferson100 commented 2 days ago

Running the command poetry poe run-digital-data-etl from the command line fails with the following error.

(LLM-Engineers-Handbook) PS C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook> poetry poe run-digital-data-etl
Poe => poetry run python -m tools.run --run-etl --no-cache --etl-config-filename digital_data_etl_maxime_labonne.yaml
2024-11-13 21:05:06.001 | INFO     | llm_engineering.settings:load_settings:94 - Loading settings from the ZenML secret store.
Your ZenML client version (0.67.0) does not match the server version (0.68.1). This version mismatch might lead to errors or unexpected behavior. 
To disable this warning message, set the environment variable ZENML_DISABLE_CLIENT_SERVER_MISMATCH_WARNING=True
2024-11-13 21:05:08.831 | WARNING  | llm_engineering.settings:load_settings:99 - Failed to load settings from the ZenML secret store. Defaulting to loading the settings from the '.env' file.
2024-11-13 21:05:08.929 | INFO     | llm_engineering.infrastructure.db.mongo:__new__:20 - Connection to MongoDB with URI successful: mongodb://llm_engineering:llm_engineering@127.0.0.1:27017
PyTorch version 2.4.0 available.
2024-11-13 21:05:12.004 | INFO     | llm_engineering.infrastructure.db.qdrant:__new__:29 - Connection to Qdrant DB with URI successful: localhost:6333
Chromedriver is already installed.
USER_AGENT environment variable not set, consider setting it to identify your requests.
sagemaker.config INFO - Not applying SDK defaults from location: C:\ProgramData\sagemaker\sagemaker\config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: C:\Users\jefer\AppData\Local\sagemaker\sagemaker\config.yaml
Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2
C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884      
  warnings.warn(
Initiating a new run for the pipeline: digital_data_etl.
Not including stack component settings with key orchestrator.sagemaker.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in _run_module_as_main:198                                                                       │
│ in _run_code:88                                                                                  │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\tools\run.py:200 in <module>         │
│                                                                                                  │
│   197                                                                                            │
│   198                                                                                            │
│   199 if __name__ == "__main__":                                                                 │
│ ❱ 200 │   main()                                                                                 │
│   201                                                                                            │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\click\core.p │
│ y:1130 in __call__                                                                               │
│                                                                                                  │
│   1127 │                                                                                         │
│   1128 │   def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                           │
│   1129 │   │   """Alias for :meth:`main`."""                                                     │
│ ❱ 1130 │   │   return self.main(*args, **kwargs)                                                 │
│   1131                                                                                           │
│   1132                                                                                           │
│   1133 class Command(BaseCommand):                                                               │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\click\core.p │
│ y:1055 in main                                                                                   │
│                                                                                                  │
│   1052 │   │   try:                                                                              │
│   1053 │   │   │   try:                                                                          │
│   1054 │   │   │   │   with self.make_context(prog_name, args, **extra) as ctx:                  │
│ ❱ 1055 │   │   │   │   │   rv = self.invoke(ctx)                                                 │
│   1056 │   │   │   │   │   if not standalone_mode:                                               │
│   1057 │   │   │   │   │   │   return rv                                                         │
│   1058 │   │   │   │   │   # it's not safe to `ctx.exit(rv)` here!                               │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\click\core.p │
│ y:1404 in invoke                                                                                 │
│                                                                                                  │
│   1401 │   │   │   echo(style(message, fg="red"), err=True)                                      │
│   1402 │   │                                                                                     │
│   1403 │   │   if self.callback is not None:                                                     │
│ ❱ 1404 │   │   │   return ctx.invoke(self.callback, **ctx.params)                                │
│   1405 │                                                                                         │
│   1406 │   def shell_complete(self, ctx: Context, incomplete: str) -> t.List["CompletionItem"]:  │
│   1407 │   │   """Return a list of completions for the incomplete value. Looks                   │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\click\core.p │
│ y:760 in invoke                                                                                  │
│                                                                                                  │
│    757 │   │                                                                                     │
│    758 │   │   with augment_usage_errors(__self):                                                │
│    759 │   │   │   with ctx:                                                                     │
│ ❱  760 │   │   │   │   return __callback(*args, **kwargs)                                        │
│    761 │                                                                                         │
│    762 │   def forward(                                                                          │
│    763 │   │   __self, __cmd: "Command", *args: t.Any, **kwargs: t.Any  # noqa: B902             │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\tools\run.py:159 in main             │
│                                                                                                  │
│   156 │   │   pipeline_args["config_path"] = root_dir / "configs" / etl_config_filename          │
│   157 │   │   assert pipeline_args["config_path"].exists(), f"Config file not found: {pipeline   │
│   158 │   │   pipeline_args["run_name"] = f"digital_data_etl_run_{dt.now().strftime('%Y_%m_%d_   │
│ ❱ 159 │   │   digital_data_etl.with_options(**pipeline_args)(**run_args_etl)                     │
│   160 │                                                                                          │
│   161 │   if run_export_artifact_to_json:                                                        │
│   162 │   │   run_args_etl = {}                                                                  │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\new\pi │
│ pelines\pipeline.py:1386 in __call__                                                             │
│                                                                                                  │
│   1383 │   │   │   return self.entrypoint(*args, **kwargs)                                       │
│   1384 │   │                                                                                     │
│   1385 │   │   self.prepare(*args, **kwargs)                                                     │
│ ❱ 1386 │   │   return self._run(**self._run_args)                                                │
│   1387 │                                                                                         │
│   1388 │   def _call_entrypoint(self, *args: Any, **kwargs: Any) -> None:                        │
│   1389 │   │   """Calls the pipeline entrypoint function with the given arguments.               │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\new\pi │
│ pelines\pipeline.py:748 in _run                                                                  │
│                                                                                                  │
│    745 │   │   │   │   code_path=code_path,                                                      │
│    746 │   │   │   │   **deployment.model_dump(),                                                │
│    747 │   │   │   )                                                                             │
│ ❱  748 │   │   │   deployment_model = Client().zen_store.create_deployment(                      │
│    749 │   │   │   │   deployment=deployment_request                                             │
│    750 │   │   │   )                                                                             │
│    751                                                                                           │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\zen_st │
│ ores\rest_zen_store.py:1544 in create_deployment                                                 │
│                                                                                                  │
│   1541 │   │   Returns:                                                                          │
│   1542 │   │   │   The newly created deployment.                                                 │
│   1543 │   │   """                                                                               │
│ ❱ 1544 │   │   return self._create_workspace_scoped_resource(                                    │
│   1545 │   │   │   resource=deployment,                                                          │
│   1546 │   │   │   route=PIPELINE_DEPLOYMENTS,                                                   │
│   1547 │   │   │   response_model=PipelineDeploymentResponse,                                    │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\zen_st │
│ ores\rest_zen_store.py:4362 in _create_workspace_scoped_resource                                 │
│                                                                                                  │
│   4359 │   │   Returns:                                                                          │
│   4360 │   │   │   The created resource.                                                         │
│   4361 │   │   """                                                                               │
│ ❱ 4362 │   │   return self._create_resource(                                                     │
│   4363 │   │   │   resource=resource,                                                            │
│   4364 │   │   │   response_model=response_model,                                                │
│   4365 │   │   │   route=f"{WORKSPACES}/{str(resource.workspace)}{route}",                       │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\zenml\zen_st │
│ ores\rest_zen_store.py:4341 in _create_resource                                                  │
│                                                                                                  │
│   4338 │   │   """                                                                               │
│   4339 │   │   response_body = self.post(f"{route}", body=resource, params=params)               │
│   4340 │   │                                                                                     │
│ ❱ 4341 │   │   return response_model.model_validate(response_body)                               │
│   4342 │                                                                                         │
│   4343 │   def _create_workspace_scoped_resource(                                                │
│   4344 │   │   self,                                                                             │
│                                                                                                  │
│ C:\Users\jefer\Documents\Livros\LLMs\LLM-Engineers-Handbook\.venv\Lib\site-packages\pydantic\mai │
│ n.py:568 in model_validate                                                                       │
│                                                                                                  │
│    565 │   │   """                                                                               │
│    566 │   │   # `__tracebackhide__` tells pytest and some other tools to omit this function fr  │
│    567 │   │   __tracebackhide__ = True                                                          │
│ ❱  568 │   │   return cls.__pydantic_validator__.validate_python(                                │
│    569 │   │   │   obj, strict=strict, from_attributes=from_attributes, context=context          │
│    570 │   │   )                                                                                 │
│    571                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValidationError: 2 validation errors for PipelineDeploymentResponse
metadata.step_configurations.get_or_create_user.config.outputs.user.artifact_config
  Extra inputs are not permitted [type=extra_forbidden, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.8/v/extra_forbidden
metadata.step_configurations.crawl_links.config.outputs.crawled_links.artifact_config
  Extra inputs are not permitted [type=extra_forbidden, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.8/v/extra_forbidden
Error: Sequence aborted after failed subtask 'run-digital-data-etl-maxime'
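One plausible reading of the ValidationError above, not confirmed anywhere in this thread, is that the 0.68.1 server returns an `artifact_config` key that the response model in the 0.67.0 client does not declare, and that the model rejects undeclared keys. The sketch below imitates that behaviour with the standard library only. It is not ZenML or pydantic code: `OutputConfig`, `known_field`, `parse_strict`, and `ExtraForbiddenError` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


class ExtraForbiddenError(ValueError):
    """Raised when a payload carries keys the model does not declare."""


@dataclass
class OutputConfig:
    # Hypothetical stand-in for a field the older client already knows about.
    known_field: Optional[str] = None


def parse_strict(model: type, payload: Dict[str, Any]) -> Any:
    """Build `model` from `payload`, rejecting any undeclared key.

    This mimics a "forbid extra inputs" validation policy: even a key whose
    value is None is rejected if the model has no field with that name.
    """
    allowed = {f.name for f in fields(model)}
    extra = sorted(set(payload) - allowed)
    if extra:
        raise ExtraForbiddenError(f"Extra inputs are not permitted: {extra}")
    return model(**payload)


if __name__ == "__main__":
    # A payload shaped the way the older client expects is accepted.
    print(parse_strict(OutputConfig, {"known_field": "value"}))

    # A payload with an additional key is rejected, even though it is None.
    try:
        parse_strict(OutputConfig, {"known_field": "value", "artifact_config": None})
    except ExtraForbiddenError as exc:
        print("rejected:", exc)
```

If this reading is right, the two failing paths in the error (`get_or_create_user` and `crawl_links`) would both fail for the same reason, which is consistent with both reporting `input_value=None`.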
Python: 3.11.8
System: Windows
OS version: 10.0.22631
Release name: 10
Architecture: AMD64
Full version: Windows-10-10.0.22631-SP0
Package            Version
------------------ ---------
alembic            1.8.1
annotated-types    0.7.0
asttokens          2.4.1
bcrypt             4.0.1
certifi            2024.8.30
charset-normalizer 3.4.0
click              8.1.3
cloudpickle        2.2.1
colorama           0.4.6
comm               0.2.1
debugpy            1.8.0
decorator          5.1.1
distro             1.9.0
docker             7.1.0
executing          2.0.1
gitdb              4.0.11
GitPython          3.1.43
greenlet           3.1.1
idna               3.10
ipykernel          6.29.0
ipython            8.20.0
ipywidgets         8.1.5
jedi               0.19.1
jupyter_client     8.6.0
jupyter_core       5.7.1
jupyterlab_widgets 3.0.13
Mako               1.3.6
markdown-it-py     3.0.0
MarkupSafe         3.0.2
matplotlib-inline  0.1.6
mdurl              0.1.2
mysqlclient        2.2.0
nest-asyncio       1.6.0
packaging          24.2
parso              0.8.3
passlib            1.7.4
pip                24.0
platformdirs       4.1.0
prompt-toolkit     3.0.43
psutil             5.9.8
pure-eval          0.2.2
pydantic           2.8.2
pydantic_core      2.20.1
pydantic-settings  2.6.1
Pygments           2.17.2
PyMySQL            1.1.1
python-dateutil    2.8.2
python-dotenv      1.0.1
pywin32            306
PyYAML             6.0.2
pyzmq              25.1.2
requests           2.32.3
rich               13.9.4
setuptools         65.5.0
six                1.16.0
smmap              5.0.1
SQLAlchemy         2.0.35
SQLAlchemy-Utils   0.41.2
sqlmodel           0.0.18
stack-data         0.6.3
tornado            6.4
traitlets          5.14.1
typing_extensions  4.12.2
urllib3            2.2.3
wcwidth            0.2.13
widgetsnbextension 4.0.13
zenml              0.68.1
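The log reports a ZenML client version of 0.67.0 against a 0.68.1 server, while the package list shows zenml 0.68.1 and lacks packages the log clearly loads (PyTorch, transformers). The list may therefore come from a different environment than the `.venv` that Poetry uses. A minimal sketch to check which versions the active interpreter actually sees follows. The helper names `installed_version` and `same_release` and the file name `check_versions.py` are assumptions for illustration. Only `importlib.metadata` from the standard library is used.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the version of `package` visible to this interpreter, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def same_release(client: str, server: str) -> bool:
    """Compare the first three dot-separated components of two version strings."""
    return client.split(".")[:3] == server.split(".")[:3]


if __name__ == "__main__":
    # Shows which environment is being inspected.
    print("interpreter:", sys.executable)
    for name in ("zenml", "pydantic"):
        print(f"{name}: {installed_version(name)}")

    # Versions copied from the warning in the log above.
    print("client matches server:", same_release("0.67.0", "0.68.1"))
```

Running it as `poetry run python check_versions.py` from the repository root inspects the environment Poetry uses. Running it with a bare `python` inspects whichever interpreter is first on the PATH, so the two results can differ.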