niall-turbitt / e2e-mlops

[DEPRECATED] Demo repository implementing an end-to-end MLOps workflow on Databricks. Project derived from dbx basic python template

Job telco-churn-initial-model-train-register was requested, but not provided in deployment file #54

Closed di-lin-mckinsey closed 2 years ago

di-lin-mckinsey commented 2 years ago

Hi,

I'm following the tutorial to run an e2e MLOps pipeline. After setting up my Databricks CLI and GitHub secrets, I ran the two commands of the first workflow step:

dbx deploy --jobs=telco-churn-initial-model-train-register --environment=prod --files-only
dbx launch --job=telco-churn-initial-model-train-register --environment=prod --as-run-submit --trace

However, the deployment failed with the following error.

May I know if I have missed any step in between? Thank you.

[dbx][2022-08-02 11:41:28.299] Starting new deployment for environment prod
[dbx][2022-08-02 11:41:28.311] Using profile provided from the project file
[dbx][2022-08-02 11:41:28.318] Found auth config from provider ProfileEnvConfigProvider, verifying it
[dbx][2022-08-02 11:41:28.318] Found auth config from provider ProfileEnvConfigProvider, verification successful
[dbx][2022-08-02 11:41:28.318] Profile e2-demo-west will be used for deployment
[dbx][2022-08-02 11:41:30.899] Auto-discovery found deployment file conf/deployment.yml
[dbx][2022-08-02 11:41:30.939] Deployment will be performed only for the following jobs: ['telco-churn-initial-model-train-register']
Traceback (most recent call last):
  File "/Users/di_lin/opt/anaconda3/envs/erebus/bin/dbx", line 8, in <module>
    sys.exit(cli())
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/dbx/commands/deploy.py", line 169, in deploy
    _preprocess_deployment(deployment, requested_jobs)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/dbx/commands/deploy.py", line 254, in _preprocess_deployment
    deployment["jobs"] = _preprocess_jobs(deployment["jobs"], requested_jobs)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/dbx/commands/deploy.py", line 263, in _preprocess_jobs
    raise Exception(f"Job {requested_job_name} was requested, but not provided in deployment file")
Exception: Job telco-churn-initial-model-train-register was requested, but not provided in deployment file
[dbx][2022-08-02 11:41:32.726] Launching job telco-churn-initial-model-train-register on environment prod
[dbx][2022-08-02 11:41:32.728] Using profile provided from the project file
[dbx][2022-08-02 11:41:32.729] Found auth config from provider ProfileEnvConfigProvider, verifying it
[dbx][2022-08-02 11:41:32.729] Found auth config from provider ProfileEnvConfigProvider, verification successful
[dbx][2022-08-02 11:41:32.729] Profile e2-demo-west will be used for deployment
[dbx][2022-08-02 11:41:34.459] No additional tags provided
Traceback (most recent call last):
  File "/Users/di_lin/opt/anaconda3/envs/erebus/bin/dbx", line 8, in <module>
    sys.exit(cli())
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/dbx/commands/launch.py", line 151, in launch
    run_info = _find_deployment_run(filter_string, additional_tags, as_run_submit, environment)
  File "/Users/di_lin/opt/anaconda3/envs/erebus/lib/python3.9/site-packages/dbx/commands/launch.py", line 250, in _find_deployment_run
    raise Exception(
Exception: "
                Run Submit API is available only when deployment was done with --files-only flag.
                Currently there is no deployments with such flag under given filters.
                Please re-deploy with --files-only flag and then re-run this launch command.
niall-turbitt commented 2 years ago

Hi @di-lin-mckinsey ,

Note this line in the setup:

Configure Databricks CLI connection profile

The project is designed to use 3 different Databricks CLI connection profiles: dev, staging and prod. These profiles are set in e2e-mlops/.dbx/project.json. Note that for demo purposes we use the same connection profile for each of the 3 environments. In practice each profile would correspond to separate dev, staging and prod Databricks workspaces. This project.json file will have to be adjusted accordingly to the connection profiles a user has configured on their local machine.

Given that running dbx deploy from your local machine will use the local dbx profile, you will need to have this configured correctly. See [Databricks CLI connection profile](https://docs.databricks.com/dev-tools/cli/index.html#connection-profiles) on how to do this. If your connection profile is called my_profile, then you will have to replace e2-demo-west with my_profile in e2e-mlops/.dbx/project.json.
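As a minimal sketch of that replacement: the snippet below assumes project.json has a top-level "environments" mapping whose entries each carry a "profile" key. That layout is an assumption, so check it against your own copy of the file. The helper name replace_profile and the sample data are hypothetical.

```python
import json


def replace_profile(project: dict, old: str, new: str) -> dict:
    """Return a copy of the project config with every `old` profile renamed to `new`."""
    # Round-tripping through JSON is a cheap deep copy for JSON-compatible data.
    updated = json.loads(json.dumps(project))
    for env_conf in updated.get("environments", {}).values():
        if env_conf.get("profile") == old:
            env_conf["profile"] = new
    return updated


# Illustrative stand-in for the contents of .dbx/project.json (layout assumed).
sample = {
    "environments": {
        "dev": {"profile": "e2-demo-west"},
        "staging": {"profile": "e2-demo-west"},
        "prod": {"profile": "e2-demo-west"},
    }
}

patched = replace_profile(sample, old="e2-demo-west", new="my_profile")
print(json.dumps(patched, indent=2))
```

To apply this to the real file, load it with json.load, pass the result through replace_profile, and write it back with json.dump. Editing the file by hand works just as well.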

Let me know if you are still running into issues after doing this.

di-lin-mckinsey commented 2 years ago

Hi @niall-turbitt , thanks for your comment. I did follow the instructions to set up my connection profile, and I named my profile e2-demo-west to avoid changing the project.json file. Here is a test of my profile connection.

image

Since the error says "Job telco-churn-initial-model-train-register was requested, but not provided in deployment file", I wonder where the deployment file is and whether the job telco-churn-initial-model-train-register is defined there.

di-lin-mckinsey commented 2 years ago

I see. In the deployment.yml file, the job name is PROD-telco-churn-initial-model-train-register. So should the right command be dbx deploy --jobs=PROD-telco-churn-initial-model-train-register --environment=prod --files-only?
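One way to catch this kind of mismatch before calling dbx deploy is to compare the requested name against the names the deployment file defines. The sketch below works on an already-parsed deployment file and assumes an environments -> (environment name) -> jobs -> name layout. That layout, the helper name check_job, and the job name PROD-hypothetical-other-job are assumptions for illustration only.

```python
import difflib


def check_job(deployment: dict, environment: str, requested: str) -> tuple[bool, list[str]]:
    """Return (found, suggestions) for a requested job name in one environment."""
    jobs = deployment.get("environments", {}).get(environment, {}).get("jobs", [])
    names = [job["name"] for job in jobs]
    if requested in names:
        return True, []
    # Offer the most similar defined names, best match first.
    return False, difflib.get_close_matches(requested, names, n=3)


# Illustrative stand-in for a parsed conf/deployment.yml (layout assumed).
deployment = {
    "environments": {
        "prod": {
            "jobs": [
                {"name": "PROD-telco-churn-initial-model-train-register"},
                {"name": "PROD-hypothetical-other-job"},
            ]
        }
    }
}

found, suggestions = check_job(
    deployment, "prod", "telco-churn-initial-model-train-register"
)
print(found, suggestions)
```

With PyYAML installed, the real file could be parsed with yaml.safe_load and passed in as the deployment argument.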

niall-turbitt commented 2 years ago

Ah, good catch, that is indeed a typo in the README.md. Did the following then work for you? dbx deploy --jobs=PROD-telco-churn-initial-model-train-register --environment=prod --files-only
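In the original report the launch command still ran after the deploy had failed, which produced the second traceback. A small wrapper can stop at the first failing command. This is only a sketch: the helper names build_commands and run_in_order are hypothetical, and it assumes the launch command takes the same PROD- prefixed job name as the deploy command.

```python
import subprocess


def build_commands(job: str, environment: str) -> list[list[str]]:
    """Build the deploy and launch commands, using the flags from the original report."""
    deploy = ["dbx", "deploy", f"--jobs={job}", f"--environment={environment}", "--files-only"]
    launch = ["dbx", "launch", f"--job={job}", f"--environment={environment}",
              "--as-run-submit", "--trace"]
    return [deploy, launch]


def run_in_order(commands: list[list[str]]) -> None:
    """Run each command in turn and stop at the first failure."""
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so launch is never attempted after a failed deploy.
        subprocess.run(command, check=True)


commands = build_commands("PROD-telco-churn-initial-model-train-register", "prod")
for command in commands:
    print(" ".join(command))

# Call run_in_order(commands) to execute them; this needs dbx installed and a
# working connection profile, so it is left out of the sketch.
```

Joining the two commands with && in a shell gives the same stop-on-failure behaviour.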

niall-turbitt commented 2 years ago

I've just updated the README to reflect the changes in the names of the jobs being deployed. I had updated the job names to reflect the environment the job was deployed to. This was to make the job names more distinct from one another if operating in a scenario where you had a single workspace with dev/stage/prod partitioned by ACLs. Please let me know if you run into anything else!

di-lin-mckinsey commented 2 years ago

Hi @niall-turbitt , yes, dbx deploy --jobs=PROD-telco-churn-initial-model-train-register --environment=prod --files-only works fine for me. I have also set up 3 different workspaces for dev, staging and prod, with separate profiles, so that the deployment works in a realistic setting. However, as described in the other issue I opened, I got past the deploy stage but failed again at the launch stage. Could you offer any insights there? Thanks.

niall-turbitt commented 2 years ago

Good to hear. I will close this issue. I've just replied to the other issue as well.