statmike / vertex-ai-mlops

Google Cloud Platform Vertex AI end-to-end workflows for machine learning operations
Apache License 2.0

Tabular-dataset-create_notebook_02c #22

Open LEON240196 opened 1 year ago

LEON240196 commented 1 year ago

Hello Mike, I hope everything is going well. I tried to run notebook 02c, but there seems to be a problem when running the pipeline: the Tabular-dataset-create step throws an error and stops the process. I tried to debug it, but the error logs are not very friendly. I even tried running the pipeline with only the tabular-dataset-create component to see where the problem is, but no luck. I also checked the IAM roles, and everything looks OK.

I hope you can help me, thanks in advance (:

statmike commented 1 year ago

Hello @LEON240196, at first glance it looks like I have two cells out of order, which causes an error to fire.

The cell that starts with print(f"Review the Pipeline... needs to come after the cell with

response = pipeline.run(
    service_account = SERVICE_ACCOUNT
)

Did your run create a pipeline in the console and then fail, or did it fail to even create the pipeline? The latter was the case for me, due to the cell order described above. I am running this notebook with the fix now. Once it completes, I will push it to the repository and comment back here so you can give it a try.
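To illustrate the ordering point, here is a minimal sketch (not the notebook's actual cells) that runs the job first and only then builds a console link from its resource name. The helper name `run_then_report`, its parameters, and the console URL pattern are assumptions for illustration; `pipeline` is expected to behave like a Vertex AI SDK `PipelineJob`.

```python
def run_then_report(pipeline, service_account, project_id, region):
    """Run the pipeline job first, then build a console link for that run.

    Hypothetical helper: `pipeline` is assumed to expose run() and
    resource_name the way the Vertex AI SDK's PipelineJob does.
    """
    # The job must be created before its resource name can be read,
    # so the run call has to come before anything that prints the link.
    pipeline.run(service_account=service_account)

    # The last path segment of the resource name is the run ID.
    run_id = pipeline.resource_name.split("/")[-1]

    # Assumed console URL pattern for a pipeline run.
    url = (
        f"https://console.cloud.google.com/vertex-ai/locations/{region}"
        f"/pipelines/runs/{run_id}?project={project_id}"
    )
    print(f"Pipeline run in console: {url}")
    return url
```

Note that `run()` blocks until the job finishes, so the link is printed at the end; if the link is wanted while the job is still running, the SDK's non-blocking `submit()` may be a better fit.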

Thank you for pointing this out!

statmike commented 1 year ago

Hello @LEON240196, I made the change above and a few more additions + corrections + clarifications throughout and tested the notebook. Please let me know if you still face any issues with the new version that is now pushed.

I appreciate your taking the time to point out the notebook was not working!

its-all-relative commented 3 months ago

Hi @statmike.

1. This issue persists when I run the current pipeline in notebook 02c. On the console, the error points to the tabular-dataset-create task. I cannot narrow down the issue using the docs.

Summary Message on Console - The DAG failed because some tasks failed. The failed tasks are: [tabular-dataset-create].; Job (project_id = , job_id = ) is failed due to the above error.; Failed to handle the job: {project_number = , job_id = }
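As a small aid for triage, the failed task names can be pulled out of a summary message like the one above. This is only a sketch: the helper name `failed_tasks` is made up here, and it assumes that multiple task names, if present, are comma-separated inside the square brackets.

```python
import re
from typing import List


def failed_tasks(summary: str) -> List[str]:
    """Extract task names from a 'The failed tasks are: [...]' summary."""
    match = re.search(r"The failed tasks are: \[([^\]]*)\]", summary)
    if match is None:
        return []
    # Assumption: multiple names are separated by commas.
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


message = (
    "The DAG failed because some tasks failed. "
    "The failed tasks are: [tabular-dataset-create].; "
    "Job (project_id = , job_id = ) is failed due to the above error."
)
print(failed_tasks(message))  # ['tabular-dataset-create']
```

The summary only names the task; the underlying cause usually has to be read from that task's own logs.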


2. Since the endpoint location is not set, it defaults to us-central1. The other services in my project are all in asia-southeast1 (Singapore). Does it matter in any way that the endpoint is deployed in a different region?

Can you guide me? Thanks.
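On the region question, the thread does not say whether a cross-region endpoint affects this pipeline. If keeping everything in one region is preferred, a minimal sketch of pinning the SDK's default location could look like this. The helper names and the `my-project` placeholder are hypothetical; `aiplatform.init(project=..., location=...)` is the SDK call being assumed.

```python
def regional_api_endpoint(region: str) -> str:
    """Return the regional Vertex AI API host for a region."""
    return f"{region}-aiplatform.googleapis.com"


def init_vertex(project_id: str, region: str) -> None:
    """Set the SDK's default project and location for later calls."""
    # Imported lazily so this sketch loads without the SDK installed;
    # calling it requires the google-cloud-aiplatform package.
    from google.cloud import aiplatform

    aiplatform.init(project=project_id, location=region)


print(regional_api_endpoint("asia-southeast1"))
# asia-southeast1-aiplatform.googleapis.com

# Example call (placeholder project ID, not run here):
# init_vertex("my-project", "asia-southeast1")
```

Resources created after `init_vertex` would then default to that region unless a call passes its own location.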