GoogleCloudPlatform / mlops-with-vertex-ai

An end-to-end example of MLOps on Google Cloud using TensorFlow, TFX, and Vertex AI
Apache License 2.0

Python IndexError execution error in 03-training-formalization.ipynb #23

Open hilliao opened 2 years ago

hilliao commented 2 years ago

I hit a Python error when executing the following line under [Extract train and eval splits]:

sql_query = datasource_utils.get_training_source_query(
    PROJECT, REGION, DATASET_DISPLAY_NAME, ml_use='UNASSIGNED', limit=5000)

Observed error:

IndexError                                Traceback (most recent call last)
/tmp/ipykernel_1/1584844956.py in <module>
      1 print(DATASET_DISPLAY_NAME)
      2 sql_query = datasource_utils.get_training_source_query(
----> 3     PROJECT, REGION, DATASET_DISPLAY_NAME, ml_use='UNASSIGNED', limit=5000)
      4
      5 output_config = example_gen_pb2.Output(

~/mlops-with-vertex-ai/src/common/datasource_utils.py in get_training_source_query(project, region, dataset_display_name, ml_use, limit)
     55     dataset = vertex_ai.TabularDataset.list(
     56         filter=f"display_name={dataset_display_name}", order_by="update_time"
---> 57     )[-1]
     58     bq_source_uri = dataset.gca_resource.metadata["inputConfig"]["bigquerySource"][
     59         "uri"

IndexError: list index out of range

I can't find a .list method for google.cloud.aiplatform's TabularDataset, which is what datasource_utils.py calls.
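For context on the traceback: the failure is on the `[-1]` index, not on the `.list` call itself, which suggests the call returned an empty list (no dataset matched the display name in that project and region). Below is a minimal sketch of a guard that turns this into a more descriptive error. The helper name `latest_dataset_or_raise` and the stub `fake_list` are hypothetical and not part of the repository; the list function is passed in as a parameter so the sketch can run without a Vertex AI project.

```python
from typing import Any, Callable, Sequence


def latest_dataset_or_raise(
    list_fn: Callable[..., Sequence[Any]], display_name: str
) -> Any:
    """Return the last item of the listing, or raise a descriptive error.

    `list_fn` is assumed to accept `filter` and `order_by` keyword arguments,
    like the `vertex_ai.TabularDataset.list` call shown in the traceback.
    """
    matches = list_fn(filter=f"display_name={display_name}", order_by="update_time")
    if not matches:
        # An empty result is what makes `[-1]` raise IndexError.
        raise LookupError(
            f"No dataset found with display_name={display_name!r}; "
            "check the project, region, and dataset name."
        )
    return matches[-1]


# Stub standing in for the real SDK call, for illustration only.
def fake_list(**kwargs: Any) -> list:
    return []


try:
    latest_dataset_or_raise(fake_list, "my-dataset")  # hypothetical name
except LookupError as err:
    print(err)
```

In the notebook this would presumably be called with `vertex_ai.TabularDataset.list` as `list_fn` and `DATASET_DISPLAY_NAME` as the name, which would show whether the dataset lookup is what comes back empty.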

hilliao commented 2 years ago

The same error also reproduces at the line starting with: raw_data_query = datasource_utils.get_training_source_query(