lewagon / data-engineering-setup

SQL Query breaks on main.py of MLOps challenges. #47

Closed boemer00 closed 6 months ago

boemer00 commented 6 months ago

issue: GCP project IDs are randomly generated. However, GCP sometimes creates project IDs containing words that are also SQL keywords, such as INNER or UNION. It might not be common, but I have seen it happen a couple of times.

when: MLOps Week interface > main.py

Change: the query code in main.py should change from:

query = f"""
        SELECT {",".join(COLUMN_NAMES_RAW)}
        FROM {GCP_PROJECT_WAGON}.{BQ_DATASET}.raw_{DATA_SIZE}
        WHERE pickup_datetime BETWEEN '{min_date}' AND '{max_date}'
        ORDER BY pickup_datetime
    """

to

query = f"""
        SELECT {",".join(COLUMN_NAMES_RAW)}
        FROM `{GCP_PROJECT_WAGON}.{BQ_DATASET}.raw_{DATA_SIZE}`
        WHERE pickup_datetime BETWEEN '{min_date}' AND '{max_date}'
        ORDER BY pickup_datetime
    """

Just wrapping the table reference in the FROM clause in backticks (`) solves the issue.
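For illustration, the same fix could be factored into a small helper so the quoting is applied in one place. This is only a sketch: `quote_table` and `build_query` are hypothetical names that do not exist in the challenge code, and the project ID and dataset in the usage note are made-up example values.

```python
def quote_table(project: str, dataset: str, table: str) -> str:
    """Return a backtick-quoted, fully qualified table reference.

    Hypothetical helper (not part of the challenge code). Quoting the whole
    path keeps keyword-like segments such as "inner" or "union" in the
    project ID from being read as SQL keywords.
    """
    for part in (project, dataset, table):
        # A backtick inside a name would end the quoted identifier early.
        if "`" in part:
            raise ValueError(f"unexpected backtick in identifier: {part!r}")
    return f"`{project}.{dataset}.{table}`"


def build_query(columns, project, dataset, data_size, min_date, max_date):
    """Build the same query as in main.py, with the table reference quoted."""
    table_ref = quote_table(project, dataset, f"raw_{data_size}")
    return f"""
        SELECT {",".join(columns)}
        FROM {table_ref}
        WHERE pickup_datetime BETWEEN '{min_date}' AND '{max_date}'
        ORDER BY pickup_datetime
    """
```

For example, `quote_table("wagon-inner-123", "taxifare", "raw_1k")` produces the backtick-wrapped string for `wagon-inner-123.taxifare.raw_1k`, which is the form used in the corrected FROM line above.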

lorcanrae commented 6 months ago

@boemer00 Not sure this is the correct repo for this. Could you create an issue on lewagon/teachers/issues?

The note about wrapping the source table in backticks is definitely valid.