Teradata / jupyter-demos


Telco Customer Churn Failing on XGBoost - Classification #639

Closed JH255095 closed 5 months ago

JH255095 commented 5 months ago

The notebook performs classification with XGBoost. The relevant model definition cells appear correct:

XGBoost_model = XGBoost(
                            data = df_train,
                            input_columns = ['1:8','10:33'],
                            response_column = 'Churn',
                            model_type = 'CLASSIFICATION',
)
XGBoostPredict_out = XGBoostPredict(
                                        newdata = df_test,
                                        object = XGBoost_model.result,
                                        id_column = 'CustomerID',
                                        accumulate = 'Churn',
                                        model_type = 'CLASSIFICATION',
                                        object_order_column = ['task_index', 'tree_num', 'iter', 'class_num', 'tree_order'],
                                        output_responses = ['0', '1'],
                                        output_prob = True
)

The second command fails with the error below. This was tested in an environment created on June 3, 2024:

---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
File /opt/conda/lib/python3.9/site-packages/teradataml/analytics/analytic_function_executor.py:191, in _AnlyticFunctionExecutor._execute_query(self, persist, volatile)
    190 try:
--> 191     __execute(*__execute_params)
    193     # List stores names of the functions that will produce "output" attribute
    194     # when more than one results are expected.

File /opt/conda/lib/python3.9/site-packages/teradataml/common/utils.py:793, in UtilFuncs._create_view(view_name, query)
    792 try:
--> 793     UtilFuncs._execute_ddl_statement(crt_view)
    794     return True

File /opt/conda/lib/python3.9/site-packages/teradataml/common/utils.py:647, in UtilFuncs._execute_ddl_statement(ddl_statement)
    646 cursor = conn.cursor()
--> 647 cursor.execute(ddl_statement)
    649 # Warnings are displayed when the "suppress_vantage_runtime_warnings" attribute is set to 'False'.

File /opt/conda/lib/python3.9/site-packages/teradatasql/__init__.py:686, in TeradataCursor.execute(self, sOperation, params, ignoreErrors)
    685 if not params:
--> 686     self.executemany (sOperation, None, ignoreErrors)
    688 elif type (params [0]) in [list, tuple]:
    689     # Excerpt from PEP 249 DBAPI documentation:
    690     #  The parameters may also be specified as list of tuples to e.g. insert multiple rows in a single
    691     #  operation, but this kind of usage is deprecated: .executemany() should be used instead.

File /opt/conda/lib/python3.9/site-packages/teradatasql/__init__.py:933, in TeradataCursor.executemany(self, sOperation, seqOfParams, ignoreErrors)
    931         return
--> 933     raise OperationalError (sErr)
    935 if self.connection.bTimingLog:

OperationalError: [Version 17.20.0.0] [Session 1038] [Teradata Database] [Error 7810] Error in function TD_XGBoostPredict: With ModelTable as Classification, ModelType argument cannot be set to Regression.
 at gosqldriver/teradatasql.formatError ErrorUtil.go:88
 at gosqldriver/teradatasql.(*teradataConnection).formatDatabaseError ErrorUtil.go:216
 at gosqldriver/teradatasql.(*teradataConnection).makeChainedDatabaseError ErrorUtil.go:232
 at gosqldriver/teradatasql.(*teradataConnection).processErrorParcel TeradataConnection.go:803
 at gosqldriver/teradatasql.(*TeradataRows).processResponseBundle TeradataRows.go:2229
 at gosqldriver/teradatasql.(*TeradataRows).executeSQLRequest TeradataRows.go:814
 at gosqldriver/teradatasql.newTeradataRows TeradataRows.go:673
 at gosqldriver/teradatasql.(*teradataStatement).QueryContext TeradataStatement.go:122
 at gosqldriver/teradatasql.(*teradataConnection).QueryContext TeradataConnection.go:1304
 at database/sql.ctxDriverQuery ctxutil.go:48
 at database/sql.(*DB).queryDC.func1 sql.go:1759
 at database/sql.withLock sql.go:3437
 at database/sql.(*DB).queryDC sql.go:1754
 at database/sql.(*Conn).QueryContext sql.go:2013
 at main.goCreateRows goside.go:666
 at _cgoexp_b901301bef36_goCreateRows _cgo_gotypes.go:340
 at runtime.cgocallbackg1 cgocall.go:314
 at runtime.cgocallbackg cgocall.go:233
 at runtime.cgocallback asm_amd64.s:971
 at runtime.goexit asm_amd64.s:1571

During handling of the above exception, another exception occurred:

TeradataMlException                       Traceback (most recent call last)
Input In [67], in <cell line: 1>()
----> 1 XGBoostPredict_out = XGBoostPredict(
      2                                         newdata = df_test,
      3                                         object = XGBoost_model.result,
      4                                         id_column = 'CustomerID',
      5                                         accumulate = 'Churn',
      6                                         model_type = 'CLASSIFICATION',
      7                                         object_order_column = ['task_index', 'tree_num', 'iter', 'class_num', 'tree_order'],
      8                                         output_responses = ['0', '1'],
      9                                         output_prob = True
     10 )

File /opt/conda/lib/python3.9/site-packages/teradataml/analytics/sqle/__init__.py:110, in <lambda>(self, **kwargs)
    108 for assoc_cl in _get_associated_parent_classes(func):
    109     _c = _c + (assoc_cl, )
--> 110 globals()[func] = type("{}".format(func), _c, {"__init__": lambda self, **kwargs: _common_init(self, 'sqle',
    111                                                           **kwargs), "__doc__": _AnalyticFunction.__doc__})

File /opt/conda/lib/python3.9/site-packages/teradataml/analytics/meta_class.py:188, in _common_init(self, function_type, **kwargs)
    186 if function_type == 'sqle':
    187     from teradataml.analytics.analytic_function_executor import _SQLEFunctionExecutor
--> 188     self.obj = _SQLEFunctionExecutor(self.__class__.__name__)._execute_function(**kwargs)
    189 elif function_type == 'uaf':
    190     from teradataml.analytics.analytic_function_executor import _UAFFunctionExecutor

File /opt/conda/lib/python3.9/site-packages/teradataml/analytics/analytic_function_executor.py:711, in _AnlyticFunctionExecutor._execute_function(self, skip_input_arg_processing, skip_output_arg_processing, skip_other_arg_processing, skip_func_output_processing, skip_dyn_cls_processing, **kwargs)
    708 if display.print_sqlmr_query:
    709     print(self.sqlmr_query)
--> 711 self._execute_query(persist, volatile)
    713 if not skip_func_output_processing:
    714     self._process_function_output(**kwargs)

File /opt/conda/lib/python3.9/site-packages/teradataml/analytics/analytic_function_executor.py:210, in _AnlyticFunctionExecutor._execute_query(self, persist, volatile)
    207             print("{} data stored in table '{}'".format(output_attribute, table_name))
    209 except Exception as emsg:
--> 210     raise TeradataMlException(Messages.get_message(MessageCodes.TDMLDF_EXEC_SQL_FAILED, str(emsg)),
    211                               MessageCodes.TDMLDF_EXEC_SQL_FAILED)

TeradataMlException: [Teradata][teradataml](TDML_2102) Failed to execute SQL: '[Version 17.20.0.0] [Session 1038] [Teradata Database] [Error 7810] Error in function TD_XGBoostPredict: With ModelTable as Classification, ModelType argument cannot be set to Regression.
 at gosqldriver/teradatasql.formatError ErrorUtil.go:88
 at gosqldriver/teradatasql.(*teradataConnection).formatDatabaseError ErrorUtil.go:216
 at gosqldriver/teradatasql.(*teradataConnection).makeChainedDatabaseError ErrorUtil.go:232
 at gosqldriver/teradatasql.(*teradataConnection).processErrorParcel TeradataConnection.go:803
 at gosqldriver/teradatasql.(*TeradataRows).processResponseBundle TeradataRows.go:2229
 at gosqldriver/teradatasql.(*TeradataRows).executeSQLRequest TeradataRows.go:814
 at gosqldriver/teradatasql.newTeradataRows TeradataRows.go:673
 at gosqldriver/teradatasql.(*teradataStatement).QueryContext TeradataStatement.go:122
 at gosqldriver/teradatasql.(*teradataConnection).QueryContext TeradataConnection.go:1304
 at database/sql.ctxDriverQuery ctxutil.go:48
 at database/sql.(*DB).queryDC.func1 sql.go:1759
 at database/sql.withLock sql.go:3437
 at database/sql.(*DB).queryDC sql.go:1754
 at database/sql.(*Conn).QueryContext sql.go:2013
 at main.goCreateRows goside.go:666
 at _cgoexp_b901301bef36_goCreateRows _cgo_gotypes.go:340
 at runtime.cgocallbackg1 cgocall.go:314
 at runtime.cgocallbackg cgocall.go:233
 at runtime.cgocallback asm_amd64.s:971
 at runtime.goexit asm_amd64.s:1571'

JH255095 commented 5 months ago

This is most likely related to the change to Teradata version 17.20.03.25. Following up internally.
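
For anyone triaging similar reports, here is a minimal sketch that pulls the relevant fields out of the database error text and checks whether the model type the database rejected differs from the one passed in the call. The helper names `parse_model_type_error` and `argument_was_overridden` are hypothetical and not part of teradataml; the error string is copied from the traceback above. A mismatch only suggests that the argument did not reach `TD_XGBoostPredict` as written; it does not identify the cause.

```python
import re

# Error text copied from the traceback in this issue.
ERROR_TEXT = (
    "[Version 17.20.0.0] [Session 1038] [Teradata Database] [Error 7810] "
    "Error in function TD_XGBoostPredict: With ModelTable as Classification, "
    "ModelType argument cannot be set to Regression."
)

# Pattern for this specific message shape; other Teradata errors will not match.
_MODEL_TYPE_ERROR = re.compile(
    r"\[Error (?P<code>\d+)\] Error in function (?P<function>\w+): "
    r"With ModelTable as (?P<model_table>\w+), "
    r"ModelType argument cannot be set to (?P<rejected>\w+)"
)


def parse_model_type_error(text):
    """Hypothetical helper: return the error fields as a dict, or None if no match."""
    match = _MODEL_TYPE_ERROR.search(text)
    return match.groupdict() if match else None


def argument_was_overridden(requested, parsed):
    """True when the rejected model type differs from the one the caller passed."""
    return parsed is not None and requested.lower() != parsed["rejected"].lower()


info = parse_model_type_error(ERROR_TEXT)
print(info)
# The call in this issue passed model_type = 'CLASSIFICATION'.
print(argument_was_overridden("CLASSIFICATION", info))
```

Here the call passed `model_type = 'CLASSIFICATION'` while the database reports rejecting `Regression`, which is consistent with the argument being dropped or replaced somewhere between the notebook and the database.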