pzmm.ImportModel.import_model failed with TypeError after updating from 1.10.0 to 1.10.1

pulungw commented 11 months ago

Describe the issue: After updating sasctl to 1.10.1, I get a TypeError when calling pzmm.ImportModel.import_model with the same code that worked in 1.10.0.

To Reproduce:

# STEP 5: Import model
pzmm.ScoreCode.score_code = ''
lreg = pzmm.ImportModel.import_model(model_files=model_path, model_prefix=model_prefix, project=project,
                                     input_data=X, predict_method=[model.predict_proba, [float, float]],
                                     score_metrics=score_metrics, overwrite_model=True,
                                     target_values=['0', '1'], model_file_name=model_prefix + ".pickle")

The rest of my sample code can be found here: https://github.com/pulungw/sascode/blob/main/python/model_manager_register_sample_sklearn.ipynb. It is based on this article: https://blogs.sas.com/content/subconsciousmusings/2023/08/11/mlops-for-pirates-and-snakes/

Expected behavior: pzmm.ImportModel.import_model finishes successfully, as it did in 1.10.0.

Stack Trace:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[12], line 3
      1 # STEP 5: Import model
      2 pzmm.ScoreCode.score_code = ''
----> 3 lreg = pzmm.ImportModel.import_model(model_files=model_path, model_prefix=model_prefix, project=project,
      4                                      input_data=X, predict_method=[model.predict_proba, [float, float]],
      5                                      score_metrics=score_metrics, overwrite_model=True,
      6                                      target_values=['0', '1'], model_file_name=model_prefix + ".pickle")

File ~\AppData\Local\miniconda3\envs\ml\Lib\site-packages\sasctl\pzmm\import_model.py:351, in ImportModel.import_model(cls, model_files, model_prefix, project, input_data, predict_method, score_metrics, pickle_type, project_version, missing_values, overwrite_model, score_cas, mlflow_details, predict_threshold, target_values, overwrite_project_properties, target_index, **kwargs)
    348 # For SAS Viya 4, the score code can be written beforehand and imported with
    349 # all the model files
    350 elif current_session().version_info() == 4:
--> 351     score_code_dict = sc.write_score_code(
    352         model_prefix,
    353         input_data,
    354         predict_method,
    355         score_metrics=score_metrics,
    356         pickle_type=pickle_type,
    357         predict_threshold=predict_threshold,
    358         score_code_path=None if isinstance(model_files, dict) else model_files,
    359         target_values=target_values,
    360         missing_values=missing_values,
    361         score_cas=score_cas,
    362         target_index=target_index,
    363         **kwargs,
    364     )
    365     if score_code_dict:
    366         model_files.update(score_code_dict)

File ~\AppData\Local\miniconda3\envs\ml\Lib\site-packages\sasctl\pzmm\write_score_code.py:280, in ScoreCode.write_score_code(cls, model_prefix, input_data, predict_method, target_variable, target_values, score_metrics, predict_threshold, model, pickle_type, missing_values, score_cas, score_code_path, target_index, **kwargs)
    266         cls.score_code += (
    267             f"\n{'':4}# Check for numpy values and convert to a CAS readable "
    268             f"representation\n"
    269             f"{'':4}if isinstance(prediction, np.ndarray):\n"
    270             f"{'':8}prediction = prediction.tolist()\n\n"
    271         )
    272         """
    273 
    274 # Check for numpy values and conver to a CAS readable representation
   (...)
    278     
    279         """
--> 280         cls._predictions_to_metrics(
    281             score_metrics,
    282             predict_method[1],
    283             target_values=target_values,
    284             predict_threshold=predict_threshold,
    285             target_index=target_index,
    286         )
    288     if missing_values:
    289         cls._impute_missing_values(input_data, missing_values)

File ~\AppData\Local\miniconda3\envs\ml\Lib\site-packages\sasctl\pzmm\write_score_code.py:1138, in ScoreCode._predictions_to_metrics(cls, metrics, predict_returns, target_values, predict_threshold, h2o_model, target_index)
   1136 # Binary classification model
   1137 elif len(target_values) == 2:
-> 1138     cls._binary_target(
   1139         metrics,
   1140         target_values,
   1141         predict_returns,
   1142         predict_threshold,
   1143         target_index,
   1144         h2o_model,
   1145     )
   1146 # Multiclass classification model
   1147 elif len(target_values) > 2:

File ~\AppData\Local\miniconda3\envs\ml\Lib\site-packages\sasctl\pzmm\write_score_code.py:1507, in ScoreCode._binary_target(cls, metrics, target_values, returns, threshold, target_index, h2o_model)
   1498         elif sum(returns) == 0 and len(returns) == 2:
   1499             warn(
   1500                 "Due to the ambiguity of the provided metrics and prediction return"
   1501                 " types, the score code assumes that a classification and the "
   1502                 "target event probability should be returned."
   1503             )
   1504             cls.score_code += (
   1505                 f"{'':4}if input_array.shape[0] == 1:\n"
   1506                 f"{'':8}if prediction[0][{target_index}] > {threshold}:\n"
-> 1507                 f"{'':12}{metrics[0]} = \"{target_values[target_index]}\"\n"
   1508                 f"{'':8}else:\n"
   1509                 f"{'':12}{metrics[0]} = \"{target_values[abs(target_index-1)]}\"\n"
   1510                 f"{'':8}return {metrics[0]}, prediction[0][{target_index}]\n"
   1511                 f"{'':4}else:\n"
   1512                 f"{'':8}df = pd.DataFrame(prediction)\n"
   1513                 f"{'':8}proba = df[{target_index}]\n"
   1514                 f"{'':8}classifications = np.where(df[{target_index}] > {threshold}, '{target_values[target_index]}', '{target_values[abs(target_index-1)]}')\n"
   1515                 f"{'':8}return pd.DataFrame({{'{metrics[0]}': classifications, '{metrics[1]}': proba}})"
   1516             )
   1517             """
   1518 if input_array.shape[0] == 1:
   1519     if prediction[0][1] > .5:
   (...)
   1528     return pd.DataFrame({'Classification': classifications, 'Probability': proba})
   1529                             """
   1530         # TODO: Potentially add threshold
   1531         # Return classification and probability value

TypeError: list indices must be integers or slices, not NoneType

Version 1.10.1

SilvestriStefano commented 9 months ago

Hello, I think I have found the issue. As the stack trace says, target_index reaches the methods _predictions_to_metrics and _binary_target as None, because its default value in the method write_score_code is None (https://github.com/sassoftware/python-sasctl/blob/7c7d9f1683c53bb69754121742706bd1a14d1a8f/src/sasctl/pzmm/write_score_code.py#L38) and not 1, as stated in the documentation (https://github.com/sassoftware/python-sasctl/blob/7c7d9f1683c53bb69754121742706bd1a14d1a8f/src/sasctl/pzmm/write_score_code.py#L137).

The other two methods (_predictions_to_metrics and _binary_target) do have a default value of 1.
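To illustrate why the callee defaults of 1 would not help here, below is a minimal sketch of the pattern. The functions are hypothetical stand-ins that only mimic the call chain in the traceback; they are not the sasctl implementation, and the guarded variant is just one possible way to handle None, not necessarily what the library does.

```python
# Hypothetical stand-ins for the call chain in the traceback; these are not
# the sasctl functions themselves.

def binary_target(target_values, target_index=1):
    # The default of 1 applies only when the argument is omitted entirely.
    return target_values[target_index]


def write_score_code(target_values, target_index=None):
    # Forwarding the parameter passes None explicitly, so the callee's
    # default of 1 is never used.
    return binary_target(target_values, target_index=target_index)


def write_score_code_guarded(target_values, target_index=None):
    # One possible guard (an assumption, for illustration): fall back to the
    # documented default when None arrives.
    if target_index is None:
        target_index = 1
    return binary_target(target_values, target_index=target_index)


error_message = None
try:
    write_score_code(["0", "1"])
except TypeError as err:
    # Indexing a list with None raises the same kind of TypeError as in the
    # stack trace above.
    error_message = str(err)

guarded_result = write_score_code_guarded(["0", "1"])
```

If this reading is right, passing target_index=1 explicitly to import_model (the parameter appears in its signature in the traceback) might avoid the error, though I have not verified that.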

SilvestriStefano commented 9 months ago

Never mind... the very first two lines of _predictions_to_metrics are: https://github.com/sassoftware/python-sasctl/blob/1018767a5d805e32a38abe1ba3eea2d0e5445904/src/sasctl/pzmm/write_score_code.py#L1156 https://github.com/sassoftware/python-sasctl/blob/1018767a5d805e32a38abe1ba3eea2d0e5445904/src/sasctl/pzmm/write_score_code.py#L1157

sassoftware/python-sasctl, issue #184