tensorflow / lattice

Lattice methods in TensorFlow
Apache License 2.0

Error in setting num_keypoints to 0 #25

Closed by arrowx123 6 years ago

arrowx123 commented 6 years ago

I am experimenting with the original lattice model (calibrated_lattice_classifier) included in uci_census.py. The program crashes when num_keypoints is set to 0, with the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-55-3f6eefdaee94> in <module>()
     48                 print("start_time: " + str(start_time))
     49 
---> 50                 train_evaluation, test_evaluation = main(estimator)
     51 
     52                 elapsed_time = time.time() - start_time

<ipython-input-29-390452f3449e> in main(estimator)
     26 def main(estimator):
     27     if FLAGS.run == "train":
---> 28         train_evaluation, test_evaluation = train(estimator)
     29 
     30     elif FLAGS.run == "evaluate":

<ipython-input-28-550555971d85> in train(estimator)
     40             epochs_trained += epochs
     41             estimator.train(input_fn=get_train_input_fn(
---> 42                 batch_size=FLAGS.batch_size, num_epochs=epochs, shuffle=True
     43             ))
     44             print("Trained for {} epochs, total so far {}:".format(

/usr/local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.pyc in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
    312 
    313     saving_listeners = _check_listeners_type(saving_listeners)
--> 314     loss = self._train_model(input_fn, hooks, saving_listeners)
    315     logging.info('Loss for final step: %s.', loss)
    316     return self

/usr/local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.pyc in _train_model(self, input_fn, hooks, saving_listeners)
    741       worker_hooks.extend(input_hooks)
    742       estimator_spec = self._call_model_fn(
--> 743           features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
    744       # Check if the user created a loss summary, and add one if they didn't.
    745       # We assume here that the summary is called 'loss'. If it is not, we will

/usr/local/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.pyc in _call_model_fn(self, features, labels, mode, config)
    723     if 'config' in model_fn_args:
    724       kwargs['config'] = config
--> 725     model_fn_results = self._model_fn(features=features, **kwargs)
    726 
    727     if not isinstance(model_fn_results, model_fn_lib.EstimatorSpec):

/usr/local/lib/python2.7/site-packages/tensorflow_lattice/python/estimators/calibrated.pyc in model_fn(features, labels, mode, config)
    549                    keypoints_initializers=kp_init_explicit,
    550                    name=_SCOPE_INPUT_CALIBRATION,
--> 551                    dtype=self._dtype))
    552           (total_prediction, prediction_projections,
    553            prediction_regularization) = self.prediction_builder(

/usr/local/lib/python2.7/site-packages/tensorflow_lattice/python/estimators/calibrated.pyc in input_calibration_layer_from_hparams(columns_to_tensors, feature_columns, hparams, quantiles_dir, keypoints_initializers, name, dtype)
    289         l2_reg=calibration_l2_regs,
    290         l1_laplacian_reg=calibration_l1_laplacian_regs,
--> 291         l2_laplacian_reg=calibration_l2_laplacian_regs)
    292 
    293 

/usr/local/lib/python2.7/site-packages/tensorflow_lattice/python/lib/pwl_calibration_layers.pyc in input_calibration_layer(columns_to_tensors, num_keypoints, feature_columns, keypoints_initializers, keypoints_initializer_fns, bound, monotonic, missing_input_values, missing_output_values, l1_reg, l2_reg, l1_laplacian_reg, l2_laplacian_reg, dtype)
    409     monotonic = tools.cast_to_dict(monotonic, feature_names, 'monotonic')
    410 #    import ipdb; ipdb.set_trace()
--> 411 #    keypoints_initializers = tools.cast_to_dict(
    412 #        keypoints_initializers, feature_names, 'keypoints_initializers')
    413     keypoints_initializers = {}

/usr/local/lib/python2.7/site-packages/tensorflow_lattice/python/lib/tools.pyc in cast_to_dict(v, feature_names, param_name)
     75           raise ValueError(
     76               'Dict given for %s does not contain definition for feature '
---> 77               '"%s"' % (param_name, feature_name))
     78     return v_copy
     79   return {feature_name: v for feature_name in feature_names}

ValueError: Dict given for keypoints_initializers does not contain definition for feature "age"

The problem seems to be related to the line keypoints_initializers = tools.cast_to_dict(keypoints_initializers, feature_names, 'keypoints_initializers') in pwl_calibration_layers.py. When num_keypoints is 0, the keypoints_initializers dict is empty, so cast_to_dict finds no entry for the first feature ("age") and raises the ValueError shown above. After I manually comment out that line and set keypoints_initializers to an empty dict, the program runs successfully. Any help is much appreciated! :heart:
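
The failure mode described here can be reproduced in isolation. The function below is a simplified stand-in modeled only on the fragment of tools.py visible in the traceback; it is not the library's actual source, and the feature list is a hypothetical subset of the census columns.

```python
def cast_to_dict(v, feature_names, param_name):
    """Simplified stand-in for tools.cast_to_dict (assumption, not library code).

    A dict must define every feature; any other value is broadcast to all
    features.
    """
    if isinstance(v, dict):
        v_copy = dict(v)
        for feature_name in feature_names:
            if feature_name not in v_copy:
                raise ValueError(
                    'Dict given for %s does not contain definition for feature '
                    '"%s"' % (param_name, feature_name))
        return v_copy
    return {feature_name: v for feature_name in feature_names}


# Hypothetical subset of the feature names used in uci_census.py.
feature_names = ["age", "education_num"]

# A non-dict value (here None) is copied to every feature without error.
broadcast = cast_to_dict(None, feature_names, "keypoints_initializers")

# An empty dict fails on the first feature that is looked up.
try:
    cast_to_dict({}, feature_names, "keypoints_initializers")
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

Under these assumptions an empty dict is rejected while a non-dict value passes, which is consistent with the workaround only hiding the symptom rather than making 0 keypoints a supported setting.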

mmilanifard commented 6 years ago

The number of keypoints should be >= 2 when using canned lattice estimators. See: https://github.com/tensorflow/lattice/blob/master/tensorflow_lattice/python/lib/keypoints_initialization.py#L342 Note that with 2 keypoints the calibrator does not do any calibration beyond scaling and shifting.
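
To illustrate why 2 keypoints amount to only scaling and shifting, here is a minimal pure-Python sketch of a piecewise-linear calibrator. The function pwl_calibrate, the keypoint values, and the clamping outside the keypoint range are assumptions made for illustration; this is not the tensorflow_lattice implementation.

```python
def pwl_calibrate(x, kp_inputs, kp_outputs):
    """Piecewise-linear interpolation through the keypoints, clamped at the ends."""
    if x <= kp_inputs[0]:
        return kp_outputs[0]
    if x >= kp_inputs[-1]:
        return kp_outputs[-1]
    for i in range(len(kp_inputs) - 1):
        x0, x1 = kp_inputs[i], kp_inputs[i + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return kp_outputs[i] + t * (kp_outputs[i + 1] - kp_outputs[i])


# Hypothetical inputs for an "age"-like feature ranging over [17, 90].
xs = [17.0, 30.0, 53.5, 90.0]

# Two keypoints: the calibrator is a single line segment.
two_kp = [pwl_calibrate(x, [17.0, 90.0], [0.0, 1.0]) for x in xs]

# The same mapping written as a plain shift and scale.
affine = [(x - 17.0) / (90.0 - 17.0) for x in xs]

# Three keypoints: the middle keypoint lets the curve bend.
three_kp_at_40 = pwl_calibrate(40.0, [17.0, 40.0, 90.0], [0.0, 0.8, 1.0])
affine_at_40 = (40.0 - 17.0) / (90.0 - 17.0)

print(two_kp, affine)
print(three_kp_at_40, affine_at_40)
```

Inside the keypoint range the two-keypoint outputs coincide with the affine map, whereas a third keypoint produces a value the affine map cannot.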

arrowx123 commented 6 years ago

Thank you very much, Mahdi! So the minimum value of num_keypoints is 2, which also matches the notation used in the paper (C_d in Section 9.3). Maybe it would be a good idea to update the documentation (for example https://github.com/tensorflow/lattice/blob/master/g3doc/api_docs/python/tensorflow_lattice/CalibratedHParams.md), because it says num_keypoints can be 0 or None?
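
In the meantime, a caller could fail fast before building the estimator. The sketch below uses a hypothetical helper, check_num_keypoints, which is not part of tensorflow_lattice. Rejecting None as well as values below 2 is an assumption based on the ">= 2" rule stated for canned estimators, not on documented library behavior.

```python
def check_num_keypoints(num_keypoints, min_keypoints=2):
    """Hypothetical guard: reject settings below the stated minimum of 2."""
    if num_keypoints is None or num_keypoints < min_keypoints:
        raise ValueError(
            "num_keypoints must be >= %d for canned lattice estimators, got %r"
            % (min_keypoints, num_keypoints))
    return num_keypoints


# Try a few candidate settings and record which ones are rejected.
results = {}
for candidate in (None, 0, 1, 2, 10):
    try:
        check_num_keypoints(candidate)
        results[candidate] = "ok"
    except ValueError:
        results[candidate] = "rejected"

print(results)
```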

mmilanifard commented 6 years ago

Thanks. Will do.