
Multi-unit calibrator vs separation of calibrators gives different results #51

Closed: vkaul11 closed this issue 3 years ago

vkaul11 commented 4 years ago

We tried two ways of using calibrators that should be theoretically equivalent, but they give us different results. Method 1 (the multi-unit calibrator) gives better results than Method 2, even though we use the same parameters when we separate the calibrators and combine them. Is there some issue with separating the calibrators?

Method 1

    feature_input = [
        tf.compat.v1.layers.flatten(group_features[name])
        for name in ['1', '2']
    ]
    feature_layer = tf.concat(feature_input, 1)

    with tf.compat.v1.variable_scope('lattice1_scope'):
      # A single calibration layer with units=2: each input column gets its
      # own calibration curve, all sharing the same keypoints and bounds.
      feature_calib_layer = tfl.layers.PWLCalibration(
          input_keypoints=np.linspace(
              input_min, input_max, num=num_keypoints, dtype=np.float32),
          units=len(feature_input),
          clamp_min=True,
          clamp_max=True,
          output_min=output_min,
          output_max=output_max,
          monotonicity='increasing',
          name='feature_calib'
      )(feature_layer)

      feature_lattice = tfl.layers.Lattice(
          lattice_sizes=[2] * len(feature_input),
          monotonicities=['increasing'] * len(feature_input),
          output_min=0.0,
          output_max=1.0,
          name='feature_lattice'
      )(feature_calib_layer)

Method 2

    feature1_input = [
        tf.compat.v1.layers.flatten(group_features[name])
        for name in ['1']
    ]
    feature1_layer = tf.concat(feature1_input, 1)
    feature2_input = [
        tf.compat.v1.layers.flatten(group_features[name])
        for name in ['2']
    ]
    feature2_layer = tf.concat(feature2_input, 1)

    with tf.compat.v1.variable_scope('lattice1_scope'):
      feature_lattice_input = []
      # One single-unit calibrator per feature, using the same keypoints,
      # bounds, and monotonicity as the multi-unit calibrator in Method 1.
      feature1_calib_layer = tfl.layers.PWLCalibration(
          input_keypoints=np.linspace(
              input_min, input_max, num=num_keypoints, dtype=np.float32),
          units=len(feature1_input),
          clamp_min=True,
          clamp_max=True,
          output_min=output_min,
          output_max=output_max,
          monotonicity='increasing',
          name='feature1_calib'
      )(feature1_layer)
      feature2_calib_layer = tfl.layers.PWLCalibration(
          input_keypoints=np.linspace(
              input_min, input_max, num=num_keypoints, dtype=np.float32),
          units=len(feature2_input),
          clamp_min=True,
          clamp_max=True,
          output_min=output_min,
          output_max=output_max,
          monotonicity='increasing',
          name='feature2_calib'
      )(feature2_layer)

      # Concatenate the calibrated outputs and feed them to the lattice.
      feature_lattice_input.append(feature1_calib_layer)
      feature_lattice_input.append(feature2_calib_layer)
      feature_lattice = tfl.layers.Lattice(
          lattice_sizes=[2] * (len(feature1_input) + len(feature2_input)),
          monotonicities=['increasing'] * (len(feature1_input) + len(feature2_input)),
          output_min=0.0,
          output_max=1.0,
          name='feature_lattice'
      )(keras.layers.concatenate(feature_lattice_input, axis=1))

mmilanifard commented 4 years ago

Can you please share a complete end-to-end code example that reproduces the issue (e.g., full training with dummy or example data)? We actually have many tests that check that using multiple single-unit calibration layers and a single multi-unit calibration layer result in the same loss. See:

https://github.com/tensorflow/lattice/blob/master/tensorflow_lattice/python/pwl_calibration_test.py#L194

https://github.com/tensorflow/lattice/blob/master/tensorflow_lattice/python/pwl_calibration_test.py#L530
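
For reference, here is a minimal sketch of the equivalence those tests exercise (not from this thread; the data and names are made up). With identical keypoints, output bounds, and TFL's default deterministic kernel initializer, one units=2 PWLCalibration layer should match two units=1 layers applied column by column:

    import numpy as np
    import tensorflow as tf
    import tensorflow_lattice as tfl

    keypoints = np.linspace(0.0, 1.0, num=10, dtype=np.float32)
    x = np.random.uniform(size=(5, 2)).astype(np.float32)

    # One calibrator handling both columns at once.
    multi = tfl.layers.PWLCalibration(
        input_keypoints=keypoints, units=2,
        output_min=0.0, output_max=1.0, monotonicity='increasing')

    # Two separate single-unit calibrators, one per column.
    singles = [
        tfl.layers.PWLCalibration(
            input_keypoints=keypoints, units=1,
            output_min=0.0, output_max=1.0, monotonicity='increasing')
        for _ in range(2)
    ]

    out_multi = multi(x)  # shape (5, 2)
    out_single = tf.concat(
        [singles[i](x[:, i:i + 1]) for i in range(2)], axis=1)

    # Expected to print True: both setups start from the same deterministic
    # initialization, so their outputs should agree before any training.
    print(np.allclose(out_multi.numpy(), out_single.numpy()))

If the two wirings agree at initialization but diverge after training, the difference is more likely in what actually feeds the lattice or in the training setup than in the calibrators themselves.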