tensorflow / neural-structured-learning

Training neural models with structured signals.
https://www.tensorflow.org/neural_structured_learning
Apache License 2.0
980 stars 190 forks

deviations from tf/keras conventions #95

Closed chrisrapson closed 2 years ago

chrisrapson commented 2 years ago

This looks like a fantastic resource, but I am struggling to get it to work. My application uses complex custom pre-processing functions, implemented as .map() calls on tf.data.Dataset objects. There are not yet many examples for this library, and the ones I found use relatively simplistic datasets. I've struggled to adapt the examples to my dataset and have hit many errors trying to format it appropriately. I think I need to finish pre-processing as a tf.data.Dataset, then convert each batch to a dict before feeding it to the model.
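Concretely, the conversion I have in mind looks roughly like this (a minimal sketch with placeholder feature names; plain Python stands in for tf.data here, and with tf.data the same function would be passed to Dataset.map() after my preprocessing):

```python
# Sketch: repackaging (features, label) batches into the dictionary format
# that nsl.keras.AdversarialRegularization expects. 'image' and 'label'
# are placeholder key names for my actual features.
def to_dict_batch(image_batch, label_batch):
    return {'image': image_batch, 'label': label_batch}

# With tf.data this would be: dataset = dataset.map(to_dict_batch)
batches = [([0.1, 0.2], [1, 0]), ([0.3, 0.4], [0, 1])]
dict_batches = [to_dict_batch(x, y) for x, y in batches]
```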

From a user's point of view, I think it would be ideal if adversarial training could be activated by just a boolean option to the compile() function (with some customisable hyperparameters). At the moment, it seems that there are a few places where the nsl functions are not compatible with the tf/keras functions. Could you please explain the reason for these design choices, and whether you intend to change them in the future?


I have also hit other errors, which suggest that adversarial objects do not yet support all of the functionality of the Keras library. I've been able to work around them, but they might help you identify functionality that still needs to be implemented.

  1. Passing an optimizer instance (keras.optimizers.Adam with a learning rate) gave an error, but simply using the string 'adam' works fine.

    adv_model.compile(optimizer='adam',
                      # optimizer=keras.optimizers.Adam(0.0001),
                      loss=keras.losses.sparse_categorical_crossentropy,
                      metrics=['accuracy'])
  2. Trying to use class_weight when training the model gave an error. Commenting it out works fine.

    adv_model.fit(adv_train_dataset,
                  epochs=n_epochs,
                  # class_weight=class_weights,
                  validation_data=adv_valid_dataset,
                  validation_freq=1)
csferng commented 2 years ago

Hi @chrisrapson, thanks for your feedback.

why the inputs have to be dicts, rather than directly using tf.data.Dataset objects?

Regarding the input format, nsl.keras.AdversarialRegularization can handle tf.data.Dataset input, as long as each batch in the dataset is a dictionary of tensors. For example, in this tutorial the train_set_for_adv_model and test_set_for_adv_model are tf.data.Dataset objects.

The main reason for using the dictionary format is to enable accessing input labels in the Model.call() method, in order to compute supervised losses for adversarial attacks. (Input labels are generally not available to the call() method during model.fit(x=images, y=labels), so we need both images and labels in the argument x.) The dictionary format also supports use cases where the input contains multiple features (e.g. image plus metadata, multiple text fields, or structured data).
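To illustrate the point in stripped-down form (plain Python rather than the actual library code; regularized_call, forward, and loss_fn are illustrative stand-ins for the wrapper, base model, and loss):

```python
# Why labels travel inside x: a call()-style method only receives x, so
# packing the labels into the dict lets the wrapper compute the supervised
# loss it needs to construct the adversarial perturbation.
def regularized_call(batch, forward, loss_fn):
    labels = batch['label']                          # reachable: it is in x
    features = {k: v for k, v in batch.items() if k != 'label'}
    preds = forward(features)
    return preds, loss_fn(labels, preds)             # loss drives the attack

preds, loss = regularized_call(
    {'image': [1.0, 2.0], 'label': [1.0, 2.0]},
    forward=lambda f: f['image'],                    # toy identity "model"
    loss_fn=lambda y, p: sum((a - b) ** 2 for a, b in zip(y, p)))
```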

whether you intend to change them in the future?

TensorFlow 2.2 introduced Model.train_step(), which might be a better place for the regularization logic because it naturally has access to input labels. If we go that route, we can lift the constraint that each batch be a dictionary of tensors. However, the NSL library currently supports TensorFlow versions back to 1.15, so we cannot make the switch now. If you are interested, PRs are always welcome!

why you require the labels to be specified, instead of inferring them from the structure of the model?

Since the input batch is expected to be a dictionary of tensors mixing features and labels, users have to specify which keys in the dictionary are labels. Note that a Keras model may have multiple outputs (example), in which case a mapping between outputs and loss functions has to be provided by the user.
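As a sketch of why the keys must be explicit (plain Python; split_batch is illustrative, not the library's actual function):

```python
# With features and labels mixed in one dict, only the user knows which
# keys are labels -- especially for multi-output models with several
# label tensors, one per output.
def split_batch(batch, label_keys):
    labels = {k: batch[k] for k in label_keys}
    features = {k: v for k, v in batch.items() if k not in label_keys}
    return features, labels

features, labels = split_batch(
    {'image': [0.5], 'class': [1], 'bbox': [[0, 0, 1, 1]]},
    label_keys=['class', 'bbox'])   # two outputs, so two label keys
```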

Providing an argument to the Adam optimizer gave an error; trying to use class_weight when training the model gave an error.

I will look into those errors.

csferng commented 2 years ago

Providing an argument to the Adam optimizer gave an error,

I cannot reproduce this using TensorFlow 2.3 or 2.5. As shown in this colab, the adversarially regularized model can be trained with an Adam(2e-3) optimizer.

trying to use class_weight when training the model gave an error.

Right, AdversarialRegularization doesn't support class_weight for now. The class_weight arg is handled in Keras internals before calling into the AdversarialRegularization subclass. But at that point Keras doesn't know what the input labels look like, due to the label_key setting in AdversarialRegularization. We might be able to add class_weight support after migrating to the Model.train_step() approach.
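In the meantime, one possible workaround is to fold the class weights into per-example sample weights inside the data pipeline, which is essentially what Keras' class_weight argument does internally. A minimal sketch (plain Python; with tf.data this lookup would be applied in Dataset.map(), and how the resulting weight is consumed depends on your loss setup):

```python
def class_weight_to_sample_weight(labels, class_weight):
    # Emulate Keras' class_weight handling: look up each example's weight
    # by its integer class label.
    return [class_weight[y] for y in labels]

weights = class_weight_to_sample_weight([0, 1, 1, 0], {0: 1.0, 1: 2.5})
```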

csferng commented 2 years ago

Closing this issue for now. Feel free to reopen if you have any follow-up questions.