Obtaining SVO triples as part of preprocessing

I've noticed that the preprocessed training/validation/test labels passed into the model include SVO (subject-verb-object) triplets extracted for the MSR-VTT and MSVD datasets. The paper mentions that these were extracted using NLTK.

Would it be possible to share the preprocessing code that uses NLTK to generate SVO triplets from a sentence? This would be very useful for training SAAT on other datasets.
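To make the request concrete, here is a minimal sketch of the kind of extraction I have in mind. It is my own guess at a POS-tag heuristic, not the method used in the paper: the function names (`extract_svo`, `svo_from_sentence`) and the rules (head noun before the verb group as subject, last verb of the group as verb, head noun after it as object) are assumptions for illustration. The core function works on already tagged tokens using Penn Treebank tags, and the NLTK wrapper assumes the tokenizer and tagger data have been downloaded.

```python
from typing import List, Optional, Tuple

Tagged = List[Tuple[str, str]]  # (token, Penn Treebank POS tag) pairs


def _head_of_first_noun_run(tagged: Tagged) -> Optional[str]:
    """Return the last noun of the first contiguous run of NN* tokens."""
    head = None
    for token, tag in tagged:
        if tag.startswith("NN"):
            head = token.lower()
        elif head is not None:
            break  # the first noun run has ended
    return head


def extract_svo(tagged: Tagged) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Hypothetical heuristic: split the sentence around its first verb group."""
    n = len(tagged)
    # Index of the first verb token (VB, VBZ, VBG, ...), or n if there is none.
    start = next((i for i, (_, tag) in enumerate(tagged) if tag.startswith("VB")), n)
    # Extend over the contiguous verb group, e.g. "is playing".
    end = start
    while end < n and tagged[end][1].startswith("VB"):
        end += 1
    subject = _head_of_first_noun_run(tagged[:start])
    verb = tagged[end - 1][0].lower() if end > start else None
    obj = _head_of_first_noun_run(tagged[end:])
    return subject, verb, obj


def svo_from_sentence(sentence: str):
    """Tokenize and tag with NLTK, then apply the heuristic above."""
    import nltk  # imported lazily; needs NLTK's tokenizer and tagger data

    return extract_svo(nltk.pos_tag(nltk.word_tokenize(sentence)))


if __name__ == "__main__":
    caption = [("a", "DT"), ("man", "NN"), ("is", "VBZ"),
               ("playing", "VBG"), ("a", "DT"), ("guitar", "NN")]
    print(extract_svo(caption))  # ('man', 'playing', 'guitar')
```

Captions without an object would yield `None` in the third slot with this sketch, and I don't know how the released labels handle that case or whether the verbs are lemmatized, which is part of why the original script would help.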
