Maelic / SGG-Benchmark

A New Benchmark for Scene Graph Generation, targeting real-world applications
MIT License

Tests with different pre-trained Detection Networks #15

Open claudio-dg opened 3 months ago

claudio-dg commented 3 months ago

Hi @Maelic, thanks for your help with my previous issue. It works perfectly now! As a further step, I would like to test the SGG model with a custom pre-trained detection network, so I have two main questions:

1) Is it possible to re-use the pre-trained PE-NET model with a very different object detector? I am trying to test a YOLOv8 model that was trained on completely different classes (27 classes instead of the 150 classes of VG150). The goal is to keep the relation part but apply it to different objects. Can this be done, or do I need to retrain the Relation Network (SGG model) from scratch?

2) Can I use the pre-trained networks you provided on a smaller subset of the 150 VG classes (e.g., only the first 70 classes)? Is that feasible, or is the Relation model constrained to work with all 150 classes?

I hope my questions are clear enough. As always, thank you in advance for your assistance and availability!

Maelic commented 3 months ago

Hi @claudio-dg,

These are good questions, but unfortunately they go a bit beyond the goal of this codebase. What you are trying to do is transfer learning (using a model trained on some data and transferring all or part of its knowledge to new data), which is a complex problem with no straightforward approach. In theory, it should be possible to load a model pre-trained on VG150 and fine-tune it on another dataset, but in practice it is a bit more complex. For instance, the PE-NET model takes as input word embeddings of the object classes, and these are encoded in the model (see this line). So if you want to change the classes and do some transfer learning, you will have to load only part of the weights, which is not supported by my codebase right now. To do this you will also need to separate the object predictor and the relation predictor, which should be possible by saving two different weights files and then loading them one after the other (which requires changing the checkpoint loader as well).
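
As an illustration of the partial loading described above, here is a minimal sketch. It is not taken from this codebase: the parameter names (`detector.`, `relation_head.`), the shapes, and both helper functions are hypothetical, and the tensors are replaced by toy objects that only carry a `.shape`. The idea is to split a checkpoint by name prefix, then keep only the entries whose shape still matches the new model, so class-dependent parts such as the object embeddings are skipped.

```python
from types import SimpleNamespace


def split_by_prefix(state_dict, prefix):
    """Split a checkpoint mapping into (matching, rest) by parameter-name prefix."""
    matching = {k: v for k, v in state_dict.items() if k.startswith(prefix)}
    rest = {k: v for k, v in state_dict.items() if not k.startswith(prefix)}
    return matching, rest


def keep_shape_compatible(pretrained, target):
    """Keep only pretrained entries whose name and shape both match the target."""
    kept, skipped = {}, []
    for name, value in pretrained.items():
        if name in target and tuple(value.shape) == tuple(target[name].shape):
            kept[name] = value
        else:
            skipped.append(name)
    return kept, skipped


def fake(*shape):
    """Toy stand-in for a tensor: only the .shape attribute matters here."""
    return SimpleNamespace(shape=shape)


# Hypothetical checkpoint trained with 150 object classes (shapes are made up).
pretrained = {
    "detector.backbone.weight": fake(64, 3),
    "relation_head.obj_embed.weight": fake(150, 200),
    "relation_head.rel_classifier.weight": fake(50, 512),
}
# Hypothetical new model with 27 object classes and the same relation classes.
target = {
    "relation_head.obj_embed.weight": fake(27, 200),
    "relation_head.rel_classifier.weight": fake(50, 512),
}

relation_part, detector_part = split_by_prefix(pretrained, "relation_head.")
kept, skipped = keep_shape_compatible(relation_part, target)
print("kept:", list(kept))
print("skipped:", skipped)
```

With real PyTorch tensors, the `kept` dictionary could then be passed to `model.load_state_dict(kept, strict=False)`, which tolerates missing keys. The skipped, class-dependent layers would stay randomly initialised and would still need fine-tuning on the new dataset.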

To conclude, I would say that yes, it is possible, but it requires changing a lot of stuff in the codebase.

Best

Maelic commented 3 months ago

Regarding your second question, it is not possible to constrain the network to a subset of classes after training. The only thing you can do is filter the output to keep the classes that you are interested in (for that you can have a look at my custom post-processing method). You can also filter out classes in the original annotation files (the dict.json and .h5 files) and retrain the model from scratch.
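
A minimal sketch of the output filtering idea, for illustration only. This is not the custom post-processing method mentioned above: the triplet format (dicts with `subject`, `predicate`, `object` and `score` keys) and the function name are assumptions. It simply drops every predicted relation whose subject or object class is outside the subset of interest.

```python
def filter_triplets(triplets, keep_classes):
    """Keep only triplets whose subject and object classes are both in keep_classes."""
    keep = set(keep_classes)
    return [
        t for t in triplets
        if t["subject"] in keep and t["object"] in keep
    ]


# Hypothetical predictions; the class names and scores are made up.
predictions = [
    {"subject": "man", "predicate": "riding", "object": "horse", "score": 0.91},
    {"subject": "man", "predicate": "wearing", "object": "hat", "score": 0.84},
    {"subject": "tree", "predicate": "behind", "object": "horse", "score": 0.40},
]

filtered = filter_triplets(predictions, keep_classes=["man", "horse", "hat"])
for t in filtered:
    print(t["subject"], t["predicate"], t["object"])
```

Note that filtering this way does not change what the model predicts. The network still scores all 150 classes internally, and objects outside the subset are simply discarded afterwards.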