pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Market 1501 for person re-id #3863

Open steveazzolin opened 2 years ago

steveazzolin commented 2 years ago

🚀 The feature, motivation and pitch

Hi,

I was thinking of contributing to PyG by adding the Market 1501 dataset for person re-id, which to the best of my knowledge would be the first person re-id dataset in PyG. I worked on it for a university project, but without using GNNs. However, I have some doubts about it:

One way to deal with that, as proposed by @rusty1s, could be to develop a specific dataset named after a single paper. However, this may require implementing the paper from scratch, since not everybody releases their code :( That would definitely be interesting, but it may not be feasible, since I lack the computational resources to reproduce medium-to-large models.

Alternatives

An alternative solution could be to come up with a dataset structured as we decide, without following a specific paper. For example, we may structure the graph in a way similar to the first linked paper, but without all the steps presented in it. The code can still be modular, for example by letting the user specify the network used to extract the initial node features while keeping the structure of the graph fixed.

I'm happy to discuss whether this can be a helpful contribution or not. Thanks

Additional context

No response

rusty1s commented 2 years ago

Thanks for the issue. The differences in dataset creation among papers make this indeed a little bit more challenging than I expected :)

I like your alternative solution, which probably makes the most sense here. It also might not be totally necessary to replicate the results of the individual papers, but rather to give people more of a sense of how person re-id works with GNNs. Please let me know which parts of dataset creation you wanna see customizable and how we obtain the final graph from each image.

steveazzolin commented 2 years ago

I was thinking of using an approach similar to the one presented in the first linked paper, i.e., treating person re-id as a sort of node classification problem. In more detail:

image

In our formulation, G is fully-connected and E represents the set of relationships between different probe-gallery pairs, where W_ij is a scalar edge weight. It represents the relation importance between node i and node j, where g_i and g_j are the i-th and j-th gallery images. S() is a pairwise similarity estimation function that estimates the similarity score between g_i and g_j. The purpose of setting W_ii to 0 is to avoid self-enhancing.
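A minimal sketch of the weighting described in the quote, in plain Python so it runs without PyG. The exact formula was in the image, which is not reproduced here, so the row-wise softmax normalization is an assumption. `dot_similarity` is a hypothetical stand-in for S(), and `edge_weights` is a name made up for this example.

```python
import math


def dot_similarity(a, b):
    """Hypothetical stand-in for S(): dot product of two feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def edge_weights(features, similarity=dot_similarity):
    """Return a dense n x n weight matrix for a fully-connected graph.

    The diagonal stays 0 (no self-enhancing). Each row's off-diagonal
    entries are a softmax over similarity scores, which is an assumed
    normalization, not necessarily the one used in the paper.
    """
    n = len(features)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        scores = {
            j: similarity(features[i], features[j])
            for j in range(n)
            if j != i
        }
        if not scores:
            continue  # single-node graph: nothing to connect
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        for j, e in exps.items():
            weights[i][j] = e / total
    return weights


# Toy gallery features: the first two images are similar, the third is not.
gallery = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
W = edge_weights(gallery)
```

With these toy features, the weight from the first image to the second is larger than the weight from the first to the third, and every diagonal entry is zero.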

For the training graphs, we could again create one graph per training image. Its nodes would be the pairs formed with all images of the same identity (or maybe a subset, so that identities with many images are not oversampled), plus some distractors. The distractors are probably best taken from hard examples, e.g., the images of different identities that are most similar to the query image.
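The selection step could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name, its parameters (`max_positives`, `num_distractors`, `seed`), and the dictionary-based inputs are all hypothetical, and the similarity scores are assumed to be precomputed by whatever network the user supplies.

```python
import random


def build_training_graph(query, identities, similarity_to_query,
                         max_positives=4, num_distractors=4, seed=0):
    """Choose the gallery images that form the graph for one training query.

    identities: dict mapping image id -> person id.
    similarity_to_query: dict mapping image id -> similarity score
        (only needed for images of other identities).
    Returns (gallery, labels); each node is the pair (query, gallery[k]),
    labelled 1 if both images show the same person and 0 otherwise.
    """
    rng = random.Random(seed)
    person = identities[query]

    # Positives: other images of the same identity, capped so that
    # identities with many images are not oversampled.
    positives = [i for i in identities if i != query and identities[i] == person]
    if len(positives) > max_positives:
        positives = rng.sample(positives, max_positives)

    # Hard distractors: images of other identities most similar to the query.
    negatives = [i for i in identities if identities[i] != person]
    negatives.sort(key=lambda i: similarity_to_query[i], reverse=True)
    distractors = negatives[:num_distractors]

    gallery = positives + distractors
    labels = [1] * len(positives) + [0] * len(distractors)
    return gallery, labels


# Toy example: "q" is the query; "a", "b", "c" share its identity.
identities = {"q": 0, "a": 0, "b": 0, "c": 0, "x": 1, "y": 2, "z": 1}
scores = {"x": 0.2, "y": 0.9, "z": 0.7}
gallery, labels = build_training_graph(
    "q", identities, scores, max_positives=2, num_distractors=2
)
```

The labels make the node classification framing explicit: each node is a query-gallery pair, and the target says whether the pair shares an identity.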

The CNN used to compute the similarity between images can be specified by the user. The embeddings can be computed as a pre_transform, so that subsequent loads of the same dataset are faster.
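As a rough sketch of that idea: a PyG pre_transform is a callable that takes a data object and returns it, so a user-supplied encoder can be wrapped in one. The example below is duck-typed and uses `types.SimpleNamespace` in place of a real PyG data object so that it runs without PyG or a CNN installed. The class name `EmbedImages`, the `images` attribute, and the toy encoder are all assumptions made for illustration.

```python
from types import SimpleNamespace


class EmbedImages:
    """Callable in the style of a pre_transform: data object in, data object out."""

    def __init__(self, encoder):
        # User-supplied callable mapping one image to a feature vector.
        self.encoder = encoder

    def __call__(self, data):
        # Store one embedding per image as the node features.
        data.x = [self.encoder(image) for image in data.images]
        # Drop the raw pixels so only the embeddings are kept.
        del data.images
        return data


def mean_row_encoder(image):
    """Toy placeholder for a CNN: mean intensity of each pixel row."""
    return [sum(row) / len(row) for row in image]


# Stand-in for a data object holding two tiny 2x2 "images".
sample = SimpleNamespace(images=[[[0, 2], [4, 6]], [[1, 1], [3, 3]]])
sample = EmbedImages(mean_row_encoder)(sample)
```

In PyG, a callable like this would be passed as the `pre_transform` argument of the dataset, which applies it once before the processed data is saved to disk, so later loads read the stored embeddings instead of recomputing them.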

rusty1s commented 2 years ago

Sounds good. Does there exist some reference implementation of the paper you mentioned?

steveazzolin commented 2 years ago

Unfortunately, it seems not :(