SoccerNet / sn-reid

Repository containing all necessary codes to get started on the SoccerNet Re-Identification challenge. This repository also contains benchmark methods.

How is `action_id` taken into account during training and evaluation? #1

Closed: mazatov closed this issue 2 years ago

mazatov commented 2 years ago

Do we have a guarantee that a player only exists in one action_idx? Going through the datasets, I see that there are many actions per game, so my instinct is that there will be many repeated players across actions within a game, since there are more person_uid values per game than players on the pitch. Are we supposed to ignore the fact that we might have repeats between actions?

For example, in the training dataset the first game is 2015-02-21 - 18-00 Chelsea 1 - 1 Burnley. Looking through all 23 actions of that game, we have 808 images for 429 players! Given that there are at most 26 players per game, many of those 808 images must correspond to the same player while carrying different player_id values. Treating them as distinct identities seems like a really bad assumption to make.
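To illustrate the counting, here is a minimal sketch using made-up (action_idx, person_uid) pairs; the variable names are illustrative, not the real annotation format:

    from collections import defaultdict

    # Hypothetical (action_idx, person_uid) pairs for one game; in the real
    # dataset these fields come from the image annotations.
    samples = [(0, "p1"), (0, "p2"), (1, "p3"), (1, "p4"), (2, "p5")]

    identities_per_action = defaultdict(set)
    for action_idx, person_uid in samples:
        identities_per_action[action_idx].add(person_uid)

    total_identities = sum(len(ids) for ids in identities_per_action.values())
    print(f"{len(samples)} images, {total_identities} identities "
          f"across {len(identities_per_action)} actions")
    # Identities are unique only within an action, so total_identities can far
    # exceed the number of physical players in the game.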

I also see evidence of this in the evaluation code:

            if q_action_idx != g_action_idx:
                raise ValueError("Ranking result for query '{}' from action '{}' contained gallery sample '{}' from a different action '{}'. "
                                 "Ranking results for a given query must contain only gallery samples from the same action.".format(q_idx, q_action_idx, g_idx, g_action_idx))

At the same time, during the testing phase we seem to compute features across the entire dataset and match persons between the full query and gallery sets without taking action_idx into account. At least it looks that way in the code. Could you clarify how action_idx is defined and how it is taken into account in all parts of the pipeline (train, valid, test, challenge)?

        print('Extracting features from query set ...')
        qf, q_pids, q_camids = _feature_extraction(query_loader)
        print('Done, obtained {}-by-{} matrix'.format(qf.size(0), qf.size(1)))

        print('Extracting features from gallery set ...')
        gf, g_pids, g_camids = _feature_extraction(gallery_loader)
        print('Done, obtained {}-by-{} matrix'.format(gf.size(0), gf.size(1)))

        print('Speed: {:.4f} sec/batch'.format(batch_time.avg))

        if normalize_feature:
            print('Normalizing features with L2 norm ...')
            qf = F.normalize(qf, p=2, dim=1)
            gf = F.normalize(gf, p=2, dim=1)

        print(
            'Computing distance matrix with metric={} ...'.format(dist_metric)
        )
        distmat = metrics.compute_distance_matrix(qf, gf, dist_metric)
        distmat = distmat.numpy()
VlSomers commented 2 years ago

Hi Mazatov, thanks for your interest and your question. I uploaded a new version of the README with further information about the dataset structure; hopefully you'll find an answer to your question there.

As you mentioned, player identities are only valid within an action, and a player might be assigned multiple identities if he has been spotted in multiple actions. This is not an issue during testing, because query-to-gallery matching is only performed for samples from the same action. However, it might be an issue at training time depending on your training procedure and losses, and it's up to the participants to find a solution for that.

In the original Torchreid code, a 'camid' field is used at the testing stage for a similar purpose, i.e., to filter out gallery samples w.r.t. the camid field of the corresponding query sample. For the SoccerNet challenge, we use the 'camid' field to carry the 'action_idx' information, and use that information in the 'rank.py -> eval_soccernetv3()' function to filter out gallery samples whose 'action_idx' differs from that of the current query.

As you mentioned, all query-to-gallery distances are computed in one big distance matrix 'distmat' in 'engine.py'. Many of these distances are not actually useful, because they are not taken into account in the final performance evaluation. This part should be optimised in future versions; for now we kept the original Torchreid code for computing this big distance matrix.
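For readers skimming this thread, here is a minimal sketch of the per-action filtering idea described above, computing a rank-1 score. The function name and signature are hypothetical; the actual logic lives in 'rank.py -> eval_soccernetv3()':

    import numpy as np

    def rank1_per_action(distmat, q_pids, g_pids, q_action_ids, g_action_ids):
        # Sketch only: for each query, rank just the gallery samples from the
        # same action and check whether the top match has the right identity.
        hits = 0
        for q_idx in range(distmat.shape[0]):
            keep = g_action_ids == q_action_ids[q_idx]  # same-action gallery mask
            order = np.argsort(distmat[q_idx][keep])    # closest first
            ranked_pids = g_pids[keep][order]
            hits += int(ranked_pids.size > 0 and ranked_pids[0] == q_pids[q_idx])
        return hits / distmat.shape[0]

Distances to gallery samples from other actions still exist in 'distmat' but are simply never inspected, which is the inefficiency mentioned above.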

shallowlearn commented 2 years ago

Hello, I have a related question about the actions.

1. Does action '0' in each match represent the same action across different matches, or is there no connection at all?
2. Does action '0' in the train set match action '0' in the test set?

Thanks