Cysu / open-reid

Open source person re-identification library in python
https://cysu.github.io/open-reid/
MIT License
1.34k stars, 349 forks

Speeding up CMC #42

Open Rizhiy opened 6 years ago

Rizhiy commented 6 years ago

CMC currently takes a long time to compute. For me, validation on DukeMTMC takes longer than training an epoch. I noticed that the CMC computation is single-threaded. Is it possible to make it multi-threaded?

E.g., can I run this loop in a pool:

    for i in range(m):
        # Filter out the same id and same camera
        valid = ((gallery_ids[indices[i]] != query_ids[i]) |
                 (gallery_cams[indices[i]] != query_cams[i]))
        if separate_camera_set:
            # Filter out samples from same camera
            valid &= (gallery_cams[indices[i]] != query_cams[i])
        if not np.any(matches[i, valid]):
            continue
        if single_gallery_shot:
            repeat = 10
            gids = gallery_ids[indices[i][valid]]
            inds = np.where(valid)[0]
            ids_dict = defaultdict(list)
            for j, x in zip(inds, gids):
                ids_dict[x].append(j)
        else:
            repeat = 1
        for _ in range(repeat):
            if single_gallery_shot:
                # Randomly choose one instance for each id
                sampled = (valid & _unique_sample(ids_dict, len(valid)))
                index = np.nonzero(matches[i, sampled])[0]
            else:
                index = np.nonzero(matches[i, valid])[0]
            delta = 1. / (len(index) * repeat)
            for j, k in enumerate(index):
                if k - j >= topk:
                    break
                if first_match_break:
                    ret[k - j] += 1
                    break
                ret[k - j] += delta
        num_valid_queries += 1

?

Cysu commented 6 years ago

@Rizhiy, thank you very much for spotting the issue. Multi-threading would definitely help. However, I’m not sure that simply using a Pool is OK, because the current implementation has every iteration write to the same ret array. We would probably need to create a separate CMC array for each query and reduce (sum) them after the pool finishes.

Any pull requests are welcome! I will also try to look into the problem.
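
The per-query idea above could be sketched roughly as follows. This is not the library's implementation. It is a minimal illustration covering only the case where single_gallery_shot is off (repeat = 1). The names `_cmc_one_query`, `cmc_reduce` and `map_fn` are hypothetical, and `indices` and `matches` are assumed to be the same arrays used by the loop quoted in the issue. Each query fills its own array, and the arrays are summed only after all queries have finished, so no worker writes to shared state.

```python
from functools import partial

import numpy as np


def _cmc_one_query(i, indices, matches, query_ids, gallery_ids,
                   query_cams, gallery_cams, topk,
                   separate_camera_set, first_match_break):
    """Hypothetical helper: compute the CMC contribution of query i.

    Returns (row, is_valid), where row is a private array of length topk.
    """
    row = np.zeros(topk)
    order = indices[i]
    # Filter out gallery samples with the same id and same camera
    valid = ((gallery_ids[order] != query_ids[i]) |
             (gallery_cams[order] != query_cams[i]))
    if separate_camera_set:
        # Filter out samples from the same camera
        valid &= (gallery_cams[order] != query_cams[i])
    if not np.any(matches[i, valid]):
        return row, False
    index = np.nonzero(matches[i, valid])[0]
    delta = 1. / len(index)  # repeat == 1 in this simplified sketch
    for j, k in enumerate(index):
        if k - j >= topk:
            break
        if first_match_break:
            row[k - j] += 1
            break
        row[k - j] += delta
    return row, True


def cmc_reduce(indices, matches, query_ids, gallery_ids,
               query_cams, gallery_cams, topk=100,
               separate_camera_set=False, first_match_break=False,
               map_fn=map):
    """Hypothetical driver: map over queries, then reduce the results.

    map_fn defaults to the builtin map; a pool's map method can be
    passed instead. Returns (ret, num_valid_queries).
    """
    worker = partial(_cmc_one_query,
                     indices=indices, matches=matches,
                     query_ids=query_ids, gallery_ids=gallery_ids,
                     query_cams=query_cams, gallery_cams=gallery_cams,
                     topk=topk,
                     separate_camera_set=separate_camera_set,
                     first_match_break=first_match_break)
    results = list(map_fn(worker, range(len(query_ids))))
    rows = [row for row, ok in results if ok]
    if not rows:
        return np.zeros(topk), 0
    # Reduce step: sum the per-query arrays after all workers are done
    return np.sum(rows, axis=0), len(rows)
```

Passing the `map` method of a `multiprocessing.Pool` as `map_fn` should distribute the queries across processes. Whether that is actually faster depends on the cost of sending the arrays to the workers, so it would need to be measured.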