AwePhD closed this issue 1 year ago
I am sorry that our code confused you. The presence or absence of the .cpu() operation at L85 is irrelevant, because qids and gids are always on the CPU, so the calculated CMC and mAP are on the CPU as well.
Anyway, thank you for pointing this out. We have modified the potentially misleading code at L85.
First, without the modification that I pointed out in my issue, the rank function crashes. I do not have access to my computer right now, so I will provide the error message as soon as possible. If you do not get the error, maybe it is because I use a recent torch version? In brief, if I do not send the IDs to the GPU, then I cannot evaluate the epoch with the itc loss. Also, I will test the updated code on my laptop and keep you updated.
Second, your repo is clean code. I read most of your 3k lines and I can tell that your code is of good quality, in addition to the good research findings. Thanks again for your quick answers and for sharing your work.
I found that the issue you describe above occurs in PyTorch 1.13.1, so I guess it is caused by the difference in PyTorch versions (I use PyTorch 1.9.0). Maybe recent versions of PyTorch no longer allow the indexing operation across different devices at L17 in metrics.py.
To fix this, I suggest adding .cpu() after indices to transfer them to the CPU, as the minimal modification. Like this:
pred_labels = g_pids[indices.cpu()]
I will add this adaptation in the next commit.
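The suggested fix can be illustrated with a minimal sketch. The helper name rank_top_labels and the toy tensors are hypothetical and only stand in for the label lookup inside rank; they are not taken from the repository. The sketch assumes the gallery IDs stay on the CPU, and it falls back to the CPU when no GPU is available, in which case .cpu() does nothing.

```python
import torch


def rank_top_labels(similarity: torch.Tensor, g_pids: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the label lookup done inside rank()."""
    # The sorted indices live on the same device as the similarity matrix.
    indices = torch.argsort(similarity, dim=1, descending=True)
    # g_pids is assumed to be on the CPU, so the indices are moved there first.
    return g_pids[indices.cpu()]


# Toy data (assumption): 2 queries, 3 gallery items.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
similarity = torch.tensor([[0.1, 0.9, 0.3], [0.8, 0.2, 0.5]], device=device)
g_pids = torch.tensor([10, 20, 30])  # gallery IDs stay on the CPU

pred_labels = rank_top_labels(similarity, g_pids)
print(pred_labels)
```

With this change the lookup does not depend on which device holds the similarity matrix, which is why it works as a small, local modification.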
Hello,
In module utils.metrics, there is the Evaluator class and its private method _compute_embedding for computing the features and IDs for texts and images on the whole test dataset.
On lines L60 and L70 we must add to(device) at the end of the concatenated Tensor of IDs.
If we do not send those IDs to the GPU, we run into a problem when computing with tensors that are not on the same device. In method eval, we use the helper function rank to compute metrics. It operates on the similarity (GPU) and the IDs (CPU). We get an error if we do not send the IDs to the GPU. Plus, on the next line (L85) we see that the metrics are sent to the CPU. So those metrics were supposed to be on the GPU, and thus the IDs should have gone to the GPU.
I tried to give some details even though I think this is a very tiny fix / error. I can do a PR with the aforementioned fix if you want. :)
Best, Mathias.
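The alternative described in this post, sending the concatenated IDs to the same device as the similarity matrix, could look roughly like the sketch below. The helper concat_ids and the toy data are hypothetical and not part of the repository. The rank-1 computation is a simplified illustration, not the actual CMC/mAP code.

```python
import torch


def concat_ids(id_batches, device):
    """Hypothetical helper: join per-batch ID tensors and move them to `device`."""
    return torch.cat(id_batches, dim=0).to(device)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy data (assumption): 2 queries, 2 gallery items, IDs collected batch by batch.
similarity = torch.tensor([[0.2, 0.7], [0.6, 0.4]], device=device)
qids = concat_ids([torch.tensor([1]), torch.tensor([2])], device)
gids = concat_ids([torch.tensor([2]), torch.tensor([1])], device)

indices = torch.argsort(similarity, dim=1, descending=True)
pred_labels = gids[indices]                  # indices and IDs share one device
matches = pred_labels.eq(qids.view(-1, 1))   # q x k boolean match matrix
rank1 = matches[:, 0].float().mean().cpu()   # metric moved to the CPU at the end
print(rank1.item())
```

Either approach removes the mismatch. Moving the IDs to the GPU keeps the whole computation on one device, while calling .cpu() on the indices is the smaller change.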