Closed noahzn closed 9 months ago
The TF output looks very different from the one in the convert_to_pytorch.ipynb notebook - did you change any parameter there? If not, this is very surprising. I used tensorflow==1.15.0 and torch==1.13.1+cu117; what versions are you using? You might need to compare the results layer by layer to find out where the discrepancy comes from - I recommend starting from the raw score map.
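A minimal way to do such a layer-by-layer comparison, assuming you have dumped the intermediate activations of both models into NumPy arrays keyed by layer name (the names and shapes below are hypothetical, not the repo's actual tensor names):

```python
import numpy as np

def first_divergent_layer(tf_acts, pt_acts, tol=1e-4):
    """Compare intermediate activations layer by layer and return the
    name of the first layer whose max absolute difference exceeds tol."""
    for name in tf_acts:  # assumes both dicts share layer names, in forward order
        diff = np.abs(tf_acts[name] - pt_acts[name]).max()
        print(f"{name}: max abs diff = {diff:.3e}")
        if diff > tol:
            return name
    return None  # no divergence found within tolerance

# Hypothetical dumped activations, starting from the raw score map:
rng = np.random.default_rng(0)
score_map = rng.random((1, 65, 30, 40), dtype=np.float32)
tf_acts = {"score_map": score_map,
           "descriptors": score_map[:, :64] * 2.0}
pt_acts = {"score_map": score_map + 1e-6,           # numerically identical
           "descriptors": score_map[:, :64] * 2.0 + 0.1}  # genuine mismatch
print(first_divergent_layer(tf_acts, pt_acts))
```

In a real debugging session you would fill the two dicts by fetching the TF tensors with `session.run` and recording the PyTorch activations with forward hooks, then walk from the raw score map onward until the first layer where the diff jumps.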
Hi, this is my own model after training. I'm using TF 1.15, and the highest PyTorch version I can install is 1.10.0 with CUDA 10.2, because my Python version is 3.6. What Python version are you using?
I'm using Python 3.7. What is reported in the cell checking the difference of the dense outputs?
Diff logits: 3.1471252e-05 4.6826185e-06 3.874302e-06 max/mean/median
Diff descriptors: 2.041459e-06 2.7050896e-07 2.2351742e-07 max/mean/median
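For reference, max/mean/median numbers in this format can be reproduced with a small helper like the following (a sketch, not the notebook's exact code):

```python
import numpy as np

def diff_stats(a, b):
    """Max / mean / median of the elementwise absolute difference,
    in the same order as the notebook's printout."""
    d = np.abs(np.asarray(a) - np.asarray(b))
    return d.max(), d.mean(), np.median(d)

# Toy example with two nearly identical outputs:
logits_tf = np.array([0.1, 0.2, 0.3])
logits_pt = np.array([0.1 + 3e-5, 0.2, 0.3 - 1e-5])
print("Diff logits: %e %e %e max/mean/median" % diff_stats(logits_tf, logits_pt))
```

Diffs of this magnitude (1e-5 to 1e-7) are expected from float32 rounding alone and do not indicate a conversion bug.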
You could maybe increase the detection threshold for the PyTorch model, but this would not solve the underlying issue - there must be an implementation difference somewhere.
Diff logits: 3.4570694e-05 3.1457146e-06 2.6226044e-06 max/mean/median
Diff descriptors: 7.6293945e-06 2.6236998e-07 2.0861626e-07 max/mean/median
I ran the conversion code on the official sp_v6 checkpoint, and I can output the same number of points as shown in the notebook. But the diff numbers are different:
Diff logits: 3.3140182e-05 4.75111e-06 4.053116e-06 max/mean/median
Diff descriptors: 1.9967556e-06 2.7098568e-07 2.2351742e-07 max/mean/median
The diff numbers are of the same order of magnitude. The issue must be in the keypoint selection, somewhere in this section: https://github.com/rpautrat/SuperPoint/blob/d8ebb9040fac489e23dd0b6f136976c329eed3ba/superpoint_pytorch.py#L135-L157
Do you mind sharing your checkpoint?
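For context, keypoint selection in SuperPoint-style detectors combines a detection threshold with non-maximum suppression on the score map. A pure-NumPy sketch of the idea (not the repo's exact implementation, which operates on PyTorch tensors and typically uses max-pooling):

```python
import numpy as np

def select_keypoints(scores, radius=4, threshold=0.005):
    """Keep pixels that are strict local maxima within a (2*radius+1)^2
    window and above the detection threshold. Ties on flat plateaus are
    suppressed by the strict comparison."""
    h, w = scores.shape
    padded = np.pad(scores, radius, mode="constant", constant_values=-np.inf)
    is_max = np.ones((h, w), dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            if dy == radius and dx == radius:
                continue  # skip comparing the center pixel with itself
            is_max &= scores > padded[dy:dy + h, dx:dx + w]
    keep = is_max & (scores > threshold)
    return np.argwhere(keep)  # (row, col) keypoint coordinates

# Toy score map: one strong peak, one suppressed neighbor, one sub-threshold blip.
scores = np.zeros((16, 16), dtype=np.float32)
scores[5, 5] = 0.9
scores[5, 6] = 0.4
scores[12, 2] = 0.003
print(select_keypoints(scores, radius=2, threshold=0.01))
```

A bug anywhere in this stage (window size, padding, tie handling, or the threshold comparison) changes which keypoints survive even when the dense score maps of both models agree to float precision, which matches the symptom here.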
Here is my checkpoint.
Thanks. I found the issue and fixed it in PR https://github.com/rpautrat/SuperPoint/pull/317. I suggest increasing your detection threshold to 0.01, the results will look closer to the model we have trained.
Thank you very much for the fix! Now the results look much closer. Since I am going to use this trained model with LightGlue, I am wondering whether glue-factory's open SuperPoint code has a similar NMS problem. I mean, if I use the converted PyTorch model that I generated yesterday, will it affect the training of LightGlue? LightGlue uses the model's output as input.
No, your converted model and glue-factory's open SP are fine - the problem was in the inference TensorFlow model.
Hi, I used the convert_to_pytorch code to convert my TF model, with the default parameters. The results are very different. Could you help me with that? Thank you. @rpautrat @sarlinpe