Error encountered while running step1

Zhaoyue628 commented 3 years ago

Hello rpautrat, sorry to disturb you. The following error confused me when I ran the first step. Could you give me some suggestions?

2021-05-23 12:22:40.944712: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2021-05-23 12:22:40.944744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-23 12:22:40.944748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2021-05-23 12:22:40.944751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2021-05-23 12:22:40.944830: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22593 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)
[05/23/2021 12:22:44 INFO] Start training
[05/23/2021 12:34:36 INFO] Iter 0: loss 4.1780, precision 0.0006, recall 0.0546
/home/cczu402/project/SuperPoint/superpoint/models/base_model.py:387: RuntimeWarning: Mean of empty slice
  metrics = {m: np.nanmean(metrics[m], axis=0) for m in metrics}
[05/23/2021 12:34:41 INFO] Iter 1000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:34:45 INFO] Iter 2000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:34:50 INFO] Iter 3000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:34:55 INFO] Iter 4000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:34:59 INFO] Iter 5000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:04 INFO] Iter 6000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:10 INFO] Iter 7000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:16 INFO] Iter 8000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:20 INFO] Iter 9000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:26 INFO] Iter 10000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:32 INFO] Iter 11000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:37 INFO] Iter 12000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:42 INFO] Iter 13000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:47 INFO] Iter 14000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:52 INFO] Iter 15000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:35:58 INFO] Iter 16000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:04 INFO] Iter 17000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:09 INFO] Iter 18000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:14 INFO] Iter 19000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:18 INFO] Iter 20000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:23 INFO] Iter 21000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:28 INFO] Iter 22000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:33 INFO] Iter 23000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:38 INFO] Iter 24000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:43 INFO] Iter 25000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:48 INFO] Iter 26000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:53 INFO] Iter 27000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:36:57 INFO] Iter 28000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:02 INFO] Iter 29000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:07 INFO] Iter 30000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:12 INFO] Iter 31000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:18 INFO] Iter 32000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:23 INFO] Iter 33000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:28 INFO] Iter 34000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:33 INFO] Iter 35000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:39 INFO] Iter 36000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:45 INFO] Iter 37000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:52 INFO] Iter 38000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:37:57 INFO] Iter 39000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:02 INFO] Iter 40000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:08 INFO] Iter 41000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:15 INFO] Iter 42000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:21 INFO] Iter 43000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:27 INFO] Iter 44000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:32 INFO] Iter 45000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:38 INFO] Iter 46000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:44 INFO] Iter 47000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:50 INFO] Iter 48000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:55 INFO] Iter 49000: loss nan, precision nan, recall 0.0000
[05/23/2021 12:38:59 INFO] Training finished
[05/23/2021 12:38:59 INFO] Saving checkpoint for iteration #50000
2021-05-23 12:38:59.948585: W tensorflow/core/kernels/data/cache_dataset_ops.cc:770] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the datasetwill be discarded. This can happen if you have an input pipeline similar to dataset.cache().take(k).repeat(). You should use dataset.take(k).cache().repeat() instead.
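For anyone triaging a similar log, a small script can report the first logged iteration at which the loss became NaN. This is only a sketch: the function name first_nan_iteration and the regular expression are assumptions based on the log format shown above, not part of the SuperPoint code.

```python
import math
import re

# Matches entries such as:
# "[05/23/2021 12:34:41 INFO] Iter 1000: loss nan, precision nan, recall 0.0000"
ITER_RE = re.compile(r"Iter (\d+): loss ([^,\s]+), precision ([^,\s]+), recall ([^,\s]+)")


def first_nan_iteration(log_text):
    """Return the first logged iteration whose loss is NaN, or None if there is none."""
    for match in ITER_RE.finditer(log_text):
        iteration = int(match.group(1))
        loss = float(match.group(2))  # float("nan") parses to NaN
        if math.isnan(loss):
            return iteration
    return None


sample = (
    "[05/23/2021 12:34:36 INFO] Iter 0: loss 4.1780, precision 0.0006, recall 0.0546\n"
    "[05/23/2021 12:34:41 INFO] Iter 1000: loss nan, precision nan, recall 0.0000\n"
)
print(first_nan_iteration(sample))
```

In the log above the loss is finite at iteration 0 and NaN by the next logged iteration, so the problem appears within the first 1000 steps.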

rpautrat commented 3 years ago

Hi, could you try to run the notebook visualize_synthetic-shapes.ipynb and check that the synthetic shapes and ground truth corners are correct?

Zhaoyue628 commented 3 years ago

Thank you for your guidance. I reran the code and it no longer has zero recall. But when I run it in JupyterLab, I get an error. I would like some advice on how to run it correctly.

rpautrat commented 3 years ago

Hi, did you run the install command from the README, make install?

Zhaoyue628 commented 3 years ago

Hi, after I created the virtual environment with anaconda, I ran make install in the virtual environment following the steps in the readme.

rpautrat commented 3 years ago

The installation script is using pip, so I guess this is why. You should first install pip with anaconda: conda install pip, and then re-run the make install command.

Zhaoyue628 commented 3 years ago

Hi, I followed your instructions and re-ran it. But when I run 'detector_repeatability_hpatches.ipynb', it still reports that the module "superpoint" is not found. I checked with the 'pip list' command.

rpautrat commented 3 years ago

You can check whether pip is indeed the Anaconda one by running which pip. It should point to the installation path of Anaconda.

If this doesn't work, you can also consider using only virtualenv instead of anaconda.
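The same check can be done from inside the notebook itself, which is useful because a Jupyter kernel may run a different interpreter than the shell where make install was executed. The sketch below is an assumption-laden illustration: describe_environment is a hypothetical helper, and it only reports where the interpreter, pip, and the superpoint package are found.

```python
import importlib.util
import shutil
import sys


def describe_environment(package="superpoint"):
    """Report the running interpreter, the pip found on PATH, and where `package` lives."""
    spec = importlib.util.find_spec(package)
    return {
        "python": sys.executable,        # interpreter used by this kernel or script
        "pip": shutil.which("pip"),      # same lookup as `which pip`; None if not on PATH
        "package_location": spec.origin if spec is not None else None,
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If package_location prints None while pip list in the shell shows the package, the notebook is most likely running under a different environment than the one used for the installation.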

Zhaoyue628 commented 3 years ago

I am sorry to bother you again. I solved the previous problem with your guidance. But when I run 'detector_repeatability_hpatches.ipynb' after training the magicpoint model, it shows the following error. I checked the path and it is correct. Do you have any other ideas as to why the error is being reported? Thank you.

rpautrat commented 3 years ago

Hi, try to use an absolute path for your EXPER_PATH instead of a relative one. That way you can launch the code from anywhere on your computer without paying attention to where it is launched.
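To illustrate why a relative path is fragile, the sketch below resolves a path against the current working directory, which is what a relative EXPER_PATH implicitly does at run time. Reading the value from an environment variable and the "experiments" fallback are assumptions made for the example only; the helper to_absolute is hypothetical.

```python
import os
from pathlib import Path


def to_absolute(path_str):
    """Expand '~' and resolve a possibly relative path against the current directory."""
    return Path(path_str).expanduser().resolve()


# Hypothetical source of the value: an environment variable with a relative fallback.
raw = os.environ.get("EXPER_PATH", "experiments")
exper_path = to_absolute(raw)

if not Path(raw).expanduser().is_absolute():
    # A relative value points somewhere different for every launch directory.
    print(f"'{raw}' is relative; from this directory it resolves to {exper_path}")
else:
    print(f"EXPER_PATH is already absolute: {exper_path}")
```

Storing the resolved absolute value once avoids the situation where the notebook and the training script look in two different folders.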

Zhaoyue628 commented 3 years ago

Thanks for your advice, it no longer reports errors after changing to absolute paths. I would also like to ask whether the experiments in 'detector_evaluation_magic-point.ipynb' are the output files from step 3. And for the noisy cases, is it enough to change add_augmentation to true in configs/magic-point--shapes?

rpautrat commented 3 years ago

No, these experiment names are the outputs of export_detections.py, as in step 2, but run on the synthetic shapes and without the --pred_only option. And yes, for the noisy cases all you need is to set add_augmentation_to_test_set to True in configs/magic-point_shapes.yaml.
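The option can be edited by hand in the YAML file, but as a rough sketch, the same change can be applied to a loaded config dictionary. The dictionary below is a hypothetical stand-in, not the real contents of configs/magic-point_shapes.yaml, and set_key_everywhere is an illustrative helper; in practice the dictionary would come from loading the file, for example with PyYAML's yaml.safe_load.

```python
import copy


def set_key_everywhere(config, key, value):
    """Return a copy of a nested dict with `key` set to `value` wherever it already appears."""
    updated = copy.deepcopy(config)

    def visit(node):
        hits = 0
        if isinstance(node, dict):
            for name in node:
                if name == key:
                    node[name] = value  # overwrite in place; dict size is unchanged
                    hits += 1
                else:
                    hits += visit(node[name])
        return hits

    if visit(updated) == 0:
        raise KeyError(f"{key!r} not found in config")
    return updated


# Hypothetical, simplified config used only for illustration.
config = {
    "data": {"name": "synthetic_shapes", "add_augmentation_to_test_set": False},
    "model": {"name": "magic_point"},
}
noisy = set_key_everywhere(config, "add_augmentation_to_test_set", True)
print(noisy["data"]["add_augmentation_to_test_set"])
```

Raising KeyError when the key is absent guards against silently running the clean evaluation because of a misspelled option name.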

Zhaoyue628 commented 3 years ago

Sorry again, I could not figure out how to output the results of the classical detectors. Should I directly use the second step? And how is classical_detection_shapes used?

Zhaoyue628 commented 3 years ago

I use the command 'python experiment.py evaluate configs/classical_detectors_shapes.yaml fast_synth', and this command creates the folder with the results of the experiment in my $EXPERIMENT_PATH/outputs/. But it shows the error in the picture. What am I doing wrong?

Zhaoyue628 commented 3 years ago

Sorry, I wrote it wrong. The command I used was 'python export_detections.py configs/classical_detectors_shapes.yaml fast_synth'.

rpautrat commented 3 years ago

Hi, have a look at this issue: https://github.com/rpautrat/SuperPoint/issues/166#issuecomment-678746173. You may need to tune the detection thresholds.
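As a toy illustration of why a detection threshold matters for the metrics, the sketch below computes precision and recall from per-point scores. It is not the repository's evaluation code; precision_recall, the scores, and the labels are all made up for the example. When the threshold is so high that nothing is detected, precision is 0/0 and becomes NaN while recall is 0, the same pattern as in the training log earlier in this thread.

```python
import math


def precision_recall(scores, labels, threshold):
    """Compute precision and recall for detections with score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, l in zip(predicted, labels) if p and l)
    fp = sum(1 for p, l in zip(predicted, labels) if p and not l)
    fn = sum(1 for p, l in zip(predicted, labels) if not p and l)
    # With no detections at all, precision is undefined (0 / 0).
    precision = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    recall = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    return precision, recall


# Made-up detector scores and ground-truth labels (True = real corner).
scores = [0.9, 0.6, 0.4, 0.2]
labels = [True, True, False, True]

for threshold in (0.5, 0.95):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.4f}, recall={r:.4f}")
```

Lowering the threshold until at least some points are detected is usually the first thing to try when a detector reports NaN precision.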

Zhaoyue628 commented 3 years ago

Thanks for your help, I have solved the problem by adjusting the thresholds. I also have a question: if I want to evaluate the mAP and MLE of the model provided by magicleap, how should I do it? I tried the method from step 2 but it does not work. I would also like to use magicpoint separately in my project and make some changes to it. In which file should I do this? I have looked at the files under the models folder and I am a bit confused.

rpautrat commented 3 years ago

Hi, sorry but I cannot do all the work for you; you will have to do it yourself. Try to explore the code and play with it, and you should understand how it works and how to modify it to do what you are aiming for.

Zhaoyue628 commented 3 years ago

Thank you for your reply! I will continue to explore it!

1z2213 commented 1 year ago

Hello rpautrat, sorry to disturb you. The following error confused me when I ran the first step. Could you give me some suggestions?

[Quoted training log from the first comment in this thread: loss nan, precision nan, recall 0.0000 from Iter 1000 through Iter 49000, followed by the cache_dataset_ops.cc warning about dataset.cache().take(k).repeat().]

Hi, I followed the instructions in https://github.com/rpautrat/SuperPoint/issues/173 and am now trying to run the first step: python experiment.py train configs/magic-point_shapes.yaml magic-point_synth. I have the same problem of NaN loss after extracting all the synthetic shapes. Can you offer me some advice on how to solve it? I desperately need your help.

1z2213 commented 1 year ago

visualize_synthetic-shapes.ipynb

Can you please tell me how you solved the problem of it having a zero recall?

rpautrat / SuperPoint

Error encountered while running step1 #213