Problem in training…… #238

qdtdl closed 2 years ago

qdtdl commented 2 years ago

Running the command python3 experiment.py train configs/magic-point_shapes.yaml magic-point_synth fails with errors such as Segmentation fault (core dumped) or Check failed: h != kInvalidChunkHandle. The error message sometimes differs between runs, even under the same conditions. What is the problem?

rpautrat commented 2 years ago

Hi, This is probably an issue with one of your libraries. Can you post the full error message please?

qdtdl commented 2 years ago

tensorflow/core/common_runtime/bfc_allocator.cc:458] Check failed: c->in_use() && (c->bin_num == kInvalidBinNum) Aborted (core dumped)

tensorflow/core/common_runtime/bfc_allocator.cc:380] Check failed: h != kInvalidChunkHandle Aborted (core dumped)

When the Segmentation fault (core dumped) occurs, no other information is shown.

Although the messages differ, I noticed that all of the errors occur during validation. I am using a V100. While the program runs, memory usage is very high. Could it be that 32G of memory is not enough?

rpautrat commented 2 years ago

32G should be enough, at least if you keep a small batch size (e.g. 2).

What is your TensorFlow version?
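
One way to check is a short script like the one below. This is only a sketch: the helper name tensorflow_version is made up for illustration, and it reports the version for whichever Python environment runs it, so it should be run with the same interpreter used for training.

```python
def tensorflow_version():
    """Return the TensorFlow version string, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


version = tensorflow_version()
if version is None:
    print("TensorFlow is not importable in this environment")
else:
    print("TensorFlow version:", version)
```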

qdtdl commented 2 years ago

As specified in requirements.txt: 1.12.0.

rpautrat commented 2 years ago

I think this is an issue with TensorFlow. You can try to reinstall it, or follow the setup described in https://github.com/rpautrat/SuperPoint/issues/173#issue-730896838, which has worked well for many people.

qdtdl commented 2 years ago

I finished the training according to your suggestion. Thank you very much.

qdtdl commented 2 years ago

One more question: I've trained SuperPoint and have my model as a ckpt checkpoint, but match_features_demo.py loads sp_v6 in SavedModel format. I tried to convert my model to a SavedModel, but it didn't work. What should I do?

rpautrat commented 2 years ago

Hi, what did you use to convert your model? You can use the script superpoint/export_model.py. It takes two parameters: the config file, which allows you to load your trained model, and the name of the export that you want to create.
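
As a rough sketch, the call could be wrapped like this. It assumes that both parameters are passed positionally, config file first and export name second, and that it is run from the superpoint/ directory. The helper names and the example arguments (configs/my-config.yaml, my_export) are placeholders, not names from the repository.

```python
import subprocess
import sys


def build_export_command(config_path, export_name, script="export_model.py"):
    """Assemble the argument list: script, then config file, then export name.

    The positional order is an assumption based on the description above.
    """
    return [sys.executable, script, config_path, export_name]


def run_export(config_path, export_name):
    """Run the export script; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_export_command(config_path, export_name), check=True)


# Placeholder arguments, for illustration only; this just prints the command.
print(build_export_command("configs/my-config.yaml", "my_export"))
```

Calling run_export with your own config file and export name would then launch the script with the same interpreter that runs this snippet.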

qdtdl commented 2 years ago

With your help, I finally got the model I need. Thank you very much for your reply.