JialianW / GRiT

GRiT: A Generative Region-to-text Transformer for Object Understanding (https://arxiv.org/abs/2212.00280)

Dense Captioning Evaluation on VG Dataset #6

Closed. Wykay closed this issue 1 year ago.

Wykay commented 1 year ago

Hello,

I am currently trying to reproduce GRiT's dense captioning results. I trained the model with the default settings and obtained a checkpoint. I then ran inference on the VG test set and produced the json results with:

python train_net.py --num-gpus-per-machine 8 --config-file configs/GRiT_B_DenseCap.yaml --output-dir-name ./output/grit_b_densecap --eval-only MODEL.WEIGHTS models/grit_b_densecap.pth

However, when setting up the DenseCap environment, I got stuck installing Torch on my GPU machine, which runs CUDA 12.0. I kept hitting this error:

CMake Error: The following variables are used in this project, but they are set to NOTFOUND. Please set them or make sure they are set and tested correctly in the CMake files: CUDA_cublas_device_LIBRARY (ADVANCED) linked by target "THC" in directory /root/torch/extra/cutorch/lib/THC

Could you tell me what platform you used to install DenseCap and run the evaluation?

Thanks a lot!

JialianW commented 1 year ago

To evaluate the results, you can use the Docker image provided in https://github.com/jcjohnson/densecap/issues/95

Wykay commented 1 year ago

Hi, Jialian.

I have built the densecap environment using the provided Docker image. Could you tell me more specifically how you use output/grit_b_densecap/vg_instances_results.json (or the checkpoint file) in the evaluation to get the mAP result?

I also found a Python evaluator for VG: https://github.com/soloist97/densecap-pytorch/blob/bc81d9816ff8d4e45613846ad2acdf789acde37b/model/evaluator.py#L72

JialianW commented 1 year ago

In their Lua evaluation code, there is a place that obtains the model's inference results. We replaced it with our results read from the json file.
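
Schematically, something like this (a Python sketch of the idea only; the actual edit is in DenseCap's Lua eval_utils.lua, and the json field names here are assumptions, so check your own file):

```python
import json
from collections import defaultdict

# Load GRiT's dense-captioning predictions once and index them per image.
# The field names (image_id, bbox, score, caption) are assumptions about
# the json layout; inspect your vg_instances_results.json to confirm.
with open("output/grit_b_densecap/vg_instances_results.json") as f:
    results = json.load(f)

by_image = defaultdict(list)
for r in results:
    by_image[r["image_id"]].append(r)

def predictions_for(image_id):
    """Stand-in for the model-inference step of the evaluation loop:
    return score-ranked boxes, scores, and captions read from the json."""
    preds = sorted(by_image[image_id], key=lambda r: -r["score"])
    boxes = [r["bbox"] for r in preds]    # assumed COCO-style [x, y, w, h]
    scores = [r["score"] for r in preds]
    captions = [r["caption"] for r in preds]
    return boxes, scores, captions
```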

Wykay commented 1 year ago

> In their Lua evaluation code, there is a place that obtains the model's inference results. We replaced it with our results read from the json file.

Thanks a lot.

Wykay commented 1 year ago

Hi, Jialian!

Have you ever run into this problem: "attempt to concatenate a nil value" in eval_utils.lua?

I hit it while reading the ground truth when evaluating the model on image 63.jpg.

It seems to be caused by nn.LanguageModel.idx_to_token: for idx=10579 the token is nil, which raises an error when it is concatenated.
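
To illustrate the failure mode with a toy Python sketch (the real code is Lua; 10579 is just the index from the error above):

```python
# Toy illustration of the failure mode: an index missing from
# idx_to_token yields None (Lua: nil), and building the caption
# string from it then fails.
idx_to_token = {1: "a", 2: "dog", 3: "running"}  # toy vocabulary

def decode(indices):
    words = []
    for idx in indices:
        token = idx_to_token.get(idx)  # None for an unknown index
        if token is None:
            raise KeyError(f"no token for idx={idx}; vocabulary mismatch?")
        words.append(token)
    return " ".join(words)

print(decode([1, 2, 3]))  # -> "a dog running"

try:
    decode([10579])  # index not in the vocabulary
except KeyError as e:
    print("decode failed:", e)
```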

JialianW commented 1 year ago

I only changed the code to read the results json file, nothing else. I didn't hit any errors in the evaluation code.

Wykay commented 1 year ago

Could you please share the evaluation code you edited so I can swap it into my Docker environment? Thanks!

amsword commented 1 year ago

> Hi, Jialian!
>
> Have you ever run into this problem: "attempt to concatenate a nil value" in eval_utils.lua?
>
> I hit it while reading the ground truth when evaluating the model on image 63.jpg.
>
> It seems to be caused by nn.LanguageModel.idx_to_token: for idx=10579 the token is nil, which raises an error when it is concatenated.

Can you show more details on how you hit the issue? For example, share the full error stack.

Wykay commented 1 year ago

> Have you ever run into this problem: "attempt to concatenate a nil value" in eval_utils.lua? I hit it while reading the ground truth when evaluating the model on image 63.jpg. It seems to be caused by nn.LanguageModel.idx_to_token: for idx=10579 the token is nil, which raises an error when it is concatenated.

> Can you show more details on how you hit the issue? For example, share the full error stack.

I found the error: I hadn't replaced vocab_size and idx_to_token in the LM model. Thank you all so much!
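
In case it helps others, a rough sketch of exporting the matching vocabulary for the Lua side (the file path and the "idx_to_token" key are assumptions based on densecap's preprocessing output; adjust them to your own data files):

```python
import json

# Assumed layout: densecap's preprocessing writes a json whose
# "idx_to_token" maps string indices to tokens. Adjust path/key as needed.
with open("data/VG-regions-dicts.json") as f:
    info = json.load(f)

idx_to_token = info["idx_to_token"]
print("vocab_size:", len(idx_to_token))

# Dump just the mapping so the Lua evaluator can load it (e.g. with the
# cjson rock) and overwrite the LanguageModel's idx_to_token and
# vocab_size before decoding ground-truth captions.
with open("idx_to_token.json", "w") as f:
    json.dump(idx_to_token, f)
```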

hellowordo commented 1 year ago

@EvenYYY Hello, could you please share the code you edited to evaluate with the vg_instances_results.json file in DenseCap? Thank you very much! I still couldn't quite follow the conversation between you two above.