Open weinman opened 2 years ago
Brief update: I serendipitously recalled that the dataset name suffix parameter was necessary to make things work; without it, the error shown above was generated. Hence, I have added
DATASETS:
  TEST_DATASETNAME_SUFFIX: _grounding
to the config and disabled the use of the default inference/evaluation method (since it seems to be incompatible with the resulting dataset) via
SOLVER:
  TEST_WITH_INFERENCE: False
The first change generates the requisite(?) positive_map value. The second change doesn't entirely resolve the issue, because the else branch here also seems to involve operations incompatible with this evaluation data pipeline; that part needs to be excluded. When it is, the loss_dict is properly returned.
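For context, my understanding is that positive_map associates each ground-truth box with the token positions of its class name inside the grounding caption. A rough illustration with made-up sizes and indices (not code from the repository):

```python
import torch

# Hypothetical example: a grounding caption with two ground-truth boxes.
caption = "person. car. traffic light."

num_boxes, max_text_len = 2, 16            # made-up sizes for illustration
positive_map = torch.zeros(num_boxes, max_text_len)
positive_map[0, 1:2] = 1.0                 # box 0 -> token(s) of "person"
positive_map[1, 3:4] = 1.0                 # box 1 -> token(s) of "car"
```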
My only remaining confusion/concern is why the captions in the list are not separated with the standard tokenizer separator (e.g., ". ") but instead appear merely as a sequence of words (i.e., the class names).
Hopefully this helps anyone else who also wants to try.
Following up: the function that builds the data loader only applies the caption separator specified in the config file when in training mode:
if is_train:
    extra_args["separation_tokens"] = cfg.DATASETS.SEPARATION_TOKENS
(from maskrcnn_benchmark/data/build.py, here.)
Why use this configuration option only in training mode? The lack of separators seems to negatively affect the evaluation during training.
I have adjusted this test to include:
if is_train or (not is_train and cfg.DATASETS.TEST_DATASETNAME_SUFFIX == "_grounding"):
    extra_args["separation_tokens"] = cfg.DATASETS.SEPARATION_TOKENS
Is there a more appropriate way to do this? (I haven't been able to wrap my head around the data loaders.)
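The simplest alternative I can think of (an untested sketch; there may be reasons in the surrounding build.py logic that I'm missing) would be to pass the configured separator unconditionally:

```python
# Untested sketch: set the separation tokens for both training and evaluation
# loaders, since grounding-style evaluation appears to need them as well.
extra_args["separation_tokens"] = cfg.DATASETS.SEPARATION_TOKENS
```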
Hi Jerod, sorry for the late reply. You are correct that the separation tokens should be set regardless of training/testing.
We haven't had this issue because we actually did not use the dataset's functions to do inference. In inference.py, we overrode the dataset class's functions (see the create_queries_and_maps_from_dataset function) and re-created the queries (this was to accommodate the different prompt/inference strategies for different evaluation datasets).
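To illustrate the idea described above (a sketch only, not the actual inference.py code): the evaluation queries are rebuilt from the dataset's category names with explicit separators, keeping track of each class's span so a box-to-token map can be derived afterwards.

```python
# Sketch only: rebuild an evaluation query from category names, recording the
# character span of each class so a token-level positive map can be derived.
def build_query(categories, separator=". "):
    caption, spans, pos = "", {}, 0
    for name in categories:
        caption += name
        spans[name] = (pos, pos + len(name))   # character span of this class
        caption += separator
        pos = len(caption)
    return caption, spans

caption, spans = build_query(["person", "car", "traffic light"])
# caption == "person. car. traffic light. "
# spans   == {"person": (0, 6), "car": (8, 11), "traffic light": (13, 26)}
```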
Hi @weinman, I encountered the same problem while finetuning the model on my own dataset. Here is the error:
Traceback (most recent call last):
  File "~/Code/GLIP-main/tools/train_net.py", line 262, in <module>
    main()
  File "~/GLIP-main/tools/train_net.py", line 254, in main
    model = train(cfg=cfg,
  File "~/Code/GLIP-main/tools/train_net.py", line 129, in train
    do_train(
  File "~/Code/GLIP-main/maskrcnn_benchmark/engine/trainer.py", line 123, in do_train
    loss_dict = model(images, targets)
  File "~/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/Code/GLIP-main/maskrcnn_benchmark/modeling/detector/generalized_vl_rcnn.py", line 288, in forward
    proposals, proposal_losses, fused_visual_features = self.rpn(images, visual_features, targets, language_dict_features, positive_map,
  File "~/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/Code/GLIP-main/maskrcnn_benchmark/modeling/rpn/vldyhead.py", line 906, in forward
    embedding = language_dict_features['embedded']
KeyError: 'embedded'
As I read your issue, it seems that you have solved the problem. I have tried to add:
DATASETS:
  TEST_DATASETNAME_SUFFIX: _grounding
and
SOLVER:
  TEST_WITH_INFERENCE: False
in my .yaml file. However, the KeyError remains. May I ask whether you have any suggestions? Thank you so much!
Chenlin
@CDchenlin: Can you verify that you are indeed trying to apply the loss function (minimized for training) to the validation set? That was the context for my original question, error, and ultimate resolution.
If you are getting this error just trying to train, it seems likely that your training dataset is not configured correctly, and I would suggest closely examining the many example configs provided (e.g., COCO, LVIS, OdinW, etc.).
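If it is the training path that fails, one quick way to narrow things down (just a debugging aid I would try, not part of GLIP) is to dump what the language branch actually produced right before the failing line in vldyhead.py:

```python
# Debugging aid (not a fix): report which keys the language branch produced
# before the line that raises KeyError: 'embedded'.
print(sorted(language_dict_features.keys()))
embedding = language_dict_features.get('embedded')
if embedding is None:
    raise RuntimeError(
        "language_dict_features has no 'embedded' entry; available keys: "
        f"{sorted(language_dict_features.keys())}"
    )
```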
So what's the final solution for this?
I'd like to track the validation set loss for finetuning evaluation on a custom dataset (i.e., as shown in the original maskrcnn trainer here), rather than the COCO-style AP metrics. Unfortunately, the evaluation data loader seems to be set up differently from the training data loader. In particular, the positive_map value used by the trainer does not seem to be created, and a branch of the code that attempts to cope with this absence fails. Is there a good or easy way to accomplish this goal?
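For concreteness, this is roughly the kind of evaluation loop I have in mind (a sketch under my own assumptions, not working GLIP code; in particular it assumes the validation loader yields batches shaped like the training loader's, which is exactly what does not seem to hold):

```python
import torch

# Sketch only: accumulate the training losses over a validation loader. The
# model stays in train() mode so that it returns a loss dict, while gradients
# are disabled to keep this a pure evaluation pass.
@torch.no_grad()
def validation_loss(model, val_loader, device):
    model.train()
    total, count = 0.0, 0
    for images, targets, *extras in val_loader:   # assumed batch structure
        images = images.to(device)
        targets = [t.to(device) for t in targets]
        loss_dict = model(images, targets)        # the call that fails for me
        total += sum(loss for loss in loss_dict.values()).item()
        count += 1
    return total / max(count, 1)
```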
Starting from this line in GLIP's trainer.py, I've made a coarse attempt to see what is produced. All the evaluation batches come through without positive_map, and the model does not seem to be in a state to accept the call without it. That is, the line loss_dict = model(images, targets) is the one that gets called, and it fails. I've dug around the data loaders and pipeline extensively, but haven't quite figured out how to connect the dots appropriately.
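The coarse probe I mean is along these lines (an illustrative sketch only, not my exact code; it assumes the evaluation loader yields tuples shaped like (images, targets, *extras)):

```python
# Illustrative sketch: peek at a few evaluation batches and report what each
# one actually contains beyond (images, targets).
for i, (images, targets, *extras) in enumerate(val_data_loader):
    summary = [type(x).__name__ if x is not None else "None" for x in extras]
    print(f"batch {i}: {len(extras)} extra items: {summary}")
    if i >= 4:
        break
```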
Thanks for any input anyone can offer.