microsoft / GLIP

Grounded Language-Image Pre-training

Using validation set loss for evaluation #36

Open weinman opened 2 years ago

weinman commented 2 years ago

I'd like to track the validation set loss for finetuning evaluation on a custom dataset (i.e., as shown in the original maskrcnn trainer here), rather than the COCO-style AP metrics.

Unfortunately, the evaluation data loader seems to be set up differently from the training data loader. In particular, the positive_map value used by the trainer does not seem to be created, and a branch of the code that attempts to cope with this absence fails.

Is there a good or easy way to accomplish this goal?

Starting from this line in GLIP's trainer.py, I've made a coarse attempt to see what is produced:

# Probe the validation loss: keep the model in train mode so the loss heads run,
# but disable gradient computation and bookkeeping.
model.train()
with torch.no_grad():
  for i, batch in enumerate(val_data_loader):
    images, targets, image_ids, positive_map, *_ = batch
    images = images.to(device)
    targets = [t.to(device) for t in targets]
    if positive_map is None:
      loss_dict = model(images, targets)
    else:
      captions = [t.get_field("caption") for t in targets if "caption" in t.fields()]
      if len(captions) > 0:
        loss_dict = model(images, targets, captions, positive_map)
      else:
        loss_dict = model(images, targets)
    # Aggregate and reduce regardless of which branch produced the losses.
    losses = sum(loss for loss in loss_dict.values())
    loss_dict_reduced = reduce_loss_dict(loss_dict)

All the evaluation batches come through without positive_map, and the model does not seem to be set up to accept the call without it. That is, loss_dict = model(images, targets) is the line that gets called, and it fails:

Traceback (most recent call last):
  File "./GLIP/tools/finetune.py", line 480, in <module>
    main()
  File "./GLIP/tools/finetune.py", line 455, in main
    model = train(
  File "./GLIP/tools/finetune.py", line 169, in train
    do_train(
  File "./GLIP/maskrcnn_benchmark/engine/trainer.py", line 306, in do_train
    loss_dict = model(images, targets)
  File "/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "./GLIP/maskrcnn_benchmark/modeling/detector/generalized_vl_rcnn.py", line 284, in forward
    proposals, proposal_losses, fused_visual_features = self.rpn(images, visual_features, targets, language_dict_features, positive_map,
  File "/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "./GLIP/maskrcnn_benchmark/modeling/rpn/vldyhead.py", line 905, in forward
    embedding = language_dict_features['embedded']
KeyError: 'embedded'

I've dug around the data loaders and pipeline extensively, but haven't quite figured out how to connect the dots appropriately.

Thanks for any input anyone can offer.

weinman commented 2 years ago

Brief update: I serendipitously recalled that the dataset-name suffix parameter was necessary to make things work; without it, the error shown above is generated. Hence, I have added

DATASETS:
  TEST_DATASETNAME_SUFFIX:  _grounding

to the config and disabled the use of the default inference/evaluation method (since it seems to be incompatible with the resulting dataset) via

SOLVER:
  TEST_WITH_INFERENCE: False

The first change generates the requisite(?) positive_map value. The second change doesn't entirely resolve the issue, though: the else branch here also seems to involve operations incompatible with this evaluation data pipeline, so that part needs to be excluded as well.

When it is, the loss_dict is properly returned.
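
For anyone who wants a concrete reference, below is a minimal sketch of the resulting validation-loss pass under the two config changes above. It assumes the same variables used in trainer.py (model, val_data_loader, device) and the reduce_loss_dict helper already used in the snippet earlier; the remaining names are mine, and the aggregation shown is just one way to turn the per-batch loss_dict into a single number to track.

import torch

# Sketch only: keep the model in train mode so the loss branches run, but disable gradients.
model.train()
val_losses = []
with torch.no_grad():
    for batch in val_data_loader:
        images, targets, image_ids, positive_map, *_ = batch
        images = images.to(device)
        targets = [t.to(device) for t in targets]
        captions = [t.get_field("caption") for t in targets if "caption" in t.fields()]
        # With TEST_DATASETNAME_SUFFIX: _grounding the loader supplies captions and
        # positive_map, so the grounded forward call can be used just as in training.
        loss_dict = model(images, targets, captions, positive_map)
        loss_dict_reduced = reduce_loss_dict(loss_dict)  # average across processes
        val_losses.append(sum(loss for loss in loss_dict_reduced.values()).item())

val_loss = sum(val_losses) / max(len(val_losses), 1)
print(f"validation loss: {val_loss:.4f}")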

My only remaining confusion/concern is why the captions in the list are not separated with the standard separator token (e.g., ".") but instead appear merely as a sequence of words (i.e., the class names).

Hopefully this helps anyone else who also wants to try.

weinman commented 2 years ago

Following up: the function that builds the data loader only applies the caption separator specified in the config file when in training mode:

    if is_train:
        extra_args["separation_tokens"] = cfg.DATASETS.SEPARATION_TOKENS

(from maskrcnn_benchmark/data/build.py here.)

Why use this configuration option only in training mode? The lack of separators seems to negatively affect the evaluation during training.
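
To make the effect concrete (illustrative class names only; the actual separator comes from cfg.DATASETS.SEPARATION_TOKENS):

# Illustration only: how the caption differs with and without the separator.
class_names = ["cat", "dog", "traffic light"]
print(" ".join(class_names))         # cat dog traffic light     <- no separators (roughly what the eval loader produced)
print(". ".join(class_names) + ".")  # cat. dog. traffic light.  <- separator-joined, as during training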

I have adjusted this test to include:

    if is_train or cfg.DATASETS.TEST_DATASETNAME_SUFFIX == "_grounding":
        extra_args["separation_tokens"] = cfg.DATASETS.SEPARATION_TOKENS

Is there a more appropriate way to do this? (I haven't been able to wrap my head around the data loaders.)

liunian-harold-li commented 2 years ago

Hi Jerod, sorry for the late reply. You are correct that the separation tokens should be set regardless of training/testing.

We haven't had this issue because we actually did not use the dataset's functions to do inference. In inference.py, we overrode the dataset class's functions (see the create_queries_and_maps_from_dataset function) and re-created the queries (this was to accommodate the different prompt/inference strategies for different evaluation datasets).
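
Roughly speaking, re-creating the queries amounts to joining the dataset's category names with a separator and recording which span of the caption each category occupies; the positive map is then derived from those spans via the tokenizer. The sketch below is only an illustration of that idea, not the actual create_queries_and_maps_from_dataset code:

# Illustrative sketch only; names and structure do not match GLIP's implementation.
def build_query(class_names, separator=". "):
    caption = ""
    spans = {}  # class index -> (start, end) character offsets within the caption
    for idx, name in enumerate(class_names):
        start = len(caption)
        caption += name
        spans[idx] = (start, len(caption))
        caption += separator
    return caption.strip(), spans

caption, spans = build_query(["cat", "dog", "person"])
# caption == "cat. dog. person."; each span records where a class name sits in the caption,
# which is the information a positive_map is built from after tokenization.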

CDchenlin commented 1 year ago

Hi @weinman, I encountered the same problem while finetuning the model on my own dataset. Here is the error:

Traceback (most recent call last):
  File "~/Code/GLIP-main/tools/train_net.py", line 262, in <module>
    main()
  File "~/GLIP-main/tools/train_net.py", line 254, in main
    model = train(cfg=cfg,
  File "~/Code/GLIP-main/tools/train_net.py", line 129, in train
    do_train(
  File "~/Code/GLIP-main/maskrcnn_benchmark/engine/trainer.py", line 123, in do_train
    loss_dict = model(images, targets)
  File "~/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/Code/GLIP-main/maskrcnn_benchmark/modeling/detector/generalized_vl_rcnn.py", line 288, in forward
    proposals, proposal_losses, fused_visual_features = self.rpn(images, visual_features, targets, language_dict_features, positive_map,
  File "~/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/Code/GLIP-main/maskrcnn_benchmark/modeling/rpn/vldyhead.py", line 906, in forward
    embedding = language_dict_features['embedded']
KeyError: 'embedded'

From reading your issue, it seems that you have solved the problem. I have tried adding:

DATASETS:
  TEST_DATASETNAME_SUFFIX:  _grounding

and

SOLVER:
  TEST_WITH_INFERENCE: False

in my .yaml file. However, the KeyError remains. Do you have any suggestions? Thank you so much! Chenlin

weinman commented 1 year ago

@CDchenlin: Can you verify that you are indeed trying to apply the loss function (minimized for training) to the validation set? That was the context for my original question, error, and ultimate resolution.

If you are getting this error when just trying to train, it seems likely that your training dataset isn't configured correctly; I would suggest closely examining the many example configs provided (e.g., COCO, LVIS, OdinW).

ppriyank commented 1 month ago

So what's the final solution for this?