DAMO-NLP-SG / VCD

[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Apache License 2.0

Unable to reproduce results from greedy search #8

Closed qychen2001 closed 3 months ago

qychen2001 commented 5 months ago

Hi, very nice work and a clean implementation. I ran into a problem when reproducing the results from the paper, specifically VCD with greedy search as the decoding strategy. In the call to generate(), I made the following changes:

output_ids = model.generate(
    input_ids,
    images=image_tensor.unsqueeze(0).half().cuda(),
    images_cd=(image_tensor_cd.unsqueeze(0).half().cuda() if image_tensor_cd is not None else None),
    cd_alpha = args.cd_alpha,
    cd_beta = args.cd_beta,
    do_sample=True,
    temperature=args.temperature,
    top_p=None,
    top_k=None,
    max_new_tokens=1024,
    use_cache=True)

I found that setting do_sample to False disables VCD, so I kept do_sample=True and set both top_p and top_k to None. However, the results I get with this setting are worse than plain greedy search without VCD. I would like to know how the authors configured decoding to obtain the greedy search results reported in the paper. Thanks for your attention and answer.

frankRenlf commented 4 months ago

That is because evolve_vcd_sampling() only replaces transformers.generation.utils.GenerationMixin.sample, and generate() only goes through that method when do_sample=True.
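
A toy illustration of why the patch has no effect with do_sample=False. ToyGenerationMixin, patched_sample, and evolve_toy_sampling are hypothetical stand-ins written for this sketch; they are not the transformers or VCD code, and the real generate() dispatch is more involved.

```python
class ToyGenerationMixin:
    """Toy stand-in for a generate() that dispatches on do_sample."""

    def generate(self, do_sample=False):
        # Only the sampling branch ever calls self.sample().
        if do_sample:
            return self.sample()
        return self.greedy_search()

    def sample(self):
        return "stock sample"

    def greedy_search(self):
        return "stock greedy_search"


def patched_sample(self):
    # Stands in for a sample() that adds the contrastive step.
    return "patched sample"


def evolve_toy_sampling():
    # Replace only the sample method on the class, leaving greedy_search alone.
    ToyGenerationMixin.sample = patched_sample


evolve_toy_sampling()
model = ToyGenerationMixin()
print(model.generate(do_sample=True))
print(model.generate(do_sample=False))
```

The second call never reaches the patched method, so any logic that lives only in the replaced sample is skipped.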

qychen2001 commented 4 months ago

Thanks for the reply. I had noticed this, which is why I set do_sample to True and also set top_p and top_k to None, but this does not reproduce the greedy results in the paper. I would like to know how to use greedy search under the VCD setting. Thank you very much for your attention.

frankRenlf commented 4 months ago

It can be found in the generate function of transformers.generation.utils.GenerationMixin. VCD works with do_sample=True because the config then makes the program run:

        return self.sample(
            input_ids,
            logits_processor=prepared_logits_processor,
            logits_warper=logits_warper,
            stopping_criteria=prepared_stopping_criteria,
            pad_token_id=generation_config.pad_token_id,
            eos_token_id=generation_config.eos_token_id,
            output_scores=generation_config.output_scores,
            return_dict_in_generate=generation_config.return_dict_in_generate,
            synced_gpus=synced_gpus,
            streamer=streamer,
            **model_kwargs,
        )

If you want to use greedy search, you need to modify the config so that the program takes this branch instead:

        if generation_mode == GenerationMode.GREEDY_SEARCH:
            # 11. run greedy search
            return self.greedy_search(
                input_ids,
                logits_processor=prepared_logits_processor,
                stopping_criteria=prepared_stopping_criteria,
                pad_token_id=generation_config.pad_token_id,
                eos_token_id=generation_config.eos_token_id,
                output_scores=generation_config.output_scores,
                return_dict_in_generate=generation_config.return_dict_in_generate,
                synced_gpus=synced_gpus,
                streamer=streamer,
                **model_kwargs,
            )

You also need to modify the greedy_search function the way VCD does for sample.

qychen2001 commented 4 months ago

Thank you very much for your reply. This may be a viable approach. I will give it a try.
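
The suggestion above, applying the VCD contrast inside greedy_search rather than sample, comes down to one change per decoding step: take the argmax of the contrasted logits instead of sampling from them. A minimal framework-free sketch of that single step follows. It assumes the formulation from the VCD paper, (1 + cd_alpha) * logits - cd_alpha * logits_cd restricted to tokens whose probability under the original image is at least cd_beta times the maximum. The function name vcd_greedy_step and the default values are hypothetical, and this is not the authors' code or a confirmed way to reproduce the paper's numbers.

```python
import math


def vcd_greedy_step(logits, logits_cd, cd_alpha=1.0, cd_beta=0.1):
    """Return the next token id for one greedy step with a VCD-style contrast.

    logits:    next-token logits conditioned on the original image
    logits_cd: next-token logits conditioned on the distorted image
    cd_beta must be in (0, 1]; the defaults here are placeholders.
    """
    # Plausibility constraint: keep tokens with p >= cd_beta * max p under the
    # original image. In logit space this is logit >= log(cd_beta) + max logit,
    # because softmax subtracts the same constant from every logit.
    cutoff = math.log(cd_beta) + max(logits)

    best_id, best_score = None, -math.inf
    for token_id, (l, l_cd) in enumerate(zip(logits, logits_cd)):
        if l < cutoff:
            continue  # implausible under the original image, never selected
        score = (1 + cd_alpha) * l - cd_alpha * l_cd
        if score > best_score:
            best_id, best_score = token_id, score
    return best_id  # greedy: argmax of the contrasted logits, no sampling


logits = [2.0, 1.9, -5.0]
logits_cd = [2.0, 0.5, -20.0]
print(vcd_greedy_step(logits, logits_cd))
print(vcd_greedy_step(logits, logits_cd, cd_alpha=0.0))
```

With cd_alpha=0.0 the step reduces to plain greedy search, which gives a quick sanity check when wiring the same logic into a patched greedy_search.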