joeyz0z / ConZIC

Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"
MIT License

Missing Parameter 'logger' in control_generate_caption() Function #16

Closed morninghut closed 9 months ago

morninghut commented 11 months ago

Thanks for your remarkable work. I tried to run the demo in Colab, but I hit an error in a function call. Here's the code cell:

# @title Run 
# img_path = upload_img_path if upload_your_image else example_img_path
img_path= '/content/ConZIC/examples/cat.png'
if args.run_type == 'caption':
    FinalCaption, BestCaption = run_caption(args, img_path, lm_model, lm_tokenizer, clip, token_mask, logger)
elif args.run_type == 'controllable':
    FinalCaption, BestCaption = run_control(run_type, args, img_path, lm_model, lm_tokenizer, clip, token_mask, logger)
else:
    raise Exception('run_type must be caption or controllable!')

The traceback is:

Processing: /content/ConZIC/examples/cat.png
Processing: /content/ConZIC/examples/cat.png
INFO:ConZIC:Processing: /content/ConZIC/examples/cat.png
DEBUG:PIL.PngImagePlugin:STREAM b'IHDR' 16 13
DEBUG:PIL.PngImagePlugin:STREAM b'gAMA' 41 4
DEBUG:PIL.PngImagePlugin:STREAM b'cHRM' 57 32
DEBUG:PIL.PngImagePlugin:STREAM b'bKGD' 101 6
DEBUG:PIL.PngImagePlugin:b'bKGD' 101 6 (unknown)
DEBUG:PIL.PngImagePlugin:STREAM b'IDAT' 119 32768
Sample 0: 
Sample 0: 
INFO:ConZIC:Sample 0: 

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-38-230aeb397fab> in <cell line: 4>()
      5     FinalCaption, BestCaption = run_caption(args, img_path, lm_model, lm_tokenizer, clip, token_mask, logger)
      6 elif args.run_type == 'controllable':
----> 7     FinalCaption, BestCaption = run_control(run_type, args, img_path, lm_model, lm_tokenizer, clip, token_mask, logger)
      8 else:
      9     raise Exception('run_type must be caption or controllable!')

<ipython-input-30-6e35ae827346> in run_control(run_type, args, image_path, lm_model, lm_tokenizer, clip, token_mask, logger)
     25     for sample_id in range(args.samples_num):
     26         logger.info(f"Sample {sample_id}: ")
---> 27         gen_texts, clip_scores = control_generate_caption(lm_model, clip, lm_tokenizer, image_instance, token_mask, logger,
     28                                   prompt=args.prompt, batch_size=args.batch_size, max_len=args.sentence_len,
     29                                   top_k=args.candidate_k, temperature=args.lm_temperature,

TypeError: control_generate_caption() missing 1 required positional argument: 'logger'

The error says that control_generate_caption() was called without its logger argument. I looked up the definition and found the following:

def control_generate_caption(img_name, model, clip, tokenizer,image_instance,token_mask,logger,#<-HERE
                     prompt="", batch_size=10, max_len=25,
                    top_k=100, temperature=1.0, max_iter=500,alpha=0.7,beta=1,gamma=5,
                    ctl_type="sentiment", style_type="positive",pos_type=None,generate_order="sequential"):
    # controllable funcitions to call

It looks like a typo in the parameter list, but I'm not sure. Any suggestions would be appreciated.

morninghut commented 9 months ago

I think I have found the problem. The first parameter of control_generate_caption is img_name:

def control_generate_caption(img_name, model, clip, tokenizer,image_instance,token_mask,logger,
                     prompt="", batch_size=10, max_len=25,
                    top_k=100, temperature=1.0, max_iter=500,alpha=0.7,beta=1,gamma=5,
                    ctl_type="sentiment", style_type="positive",pos_type=None,generate_order="sequential"):

However, the official Colab .ipynb file calls the function like this:

# in def run_control:
def run_control(run_type, args, image_path, lm_model, lm_tokenizer, clip, token_mask, logger):
    xxx...
        gen_texts, clip_scores = control_generate_caption(<NO IMG_NAME HERE!!!>lm_model, clip, lm_tokenizer, image_instance, token_mask, logger,
                                  prompt=args.prompt, batch_size=args.batch_size, max_len=args.sentence_len,
                                  top_k=args.candidate_k, temperature=args.lm_temperature,
                                  max_iter=args.num_iterations, alpha=args.alpha,
                                  beta=args.beta, gamma=args.gamma,
                                  ctl_type = args.control_type, style_type=args.sentiment_type,pos_type=args.pos_type, generate_order=args.order)

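The call omits the img_name argument, so every positional argument shifts one slot to the left: lm_model is bound to img_name, clip to model, and so on, until logger is bound to token_mask. Nothing is left for the logger parameter itself, which is why the TypeError names logger rather than img_name.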
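The argument shift described above can be reproduced in isolation. This is a minimal sketch, assuming a stub that only copies the leading parameter names of control_generate_caption; the string placeholders stand in for the real model, tokenizer, image, and logger objects and are purely illustrative.

```python
import inspect


def control_generate_caption(img_name, model, clip, tokenizer, image_instance,
                             token_mask, logger, prompt=""):
    """Stub with the same leading parameters as the real function (assumption)."""
    return img_name


# The six positional arguments the notebook passes; img_name is absent.
notebook_args = ("lm_model", "clip", "lm_tokenizer",
                 "image_instance", "token_mask", "logger")

# bind_partial shows which parameter each positional argument lands in.
binding = inspect.signature(control_generate_caption).bind_partial(*notebook_args)
shifted = dict(binding.arguments)
for parameter, value in shifted.items():
    print(f"{parameter:15} <- {value}")

# The real call fails because the last required parameter is left unfilled.
try:
    control_generate_caption(*notebook_args, prompt="")
    error_message = None
except TypeError as err:
    error_message = str(err)
print(error_message)
```

The printed mapping shows img_name receiving "lm_model" and token_mask receiving "logger", and the error message complains about logger, matching the traceback in the first comment.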

To fix the problem, replace the definition of run_control with the following, which passes image_path as the first argument to control_generate_caption:

def run_control(run_type, args, image_path, lm_model, lm_tokenizer, clip, token_mask, logger):
    FinalCaptionList = []
    BestCaptionList = []
    logger.info(f"Processing: {image_path}")
    image_instance = Image.open(image_path).convert("RGB")
    for sample_id in range(args.samples_num):
        logger.info(f"Sample {sample_id}: ")
        gen_texts, clip_scores = control_generate_caption(image_path,lm_model, clip, lm_tokenizer, image_instance, token_mask, logger,
                                  prompt=args.prompt, batch_size=args.batch_size, max_len=args.sentence_len,
                                  top_k=args.candidate_k, temperature=args.lm_temperature,
                                  max_iter=args.num_iterations, alpha=args.alpha,
                                  beta=args.beta, gamma=args.gamma,
                                  ctl_type = args.control_type, style_type=args.sentiment_type,pos_type=args.pos_type, generate_order=args.order)
        FinalCaptionStr = "Sample {}: ".format(sample_id + 1) + gen_texts[-2]
        BestCaptionStr = "Sample {}: ".format(sample_id + 1) + gen_texts[-1]
        FinalCaptionList.append(FinalCaptionStr)
        BestCaptionList.append(BestCaptionStr)
    return FinalCaptionList, BestCaptionList

XD
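As a possible safeguard against this kind of mismatch, the leading arguments could be passed by keyword, so that a missing one is reported under its own name. The sketch below again uses a stub with the parameter names from the definition quoted earlier; the placeholder strings are hypothetical and only illustrate the binding.

```python
def control_generate_caption(img_name, model, clip, tokenizer, image_instance,
                             token_mask, logger, prompt=""):
    """Stub with the same leading parameters as the real function (assumption)."""
    return {"img_name": img_name, "model": model, "logger": logger}


# Placeholder values standing in for the real objects (illustrative only).
call_kwargs = dict(
    model="lm_model",
    clip="clip",
    tokenizer="lm_tokenizer",
    image_instance="image_instance",
    token_mask="token_mask",
    logger="logger",
    prompt="",
)

# With keywords, forgetting img_name is reported as a missing img_name.
try:
    control_generate_caption(**call_kwargs)
    keyword_error = None
except TypeError as err:
    keyword_error = str(err)
print(keyword_error)

# Supplying img_name by keyword binds every argument to the intended parameter.
result = control_generate_caption(
    img_name="/content/ConZIC/examples/cat.png", **call_kwargs
)
print(result)
```

With keyword arguments the error points directly at img_name, which would have made the original problem easier to spot.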