ParadoxZW / LLaVA-UHD-Better

A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
Apache License 2.0

question about hw_patch_nums #8

Open WeihuangLin opened 2 months ago

WeihuangLin commented 2 months ago

Hello. I'm currently using a single image to verify the high-resolution processing part of the code.

When I run the code: image_tensor, hw_patch_nums = process_image(image, image_mean=image_mean, image_std=image_std)

According to this code, hw_patch_nums is supposed to be a list whose entries hold three values: slice_h_num, slice_w_num, and j (position).

But when I pass image_tensor and hw_patch_nums as arguments to the classes in adapt_clip.py, I run into a bug. Looking at the code there, I found that at this point hw_patch_nums only has two values per entry and is missing the third one (the j position).
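To make the mismatch concrete, here is a minimal sketch of the check I mean. It assumes hw_patch_nums is a sequence of tuples. The values are made up for illustration and are not real output of process_image, and the helper entry_lengths is a hypothetical name of my own, not something from the repo.

```python
from typing import Sequence, Set


def entry_lengths(hw_patch_nums: Sequence[Sequence[int]]) -> Set[int]:
    """Return the set of distinct entry lengths found in hw_patch_nums."""
    return {len(entry) for entry in hw_patch_nums}


# Made-up values for illustration only, not real output of process_image.
expected = [(2, 3, 0), (2, 3, 1)]  # (slice_h_num, slice_w_num, j)
observed = [(2, 3), (2, 3)]        # (slice_h_num, slice_w_num), j is missing

print(entry_lengths(expected))  # {3}
print(entry_lengths(observed))  # {2}
```

Running the same helper on the value returned by process_image and on the value that arrives in adapt_clip.py should show at which step the entries change from three values to two.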

May I ask whether hw_patch_nums is processed in some other way between these two steps?

Looking forward to your reply!