Something is wrong with the shape of image_feature.
The number of image features should be a multiple of 576. Please debug along this line.
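For reference, 576 is what a 24 x 24 grid of patch tokens per image tile gives, for example a 336-pixel vision tower with 14-pixel patches. That tower geometry is an assumption here, not something confirmed in this thread, and the helper names below are hypothetical. A minimal sketch of the divisibility check:

```python
def tokens_per_tile(image_size: int, patch_size: int) -> int:
    """Number of patch tokens a ViT-style tower emits for one square tile."""
    side = image_size // patch_size
    return side * side


def is_whole_number_of_tiles(num_features: int, per_tile: int) -> bool:
    """True when the feature count splits evenly into tiles."""
    return num_features > 0 and num_features % per_tile == 0


# Assumed geometry: 336 px input, 14 px patches (not confirmed by the thread).
PER_TILE = tokens_per_tile(336, 14)

print(PER_TILE)                                          # 576
print(is_whole_number_of_tiles(5 * PER_TILE, PER_TILE))  # True
print(is_whole_number_of_tiles(3584, PER_TILE))          # False (3584 % 576 == 128)
```

If the count fails this check, the value being inspected may not be a token count at all (it could be a channel or hidden dimension), so it is worth printing the full tensor shape rather than a single size.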
Thanks for your reply. I have debugged the image features. In llava_arch.py, the code below changes the tensor size of image_feature from 3584 to 7168. I can't understand why the number of image features should be a multiple of 576.
if "anyres_max" in image_aspect_ratio:
    matched_anyres_max_num_patches = re.match(r"anyres_max_(\d+)", image_aspect_ratio)
    if matched_anyres_max_num_patches:
        max_num_patches = int(matched_anyres_max_num_patches.group(1))
if image_aspect_ratio == "anyres" or "anyres_max" in image_aspect_ratio:
    if hasattr(self.get_vision_tower(), "image_size"):
        vision_tower_image_size = self.get_vision_tower().image_size
    else:
        raise ValueError("vision_tower_image_size is not found in the vision tower.")
    try:
        num_patch_width, num_patch_height = get_anyres_image_grid_shape(image_sizes[image_idx], self.config.image_grid_pinpoints, vision_tower_image_size)
    except Exception as e:
        rank0_print(f"Error: {e}")
        num_patch_width, num_patch_height = 2, 2
    image_feature = image_feature.view(num_patch_height, num_patch_width, height, width, -1)
else:
    image_feature = image_feature.view(2, 2, height, width, -1)
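One thing worth checking in the snippet above: the trailing -1 in view(...) tells PyTorch to infer the last dimension from the total element count. If image_feature holds more tiles than num_patch_height * num_patch_width accounts for (for example, after the except branch falls back to 2, 2), the surplus is folded into the last dimension without raising an error. The sketch below reproduces that arithmetic in plain Python. The helper name is hypothetical, and the tile counts, the 24 x 24 grid, and the hidden size of 3584 are assumed values chosen for illustration, not taken from the actual run.

```python
def inferred_last_dim(input_shape, leading_dims):
    """Mimic how `tensor.view(*leading_dims, -1)` sizes the final dimension."""
    total = 1
    for d in input_shape:
        total *= d
    known = 1
    for d in leading_dims:
        known *= d
    if known == 0 or total % known != 0:
        target = tuple(leading_dims) + (-1,)
        raise ValueError(f"shape {target} is invalid for input of size {total}")
    return total // known


# Assumed values for illustration only.
height = width = 24   # 24 * 24 = 576 tokens per tile
hidden = 3584         # assumed feature (channel) size

# Grid matches the data: 4 tiles viewed as a 2 x 2 grid.
matched = inferred_last_dim((4, height * width, hidden), (2, 2, height, width))
print(matched)     # 3584

# Grid does not match: 8 tiles viewed as a 2 x 2 grid.
mismatched = inferred_last_dim((8, height * width, hidden), (2, 2, height, width))
print(mismatched)  # 7168
```

Under these assumptions, a doubled last dimension points at a mismatch between the number of tiles actually encoded and the grid shape passed to view, so printing image_feature.shape together with num_patch_height and num_patch_width just before the view call should show whether that is what happens here.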
Could you share the model's config.json? I also can't tell where the number 3584 comes from.
Very strange. I had this problem before, but it later disappeared on its own, and now I can't reproduce it.
When I run ./eval.sh, it raises the error below: