SeanScripts / ComfyUI-PixtralLlamaVision

For loading and running Pixtral models
Apache License 2.0

Molmo - ModuleNotFoundError: No module named 'tensorflow' #6

Open Iory1998 opened 3 hours ago

Iory1998 commented 3 hours ago

Hello, it's me again.

When I try to use the Molmo-D model, I get the following error: ModuleNotFoundError: No module named 'tensorflow'. Any idea why?

Thank you :)

SeanScripts commented 3 hours ago

Yeah, I had this issue too. It's a bug in image_preprocessing_molmo.py:

A section of code, which says "FIXME remove", needs to be removed lol

Change this:

    # if resize_method == "tensorflow":
    #     FIXME remove
    import tensorflow as tf
    image = tf.image.convert_image_dtype(tf.constant(image), dtype=tf.float32)
    image = tf.image.resize(
        image,
        [scaled_height, scaled_width],
        method=tf.image.ResizeMethod.BILINEAR,
        antialias=True,
    )
    image = tf.clip_by_value(image, 0.0, 1.0)
    image = image.numpy()
    # else:
    #     image = torch.permute(torch.from_numpy(image), [2, 0, 1])
    #     image = convert_image_dtype(image)  # resize in flaot32
    #     image = torchvision.transforms.Resize(
    #         [scaled_height, scaled_width], InterpolationMode.BILINEAR, antialias=True
    #     )(image)
    #     image = torch.clip(image, 0.0, 1.0)
    #     image = torch.permute(image, [1, 2, 0]).numpy()

To this (the TensorFlow lines commented out and the torchvision branch uncommented):

    # if resize_method == "tensorflow":
    #     FIXME remove
    #import tensorflow as tf
    #image = tf.image.convert_image_dtype(tf.constant(image), dtype=tf.float32)
    #image = tf.image.resize(
    #    image,
    #    [scaled_height, scaled_width],
    #    method=tf.image.ResizeMethod.BILINEAR,
    #    antialias=True,
    #)
    #image = tf.clip_by_value(image, 0.0, 1.0)
    #image = image.numpy()
    # else:
    image = torch.permute(torch.from_numpy(image), [2, 0, 1])
    image = convert_image_dtype(image)  # resize in float32
    image = torchvision.transforms.Resize(
        [scaled_height, scaled_width], InterpolationMode.BILINEAR, antialias=True
    )(image)
    image = torch.clip(image, 0.0, 1.0)
    image = torch.permute(image, [1, 2, 0]).numpy()
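
To confirm the edit took effect, something like the following can check that no live `import tensorflow` statement remains in the file. This is only a minimal sketch using the standard library: `live_imports` is a hypothetical helper name, and the two sample strings are stand-ins for the contents of image_preprocessing_molmo.py before and after the change, not the real file.

```python
import ast


def live_imports(source: str, module: str = "tensorflow") -> list[int]:
    """Return the line numbers of uncommented imports of `module` in `source`."""
    hits = []
    # ast.parse drops comments, so commented-out imports are never reported.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        # Match the module itself or any of its submodules, but not lookalikes.
        if any(n == module or n.startswith(module + ".") for n in names):
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical stand-ins for the file before and after the edit.
before = "def f(image):\n    import tensorflow as tf\n    return tf.constant(image)\n"
after = "def f(image):\n    # import tensorflow as tf\n    return image\n"

print(live_imports(before))
print(live_imports(after))
```

In practice you would pass in the text of your local copy of the preprocessing file instead of the sample strings. An empty list means the TensorFlow import is no longer reachable.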

Iory1998 commented 1 hour ago

> A section of code, which says "FIXME remove", needs to be removed lol

Haha! That was hilarious. OK, I will change that bit of the code. Could you please add an example of captioning with Molmo? And what about its chat template?

SeanScripts commented 1 hour ago

It seems to add the chat template stuff in the preprocessor already, so you can do it with just a prompt like "Describe this image." or "Caption this image."
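
For reference, a prompt-only caption call outside ComfyUI might look roughly like the sketch below. It follows the usage pattern from the Molmo model card as best I recall it, so treat the details as assumptions to verify: the model id `allenai/Molmo-7B-D-0924` is my guess at what "Molmo-D" refers to, `processor.process` and `model.generate_from_batch` come from the model's remote code rather than from transformers itself, and `caption_image` is a hypothetical wrapper name. Note that the prompt is passed as plain text with no chat template applied by hand.

```python
def caption_image(image_path, prompt="Describe this image.",
                  model_id="allenai/Molmo-7B-D-0924", max_new_tokens=200):
    """Caption one image with a plain-text prompt (sketch, not the node's code)."""
    # Heavy dependencies are imported lazily so that merely defining this
    # function does not require them to be installed.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

    # Assumption: same loading arguments for processor and model, as on the model card.
    load_kwargs = dict(trust_remote_code=True, torch_dtype="auto", device_map="auto")
    processor = AutoProcessor.from_pretrained(model_id, **load_kwargs)
    model = AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs)

    # The processor receives only the raw prompt and the image.
    inputs = processor.process(images=[Image.open(image_path)], text=prompt)
    # Move tensors to the model's device and add a batch dimension of 1.
    inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

    output = model.generate_from_batch(
        inputs,
        GenerationConfig(max_new_tokens=max_new_tokens, stop_strings="<|endoftext|>"),
        tokenizer=processor.tokenizer,
    )
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output[0, inputs["input_ids"].size(1):]
    return processor.tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `caption_image("photo.jpg")` (the file name is a placeholder) would download the model on first use, so expect a long first run. Swapping the prompt for "Caption this image." should work the same way.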