haesleinhuepf / git-bob

git-bob uses AI to solve GitHub issues. It runs inside the GitHub CI, so there is no need to install anything on your computer.
BSD 3-Clause "New" or "Revised" License

Add new "paint" action #418

Closed: haesleinhuepf closed this issue 4 days ago

haesleinhuepf commented 5 days ago

We should add a new action in _ai_github_utilities.py in the function solve_github_issue. The action is called "paint" and is only invoked if the user explicitly asked to paint or draw a picture or comic. The action will call a new function "paint_image", which takes the prompt as input, stores an image to disk, and adds the image file to the GitHub repository.

To implement the underlying functionality of paint_image, we should copy the create function from https://github.com/haesleinhuepf/darth-d/blob/main/darth_d/_create.py and the images_from_url_responses function from https://github.com/haesleinhuepf/darth-d/blob/main/darth_d/_utilities.py . The paint_image function should go into _ai_github_utilities.py. The images_from_url_responses function should go into _utilities.py .

git-bob think about how to implement this in the python code in this repository.

github-actions[bot] commented 5 days ago

This message was generated by git-bob (version: 0.11.2dev, model: gpt-4o-2024-08-06, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

To implement the "paint" action, you need to add the following changes to the existing files:

  1. Add images_from_url_responses to _utilities.py:
def images_from_url_responses(response, input_shape=None):
    """Turns a list of OpenAI's URL responses into numpy images"""
    from skimage.io import imread
    from skimage import transform
    import numpy as np

    images = [imread(item.url) for item in response.data]

    if input_shape is not None:
        images = [transform.resize(image, input_shape, anti_aliasing=True, preserve_range=True).astype(image.dtype) for image in images]

        if len(input_shape) == 2 and len(images[0].shape) == 3:
            images = [image[:,:,0] for image in images]

    if len(images) == 1:
        return images[0]
    else:
        return np.asarray(images)
  2. Implement the paint_image function in _ai_github_utilities.py:

Add this function:

def paint_image(prompt, output_path="image.png"):
    """Generate an image based on a prompt and save it to disk."""
    from ._create import create
    import numpy as np
    import matplotlib.pyplot as plt

    image = create(prompt=prompt, model="dall-e-3")
    if isinstance(image, np.ndarray):
        plt.imsave(output_path, image.astype(np.uint8))
  3. Modify solve_github_issue to include the "paint" action:

Within the solve_github_issue function, detect when the "paint" action should be used and call paint_image.

def solve_github_issue(repository, issue, llm_model, prompt_function, base_branch=None):
    # ... existing code ...
    if "paint" in discussion:  # Check if "paint" action should be invoked
        paint_image(prompt="Your image description here")
        Config.git_utilities.add_file_to_repository(repository, "image.png", "Adding generated image")
    # ... existing code ...

These changes introduce the image generation feature based on a textual prompt and integrate it into the existing system. Make sure to replace "Your image description here" with the actual text based on issue discussions.
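
The check `"paint" in discussion` also matches words such as "painting" in unrelated text, and the prompt passed to paint_image is still a placeholder. A minimal sketch of one way to handle both follows, assuming a simple keyword heuristic is acceptable. `extract_paint_prompt` is a hypothetical helper, not an existing git-bob function, and git-bob's actual action selection may work differently.

```python
import re

# Hypothetical helper, not part of git-bob: a keyword heuristic that stands in
# for whatever action selection solve_github_issue actually performs.
_PAINT_REQUEST = re.compile(r"\b(?:paint|draw)\b\s*(.*)", re.IGNORECASE | re.DOTALL)


def extract_paint_prompt(comment):
    """Return the text following the first whole word 'paint' or 'draw'.

    Returns None if the comment contains neither word, or if nothing
    follows it that could serve as an image prompt.
    """
    match = _PAINT_REQUEST.search(comment)
    if match is None:
        return None
    prompt = match.group(1).strip()
    return prompt or None
```

Inside solve_github_issue, the result could then replace the placeholder, for example `prompt = extract_paint_prompt(discussion)` followed by `if prompt is not None: paint_image(prompt=prompt)`.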

haesleinhuepf commented 5 days ago

Modify the code so that it uses PIL images instead of numpy arrays for the image handling.

git-bob comment on how to implement this.

github-actions[bot] commented 5 days ago

To modify the code to use PIL Image instead of numpy arrays, update the paint_image function as follows:

def paint_image(prompt, output_path="image.png"):
    """Generate an image based on a prompt and save it to disk using PIL."""
    from ._create import create
    from PIL import Image

    image = create(prompt=prompt, model="dall-e-3")
    if isinstance(image, Image.Image):
        image.save(output_path)

Ensure that the create function from the darth_d repository returns a PIL Image object instead of a numpy array.
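
With the isinstance check above, paint_image writes nothing to disk if create still returns a numpy array. A minimal sketch of a conversion step that accepts both types is shown below. `to_pil_image` is a hypothetical helper, not part of git-bob or darth-d, and it assumes that non-uint8 arrays already hold values in the 0..255 range.

```python
import numpy as np
from PIL import Image


def to_pil_image(image):
    """Return a PIL image for either a PIL image or a numpy array.

    Hypothetical helper for illustration only.
    """
    if isinstance(image, Image.Image):
        # Already a PIL image: pass it through unchanged
        return image
    array = np.asarray(image)
    if array.dtype != np.uint8:
        # Assumption: values are meant to be in 0..255; clip and cast
        array = np.clip(array, 0, 255).astype(np.uint8)
    return Image.fromarray(array)
```

In paint_image, the call `to_pil_image(image).save(output_path)` would then work regardless of which type create returns.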

haesleinhuepf commented 5 days ago

What would the create function look like? You can take it from here: https://github.com/haesleinhuepf/darth-d/blob/main/darth_d/_create.py

git-bob comment on how to do this

github-actions[bot] commented 5 days ago

To adjust the create function to return a PIL Image instead of a numpy array, update it as follows:

  1. Modify the images_from_url_responses function in _utilities.py to return a PIL Image:
def images_from_url_responses(response, input_shape=None):
    """Turns a list of OpenAI's URL responses into PIL images"""
    from skimage.io import imread
    from skimage import transform
    from PIL import Image

    images = [imread(item.url) for item in response.data]

    if input_shape is not None:
        images = [transform.resize(image, input_shape, anti_aliasing=True, preserve_range=True).astype(image.dtype) for image in images]

    pil_images = [Image.fromarray(image) for image in images]

    if len(pil_images) == 1:
        return pil_images[0]
    else:
        return pil_images
  2. Update the create function in _create.py to use this modified version:
from stackview import jupyter_displayable_output
from ._utilities import images_from_url_responses

@jupyter_displayable_output(library_name='darth-d', help_url='https://github.com/haesleinhuepf/darth-d')
def create(prompt:str=None, image_width:int=1024, image_height:int=1024, num_images:int=1, model:str="dall-e-3", style:str='vivid', quality:str='standard'):
    from openai import OpenAI

    client = OpenAI()
    size_str = f"{image_width}x{image_height}"

    kwargs = {}
    if model == "dall-e-3":
        kwargs['style'] = style
        kwargs['quality'] = quality

    response = client.images.generate(
        prompt=prompt,
        n=num_images,
        model=model,
        size=size_str,
        **kwargs
    )

    return images_from_url_responses(response)

These changes convert the image data to PIL Image format for more versatile handling of image operations.
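
The argument handling in create can be checked without calling the OpenAI API if it is separated from the request itself. The sketch below assumes such a split is acceptable. `build_generate_kwargs` is a hypothetical helper that mirrors the logic shown above: it only builds a dictionary and sends nothing.

```python
def build_generate_kwargs(prompt, image_width=1024, image_height=1024,
                          num_images=1, model="dall-e-3",
                          style="vivid", quality="standard"):
    """Collect the keyword arguments that create() passes to client.images.generate.

    Hypothetical helper for illustration; it performs no API call.
    """
    kwargs = {
        "prompt": prompt,
        "n": num_images,
        "model": model,
        "size": f"{image_width}x{image_height}",
    }
    # As in create(), style and quality are only passed along for dall-e-3
    if model == "dall-e-3":
        kwargs["style"] = style
        kwargs["quality"] = quality
    return kwargs
```

The request in create would then read `client.images.generate(**build_generate_kwargs(prompt))`, which sends the same arguments as the explicit call above.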

haesleinhuepf commented 5 days ago

Awesome! git-bob implement this, but remove the decorator "jupyter_displayable_output"