fchollet opened this issue 1 year ago
```python
import tensorflow as tf
import openai
from functools import wraps

# Set your OpenAI API key
openai.api_key = "your_openai_api_key"


def debug_forward_pass(model, input_data):
    print_model_info(model)
    print_input_data_info(input_data)
    model.run_eagerly = True
    try:
        output = model(input_data)
    except Exception as e:
        print("Error during forward pass:", e)
        prompt_gpt3_for_debugging_tips(model, input_data, e)
    else:
        print("Forward pass successful.")
        return output


def debug_train_step(model, input_data):
    x, y = input_data
    print_model_info(model)
    print_input_data_info(x)
    model.run_eagerly = True
    try:
        with tf.GradientTape() as tape:
            predictions = model(x)
            loss = model.compiled_loss(y, predictions)
        gradients = tape.gradient(loss, model.trainable_variables)
        model.optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    except Exception as e:
        print("Error during training step:", e)
        prompt_gpt3_for_debugging_tips(model, x, e)
    else:
        print("Training step successful.")


def print_model_info(model):
    print("Model Summary:")
    model.summary()


def print_input_data_info(input_data):
    print(f"Input data shape: {input_data.shape}")
    print(f"Input data dtype: {input_data.dtype}")


def prompt_gpt3_for_debugging_tips(model, input_data, error):
    code = generate_code_from_model(model)
    prompt = (
        f"I encountered the following error while running a TensorFlow model:\n\n"
        f"Error: {error}\n\n"
        f"Here is the code for the model:\n\n{code}\n\n"
        f"Here is the input data shape and dtype: {input_data.shape}, {input_data.dtype}\n\n"
        f"Can you please help me debug this issue and provide any tips?"
    )
    response = openai.Completion.create(
        engine="davinci-codex",
        prompt=prompt,
        max_tokens=200,
        n=1,
        stop=None,
        temperature=0.5,
    )
    message = response.choices[0].text.strip()
    print(f"GPT-3 debugging tips:\n{message}")


def generate_code_from_model(model):
    # This function should convert the model to a string representation of its code.
    # You can either implement this yourself or use an existing library to generate the code.
    pass
```
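One possible way to fill in the `generate_code_from_model` stub, as a minimal sketch: it assumes the model's JSON config is an acceptable stand-in for "the code of the model" (which holds for Sequential/functional models but not for subclassed ones).

```python
import json


def generate_code_from_model(model):
    # Minimal sketch: serialize the architecture to pretty-printed JSON and use
    # that as the "code" shown to the LLM. model.to_json() works for Sequential
    # and functional models; subclassed models and custom layers would need
    # their source gathered separately (e.g. via inspect.getsource).
    return json.dumps(json.loads(model.to_json()), indent=2)
```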
Sorry for the broken formatting, I'm on mobile at the moment. This is a great starting point.
@fchollet I made an overly simplistic starting point in Colab: https://colab.research.google.com/drive/11C9GZWE-NTPgPcl75irMZ19EUblzm5sH?usp=sharing.
Some takeaways:
- I use the `gpt-3.5-turbo` API for everything now, since it is more powerful than GPT-3 but cheaper. That is why I used langchain with a ChatGPT-based chain for my debugger. Its conversational nature facilitates the addition of model/layer details in separate user messages. Also, asking follow-up questions is possible.
- Model context comes from `model.to_json` plus source code from `inspect.getsource` (a nightmare in Jupyter) for the layers that do not come from the Keras module. This is the most important area for enhancement, since the answers may get much better with better prompting. (Currently, asking the LLM to make a unit test for a specific part of the model fails, since `model.to_json` lacks information about output shapes for the previous layers after model building.)
- Wrapping every layer's `__call__` to track actual shapes is not implemented yet. This might make the concerns from the previous point less relevant.

Would love to hear some feedback. I can work on extending this further to a standalone Python library.
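A rough sketch of the context-gathering step described above, combining `model.to_json` with `inspect.getsource` for non-Keras layers; the helper name `gather_model_context` and the exact module check are assumptions, not the code from the Colab notebook:

```python
import inspect
import json


def gather_model_context(model):
    # Architecture as JSON; note it lacks output shapes for previous layers
    # until the model has been built or called on data.
    parts = ["Model config:\n" + json.dumps(json.loads(model.to_json()), indent=2)]
    for layer in model.layers:
        module = type(layer).__module__
        # Built-in layers are already known to the LLM; only custom layers
        # need their source included in the prompt.
        if not module.startswith(("keras", "tensorflow")):
            try:
                parts.append(inspect.getsource(type(layer)))
            except (OSError, TypeError):
                # inspect.getsource often fails for classes defined in a notebook.
                parts.append(f"# Source for {type(layer).__name__} is unavailable")
    return "\n\n".join(parts)
```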
I would love to work on this!
I could create a utility by wrapping layers with a custom function that tracks input/output shapes. Then I'd use try/except blocks for error handling and the GPT-3 API for debugging tips. Here's a simplified code outline:

```python
layer_wrapper(layer)
modify_layers(model)
try:
    model(input_data)  # or model.fit(x, y)
except Exception as e:
    get_gpt3_debugging_tips(e)
```
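A slightly fuller sketch of that outline; `layer_wrapper` and `modify_layers` are the names from the outline above, while everything inside them (including the `_debug_*` attributes) is an assumption about how the tracking could work, and it presumes eager execution:

```python
def layer_wrapper(layer):
    # Wrap this layer's call method so each invocation records the shapes it
    # saw (assumes eager execution, e.g. model.run_eagerly = True).
    original_call = layer.call

    def tracked_call(*args, **kwargs):
        inputs = args[0] if args else kwargs.get("inputs")
        layer._debug_input_shape = getattr(inputs, "shape", None)
        outputs = original_call(*args, **kwargs)
        layer._debug_output_shape = getattr(outputs, "shape", None)
        return outputs

    layer.call = tracked_call
    return layer


def modify_layers(model):
    # Apply the wrapper to every layer so a failing layer can be located and
    # its recorded shapes included in the debugging prompt.
    for layer in model.layers:
        layer_wrapper(layer)
    return model
```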
Could you assign it to me?
Can I contribute? Is this method OK?
I'd like to see a utility for debugging a model's forward pass, as well as one for debugging a train step. The utility would do the following:

- Set `run_eagerly=True` on the model.
- Run the forward pass (or train step) and, if it fails, surface detailed debugging information about the model and its inputs.
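A minimal sketch of that core behavior, assuming a compiled model; `debug_fit` is an illustrative name, not an agreed-on API:

```python
def debug_fit(model, x, y, **fit_kwargs):
    # Force eager execution so errors point at real Python lines, run one short
    # fit, and report the failure context if it breaks.
    model.run_eagerly = True
    try:
        return model.fit(x, y, epochs=1, verbose=0, **fit_kwargs)
    except Exception as e:
        print(f"fit() failed for inputs of shape {getattr(x, 'shape', None)}: {e}")
        raise
```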
It might also wrap the `__call__` method of every layer in the model to keep track of what was passed, the shape of input arguments, etc. It might even give you code you can use to write a reusable unit test for your layer.
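For the unit-test idea, a hedged sketch of what the generated code could look like; `generate_layer_test` and the recorded `input_shape` argument are illustrative and assume a built-in Keras layer:

```python
def generate_layer_test(layer, input_shape, dtype="float32"):
    # Illustrative only: turn a recorded input shape into a minimal pytest-style
    # test. A custom layer would need an import of its own module instead of
    # tf.keras.layers.
    name = type(layer).__name__
    config = layer.get_config()
    return f'''import numpy as np
import tensorflow as tf


def test_{name.lower()}_forward_pass():
    layer = tf.keras.layers.{name}.from_config({config!r})
    x = np.random.random({tuple(input_shape)!r}).astype("{dtype}")
    y = layer(x)
    assert y.shape[0] == x.shape[0]
'''
```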
Who wants to implement this? Could just start from a simple Colab notebook implementation.