konveyor / kai

Konveyor AI - static code analysis driven migration to new targets via Generative AI
Apache License 2.0

Consistent: "No codeblocks detected in LLM response" for several files with #350

Open jwmatthews opened 2 months ago

jwmatthews commented 2 months ago

I am seeing consistent and repeatable issues with several files in Coolstore when I run against Claude 3.5 Sonnet. It looks like the output stops suddenly midway through generating an update.

Config:

[models]
provider = "ChatBedrock"

[models.args]
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"

Error snippet:

WARNING - 2024-09-04 06:50:33,902 - kai.models.file_solution - [    file_solution.py:95   - parse_file_solution_content()] - No codeblocks detected in LLM response
WARNING - 2024-09-04 06:50:33,907 - kai.service.kai_application.kai_application - [  kai_application.py:202  - get_incident_solutions_for_file()] - Request to model failed for batch 1/1 for src/main/java/com/redhat/coolstore/model/ShoppingCart.java with exception, retrying in 10.0s
Error in LLM Response: The LLM did not provide an updated file for src/main/java/com/redhat/coolstore/model/ShoppingCart.java

Attempting to convert:

prompt:

llm_result (all failures, stops prematurely)

Note: on a subsequent retry it failed once more and then succeeded, but the contents of what it generated are incomplete/truncated.

1 more failure: https://gist.github.com/jwmatthews/7d7aac70a6b69291e2ff0ed2b467debb

Partial success but incomplete: https://gist.github.com/jwmatthews/0b366ffa4ff8fe2ed89638552e9972e9. It truncates the response and adds the comment // Rest of the class remains unchanged:

package com.redhat.coolstore.service;

import java.util.Hashtable;
import java.util.logging.Logger;

import jakarta.ejb.Stateful;
import jakarta.inject.Inject;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

import jakarta.enterprise.context.SessionScoped;
import java.io.Serializable;

import com.redhat.coolstore.model.Product;
import com.redhat.coolstore.model.ShoppingCart;
import com.redhat.coolstore.model.ShoppingCartItem;

@SessionScoped
public class ShoppingCartService implements Serializable {

    private static final long serialVersionUID = 1L;

    @Inject
    Logger log;

    @Inject
    ProductService productServices;

    @Inject
    PromoService ps;

    @Inject
    ShoppingCartOrderProcessor shoppingCartOrderProcessor;

    private ShoppingCart cart  = new ShoppingCart(); //Each user can have multiple shopping carts (tabbed browsing)

    public ShoppingCartService() {
    }

    // Rest of the class remains unchanged

    private static ShippingServiceRemote lookupShippingServiceRemote() {
        try {
            final Hashtable<String, String> jndiProperties = new Hashtable<>();
            jndiProperties.put(Context.INITIAL_CONTEXT_FACTORY, "org.wildfly.naming.client.WildFlyInitialContextFactory");

            final Context context = new InitialContext(jndiProperties);

            return (ShippingServiceRemote) context.lookup("ejb:/ROOT/ShippingService!" + ShippingServiceRemote.class.getName());
        } catch (NamingException e) {
            throw new RuntimeException(e);
        }
    }
}
jwmatthews commented 2 months ago

Related to:

dymurray commented 1 month ago

I'm seeing this consistently with Bedrock when updating a big file. In order for the source code diff to actually render appropriately in the IDE, I need the file in full. So I explicitly added to the prompt that I wanted the updated file in full, and it never has enough room in the response to give it to me.

jwmatthews commented 1 month ago

@dymurray when you accessed via Bedrock, which model did you see issues with? I have used Claude 3.5 Sonnet and seen issues. To date we've done more testing with Llama 3 and Mixtral and not much with Claude 3.5 Sonnet.

I have two initial thoughts:

I think it's very likely our issue is from not modifying the prompt sufficiently for Claude.

We can likely get more info on token usage (and hence whether we are hitting the output limit) by looking at the response metadata. I have been working with @devjpt23 and he shared the below.

sample code from @devjpt23

ai_msg = llm.invoke(messages)
ai_msg.response_metadata['token_usage']['completion_tokens']

Example: response_metadata={'token_usage': {'completion_tokens': 738, 'prompt_tokens': 1122, 'total_tokens': 1860, 'completion_time': 1.192732671, 'prompt_time': 0.056392911, 'queue_time': 0.0009406290000000053, 'total_time': 1.249125582}, 'model_name': 'mixtral-8x7b-32768', 'system_fingerprint': 'fp_c5f20b5bb1', 'finish_reason': 'stop', 'logprobs': None}
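
A rough sketch expanding on that check (untested; the ChatBedrock import path, the placeholder prompt, and the exact metadata key names are assumptions and vary by LangChain version and provider):

from langchain_aws import ChatBedrock

llm = ChatBedrock(model_id="anthropic.claude-3-5-sonnet-20240620-v1:0")

ai_msg = llm.invoke("<the kai prompt for the file being updated>")

# Token usage: Bedrock/Anthropic responses tend to report it under 'usage',
# OpenAI-style backends (like the Mixtral example above) under 'token_usage'.
meta = ai_msg.response_metadata
usage = meta.get("usage") or meta.get("token_usage") or {}
print("completion tokens:", usage.get("completion_tokens"))

# Why the response ended: a value like 'max_tokens' / 'length' (vs 'stop' /
# 'end_turn') means the model ran out of room, which would explain the
# truncated files in this issue.
print("stop reason:", meta.get("stop_reason") or meta.get("finish_reason"))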

jmontleon commented 1 month ago

I could be mistaken, but I don't think there is any intelligence in how a response is returned when the token limit is hit. The models just return what they finished generating before hitting the limit, and in the case of a streaming response they just stream until they hit it. It would make sense if this is what is happening.

jwmatthews commented 1 month ago

> I could be mistaken, but I don't think there is any intelligence in how a response is returned when the token limit is hit. The models just return what they finished generating before hitting the limit, and in the case of a streaming response they just stream until they hit it. It would make sense if this is what is happening.

@jmontleon I agree. I had assumed there was no intelligence and the model would stream and get cut off, yet in this case the model intentionally omitted code. It wasn't cut off; it made a choice to strip code out and give me a condensed output:

    public ShoppingCartService() {
    }

    // Rest of the class remains unchanged

    private static ShippingServiceRemote lookupShippingServiceRemote() {
        try {
            final Hashtable<String, String> jndiProperties = new Hashtable<>();
            jndiProperties.put(Context.INITIAL_CONTEXT_FACTORY, "org.wildfly.naming.client.WildFlyInitialContextFactory");

            final Context context = new InitialContext(jndiProperties);

            return (ShippingServiceRemote) context.lookup("ejb:/ROOT/ShippingService!" + ShippingServiceRemote.class.getName());
        } catch (NamingException e) {
            throw new RuntimeException(e);
        }
    }
}
dymurray commented 1 month ago

I've seen the above behavior, and also responses just stopping midstream and getting cut off.

I have been using:

model_id = "meta.llama3-70b-instruct-v1:0"
jmontleon commented 1 month ago

We found that modifying the config as follows increased the output length with Bedrock. I believe @dymurray finally had success with smaller files using this, although results for larger files were still cut off.

[models.args]
model_id = "meta.llama3-70b-instruct-v1:0"
model_kwargs.max_gen_len = 2048

Unfortunately 2048 is the maximum max_gen_len for Llama models on Bedrock: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html
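
For the Claude runs reported above, the equivalent knob should be max_tokens (the Anthropic parameter name) rather than max_gen_len. An untested config sketch, assuming kai passes model_kwargs straight through to ChatBedrock and that 4096 is the output ceiling for Claude 3.5 Sonnet:

[models.args]
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"
# max_tokens is the Anthropic/Bedrock output limit; 4096 is assumed to be the cap here
model_kwargs.max_tokens = 4096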

jwmatthews commented 1 month ago

Related to #391