konveyor / kai

Konveyor AI - static code analysis driven migration to new targets via Generative AI
Apache License 2.0
23 stars 32 forks source link

We are urlencoding the source file contents before we send to LLM #213

Open jwmatthews opened 3 months ago

jwmatthews commented 3 months ago

We are starting to escape some portions of the source code we send to the LLM request. Example:

Below is an example of the source file contents in a prompt we constructed

Source file contents:
```java
package com.redhat.coolstore.model;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

import javax.enterprise.context.Dependent;

@Dependent
public class ShoppingCart implements Serializable {

    private static final long serialVersionUID = -1108043957592113528L;

    private double cartItemTotal;

    private double cartItemPromoSavings;

    private double shippingTotal;

    private double shippingPromoSavings;

    private double cartTotal;

    private List<ShoppingCartItem> shoppingCartItemList = new ArrayList<ShoppingCartItem>();

    public ShoppingCart() {

    }

    public List<ShoppingCartItem> getShoppingCartItemList() {
        return shoppingCartItemList;
    }

    public void setShoppingCartItemList(List<ShoppingCartItem> shoppingCartItemList) {
        this.shoppingCartItemList = shoppingCartItemList;
    }

    public void resetShoppingCartItemList() {
        shoppingCartItemList = new ArrayList<ShoppingCartItem>();
    }

    public void addShoppingCartItem(ShoppingCartItem sci) {

        if ( sci != null ) {

            shoppingCartItemList.add(sci);

        }

    }

    public boolean removeShoppingCartItem(ShoppingCartItem sci) {

        boolean removed = false;

        if ( sci != null ) {

            removed = shoppingCartItemList.remove(sci);

        }

        return removed;

    }

    public double getCartItemTotal() {
        return cartItemTotal;
    }

    public void setCartItemTotal(double cartItemTotal) {
        this.cartItemTotal = cartItemTotal;
    }

    public double getShippingTotal() {
        return shippingTotal;
    }

    public void setShippingTotal(double shippingTotal) {
        this.shippingTotal = shippingTotal;
    }

    public double getCartTotal() {
        return cartTotal;
    }

    public void setCartTotal(double cartTotal) {
        this.cartTotal = cartTotal;
    }

    public double getCartItemPromoSavings() {
        return cartItemPromoSavings;
    }

    public void setCartItemPromoSavings(double cartItemPromoSavings) {
        this.cartItemPromoSavings = cartItemPromoSavings;
    }

    public double getShippingPromoSavings() {
        return shippingPromoSavings;
    }

    public void setShippingPromoSavings(double shippingPromoSavings) {
        this.shippingPromoSavings = shippingPromoSavings;
    }

    @Override
    public String toString() {
        return "ShoppingCart [cartItemTotal=" + cartItemTotal
                + ", cartItemPromoSavings=" + cartItemPromoSavings
                + ", shippingTotal=" + shippingTotal
                + ", shippingPromoSavings=" + shippingPromoSavings
                + ", cartTotal=" + cartTotal + ", shoppingCartItemList="
                + shoppingCartItemList + "]";
    }
}
jwmatthews commented 3 months ago

I'm experimenting with a fix for this as below:

diff --git a/kai/data/templates/main.jinja b/kai/data/templates/main.jinja index ef97a41..a61a6e1 100644 --- a/kai/data/templates/main.jinja +++ b/kai/data/templates/main.jinja

@@ -26,7 +26,7 @@ After you have shared your step by step thinking, provide a full output of the u File name: "{{ src_file_name }}" Source file contents:

-{{ src_file_contents }}
+{{ src_file_contents | safe }} 

Issues

SaxenaAnushka102 commented 1 month ago

Hi @jwmatthews, This issue caught my interest, and I’m keen on contributing to it. I see you’re experimenting with a fix by using the | safe filter for the source file contents. Would you be open to me helping in any way? I’d love to assist in testing the solution or adding any enhancements you might have in mind. Looking forward to your guidance!

jwmatthews commented 1 month ago

Hi @SaxenaAnushka102 if you'd like to pick this up we could use help to test this further both with running against a LLM you have access to as well creating unit tests.

SaxenaAnushka102 commented 1 month ago

Hello @jwmatthews , Thank you for the opportunity! I've opened a PR (#317) for this issue. Looking forward to your feedback & guidance. Thanks!

jwmatthews commented 1 month ago

I've noticed that sometimes incident 'message' and 'solution' is urlencoded in the prompt when we do not want this. Especially see this with pom.xml

Example:

### incident 11
incident to fix: "Leverage a Maven profile to run the Quarkus native build adding the following section to the `pom.xml` file: 

 ```xml
 <profiles>
 <profile>
 <id>native</id>
 <activation>
 <property>
 <name>native</name>
 </property>
 </activation>
 <properties>
 <skipITs>false</skipITs>
 <quarkus.package.type>native</quarkus.package.type>
 </properties>
 </profile>
 </profiles>
 ```"

I'm experimenting with below to address

    diff --git a/kai/data/templates/main.jinja b/kai/data/templates/main.jinja
    index 68239d1..fd6926c 100644
    --- a/kai/data/templates/main.jinja
    +++ b/kai/data/templates/main.jinja
    @@ -26,17 +26,17 @@ After you have shared your step by step thinking, provide a full output of the u
     File name: "{{ src_file_name }}"
     Source file contents:
     ```{{ src_file_language }}
    -{{ src_file_contents }}
    +{{ src_file_contents | safe }}
     ```

     ## Issues

     {% for incident in incidents %}
     ### incident {{ loop.index0 }}
    -incident to fix: "{{ incident.message }}"
    +incident to fix: "{{ incident.message | safe }}"
     Line number: {{ incident.line_number }}
     {% if incident.solution_str is defined %}
    -{{ incident.solution_str }}
    +{{ incident.solution_str | safe }}
     {% endif %}
     {% endfor %}