Md-Ashraful-Pramanik / MapCoder

MapCoder: Multi-Agent Code Generation for Competitive Problem Solving
MIT License

Errors in the use of the open source model Meta-Llama-3-8B-Instruct #3

Closed Qlalq closed 1 month ago

Qlalq commented 1 month ago

Hi, I have Meta-Llama-3-8B-Instruct running locally and I want to run MapCoder on top of it, so I changed the model settings in main.py as follows:

    parser.add_argument(
        "--model", 
        type=str, 
        default="ChatGPT", 
        choices=[
            "ChatGPT",
            "GPT4",
            "Gemini",
            "Meta-Llama-3-8B-Instruct",
        ]
    )

Then I made the necessary additions elsewhere to ensure that Meta-Llama-3-8B-Instruct would run properly.

In fact, every strategy except MapCoder works fine for me; only with MapCoder do I get pass@k = 0.

When I run the following command:

python src/main.py --model Meta-Llama-3-8B-Instruct --dataset HumanEval --strategy MapCoder

A sample of the terminal output is shown below. It contains a duplicated answer and the error "'algorithm' tag not found in the response". I would like to know why this happens and how to fix it.

________________________
Input for knowledge base and exemplars: 
Given a problem, provide relevant problems then identify the algorithm behind it and also explain the tutorial of the algorithm.
    # Problem:
    from typing import List, Any

def filter_integers(values: List[Any]) -> List[int]:
    """ Filter given list of any python values only for integers
    >>> filter_integers(['a', 3.14, 5])
    [5]
    >>> filter_integers([1, 2, 3, 'abc', {}, []])
    [1, 2, 3]
    """

    # Exemplars:
    Recall three (03) relevant and distinct problems (different from problem mentioned above). For each problem,
    1. describe it
    2. generate Python3 code step by step to solve that problem
    3. finally generate a planning to solve that problem

    # Algorithm:

    ----------------
    Important:
    Your response must follow the following xml format-

    <root>
    <problem>
    # Recall three (03) relevant and distinct problems (different from problem mentioned above). Write each problem in the following format.
    <description>
    # Describe the problem.
    </description>
    <code>
    # Let's think step by step to solve this problem in Python3 programming language.
    </code>
    <planning>
    # Planning to solve this problem.
    </planning>
    </problem>

    # similarly add more problems here...

    <algorithm>
    # Identify the algorithm (Brute-force, Dynamic Programming, Divide-and-conquer, Greedy, Backtracking, Recursive, Binary search, and so on) that needs to be used to solve the original problem.
    # Write a useful tutorial about the above mentioned algorithms. Provide a high level generic tutorial for solving this types of problem. Do not generate code.
    </algorithm>
    </root>

Processed prompts: 100%|████████████████████████████████████| 1/1 [00:45<00:00, 45.25s/it, est. speed input: 8.55 toks/s, output: 45.26 toks/s]

________________________
Response from knowledge base and exemplars: 
"""

    # Problem 1:
    <problem>
    <description><![CDATA[
    Write a Python function that takes a list of strings as input and returns a new list containing only the strings that start with 'a' or 'A'.
    ]]></description>
    <code><![CDATA[
    # Step 1: Initialize an empty list to store the result
    result = []

    # Step 2: Iterate over the input list
    for string in input_list:
        # Step 3: Check if the string starts with 'a' or 'A'
        if string.lower().startswith('a'):
            # Step 4: If it does, add it to the result list
            result.append(string)

    # Step 5: Return the result list
    return result
    ]]></code>
    <planning><![CDATA[
    1. Initialize an empty list to store the result
    2. Iterate over the input list
    3. Check if the string starts with 'a' or 'A'
    4. If it does, add it to the result list
    5. Return the result list
    ]]></planning>
    </problem>

    # Problem 2:
    <problem>
    <description><![CDATA[
    Write a Python function that takes a list of integers as input and returns the sum of all the even numbers in the list.
    ]]></description>
    <code><![CDATA[
    # Step 1: Initialize a variable to store the sum
    total = 0

    # Step 2: Iterate over the input list
    for num in input_list:
        # Step 3: Check if the number is even
        if num % 2 == 0:
            # Step 4: If it is, add it to the total
            total += num

    # Step 5: Return the total
    return total
    ]]></code>
    <planning><![CDATA[
    1. Initialize a variable to store the sum
    2. Iterate over the input list
    3. Check if the number is even
    4. If it is, add it to the total
    5. Return the total
    ]]></planning>
    </problem>

    # Problem 3:
    <problem>
    <description><![CDATA[
    Write a Python function that takes a list of strings as input and returns a new list containing only the strings that contain the substring 'hello'.
    ]]></description>
    <code><![CDATA[
    # Step 1: Initialize an empty list to store the result
    result = []

    # Step 2: Iterate over the input list
    for string in input_list:
        # Step 3: Check if the string contains the substring 'hello'
        if 'hello' in string:
            # Step 4: If it does, add it to the result list
            result.append(string)

    # Step 5: Return the result list
    return result
    ]]></code>
    <planning><![CDATA[
    1. Initialize an empty list to store the result
    2. Iterate over the input list
    3. Check if the string contains the substring 'hello'
    4. If it does, add it to the result list
    5. Return the result list
    ]]></planning>
    </problem>

    # Algorithm:
    <algorithm><![CDATA[
    The algorithm used to solve these problems is Brute-force. Brute-force is a simple and straightforward approach that involves checking every possible solution to a problem. It is often used when the problem is simple and the solution is easy to check.

    Tutorial:
    Brute-force is a simple and straightforward approach that involves checking every possible solution to a problem. It is often used when the problem is simple and the solution is easy to check.

    To use the brute-force approach, you need to:

    1. Define the problem and the possible solutions
    2. Write a loop that checks every possible solution
    3. Check each solution to see if it is correct
    4. If a solution is correct, return it
    5. If no solution is correct, return an error message

    Brute-force is a simple and easy to understand approach, but it can be slow and inefficient for large problems. It is often used for small problems where the solution is easy to check.

    For example, if you are trying to find the sum of all the even numbers in a list, you can use the brute-force approach by iterating over the list and checking each number to see if it is even. If it is, you add it to the sum. If not, you move on to the next number.

    Brute-force is a simple and straightforward approach that can be used to solve many types of problems. It is often used when the problem is simple and the solution is easy to check.
    ]]></algorithm>
    </root>  # End of root tag
    """

    # Problem:
    from typing import List, Any

def filter_integers(values: List[Any]) -> List[int]:
    """ Filter given list of any python values only for integers
    >>> filter_integers(['a', 3.14, 5])
    [5]
    >>> filter_integers([1, 2, 3, 'abc', {}, []])
    [1, 2, 3]
    """

    # Algorithm:

    # Tutorial:
    # Brute-force is a simple and straightforward approach that involves checking every possible solution to a problem. It is often used when the problem is simple and the solution is easy to check.

    # Step 1: Initialize an empty list to store the result
    result = []

    # Step 2: Iterate over the input list
    for value in values:
        # Step 3: Check if the value is an integer
        if isinstance(value, int):
            # Step 4: If it is, add it to the result list
            result.append(value)

    # Step 5: Return the result list
    return result
    """

    # Recall three (03) relevant and distinct problems (different from problem mentioned above). For each problem,
    # 1. describe it
    # 2. generate Python3 code step by step to solve that problem
    # 3. finally generate a planning to solve that problem

    # Problem 1:
    <problem>
    <description><![CDATA[
    Write a Python function that takes a list of strings as input and returns a new list containing only the strings that start with 'a' or 'A'.
    ]]></description>
    <code><![CDATA[
    # Step 1: Initialize an empty list to store the result
    result = []

    # Step 2: Iterate over the input list
    for string in input_list:
        # Step 3: Check if the string starts with 'a' or 'A'
        if string.lower().startswith('a'):
            # Step 4: If it does, add it to the result list
            result.append(string)

    # Step 5: Return the result list
    return result
    ]]></code>
    <planning><![CDATA[
    1. Initialize an empty list to store the result
    2. Iterate over the input list
    3. Check if the string starts with 'a' or 'A'
    4. If it does, add it to the result list
    5. Return the result list
    ]]></planning>
    </problem>

    # Problem 2:
    <problem>
    <description><![CDATA[
    Write a Python function that takes a list of integers as input and returns the sum of all the even numbers in the list.
    ]]></description>
    <code><![CDATA[
    # Step 1: Initialize a variable to store the sum
    total = 0

    # Step 2: Iterate over the input list
    for num in input_list:
        # Step 3: Check if the number is even
        if num % 2 == 0:
            # Step 4: If it is, add it to the total
            total += num

    # Step 5: Return the total
    return total
    ]]></code>
    <planning><![CDATA[
    1. Initialize a variable to store the sum
    2. Iterate over the input list
    3. Check if the number is even
    4. If it is, add it to the total
    5. Return the total
    ]]></planning>
    </problem>

    # Problem 3:
    <problem>
    <description><![CDATA[
    Write a Python function that takes a list of strings as input and returns a new list containing only the strings that contain the substring 'hello'.
    ]]></description>
    <code><![CDATA[
    # Step 1: Initialize an empty list to store the result
    result = []

    # Step 2: Iterate over the input list
    for string in input_list:
        # Step 3: Check if the string contains the substring 'hello'
        if 'hello' in string:
            # Step 4: If it does, add it to the result list
            result.append(string)

    # Step 5: Return the result list
    return result
    ]]></code>
    <planning><![CDATA[
    1. Initialize an empty list to store the result
    2. Iterate over the input list
    3. Check if the string contains the substring 'hello'
    4. If it does, add it to the result list
    5. Return the result list
    ]]></planning>
    </problem>

    # Algorithm:
    <algorithm><![CDATA[
    The algorithm used to solve these problems is Brute-force. Brute-force is a simple and straightforward approach that involves checking every possible solution to a problem. It is often used
XML parsing error: not well-formed (invalid token): line 1, column 2
Attempting to fix the XML structure by adding root tags
Second attempt to parse XML failed: not well-formed (invalid token): line 108, column 14
Error: 'algorithm' tag not found in the response
completed 23/164, Solved: False, number of success = 0/23, acc = 0.0

Md-Ashraful-Pramanik commented 1 month ago

Hello. Thank you for reaching out.

Judging from the output, Meta-Llama-3-8B-Instruct did not follow the XML format required by my prompt, which is why my program cannot parse the response. Because the output cannot be parsed, the result is 0 accuracy. This may be a limitation of the model, since it has only 8B parameters.
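
For readers hitting the same error, one possible workaround is to pull the tutorial text out of the response with a regular expression instead of relying on the whole response being well-formed XML. The sketch below is only an illustration under that assumption: `extract_algorithm` is a hypothetical helper, not a function from the MapCoder code base, and it simply takes the first complete `<algorithm>...</algorithm>` block, so a repeated or truncated copy later in the response (as in the log above) is ignored.

```python
import re

# Hypothetical helper, not part of MapCoder: tolerant extraction of the
# first complete <algorithm> block from a possibly malformed response.
ALGORITHM_RE = re.compile(r"<algorithm>(.*?)</algorithm>", re.DOTALL)
CDATA_RE = re.compile(r"^\s*<!\[CDATA\[(.*?)\]\]>\s*$", re.DOTALL)


def extract_algorithm(response: str):
    """Return the text of the first <algorithm> block, or None if absent."""
    match = ALGORITHM_RE.search(response)
    if match is None:
        return None
    body = match.group(1)
    # Unwrap an optional CDATA section around the tutorial text.
    cdata = CDATA_RE.match(body)
    if cdata:
        body = cdata.group(1)
    return body.strip()


if __name__ == "__main__":
    sample = (
        "# Problem 1:\n<problem>...</problem>\n"
        "<algorithm><![CDATA[\nBrute-force.\n]]></algorithm>\n"
        "</root>  # End of root tag\n"
        "<algorithm><![CDATA[\nBrute-force is often used"  # truncated repeat
    )
    print(extract_algorithm(sample))
```

This does not make the model follow the prompt any better; it only avoids discarding a response whose `<algorithm>` section is intact while the surrounding text is not valid XML.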

Qlalq commented 1 month ago

Thank you for your answer. I will follow up by trying LLMs with more parameters.

Qlalq commented 1 month ago

Hi, I'm now using Mixtral-8x22B-Instruct-v0.1. This time the output looks more normal, but I still get "Error: 'algorithm' tag not found in the response". The terminal output is below (the problem is to return the fractional part of a floating-point number). Can you help me see what the problem is? Thank you very much.

Processed prompts: 100%|████████████████████████████████████| 1/1 [00:55<00:00, 55.73s/it, est. speed input: 8.07 toks/s, output: 16.22 toks/s]

________________________
Response from knowledge base and exemplars: 
----------------

    <root>
    <problem>
    <description><![CDATA[
    Given a positive floating point number, round it to the nearest integer.
    ]]></description>
    <code><![CDATA[
    def round_number(number: float) -> int:
        """ Given a positive floating point number, round it to the nearest integer.

        >>> round_number(3.5)
        4
        """

        return round(number)
    ]]></code>
    <planning><![CDATA[
    1. Import the round function from the math module.
    2. Define a function that takes a floating point number as input.
    3. Use the round function to round the number to the nearest integer.
    4. Return the rounded number.
    ]]></planning>
    </problem>

    <problem>
    <description><![CDATA[
    Given a positive floating point number, find the floor of the number.
    ]]></description>
    <code><![CDATA[
    def floor_number(number: float) -> int:
        """ Given a positive floating point number, find the floor of the number.

        >>> floor_number(3.5)
        3
        """

        return math.floor(number)
    ]]></code>
    <planning><![CDATA[
    1. Import the floor function from the math module.
    2. Define a function that takes a floating point number as input.
    3. Use the floor function to find the floor of the number.
    4. Return the floor of the number.
    ]]></planning>
    </problem>

    <problem>
    <description><![CDATA[
    Given a positive floating point number, find the ceiling of the number.
    ]]></description>
    <code><![CDATA[
    def ceiling_number(number: float) -> int:
        """ Given a positive floating point number, find the ceiling of the number.

        >>> ceiling_number(3.5)
        4
        """

        return math.ceil(number)
    ]]></code>
    <planning><![CDATA[
    1. Import the ceil function from the math module.
    2. Define a function that takes a floating point number as input.
    3. Use the ceil function to find the ceiling of the number.
    4. Return the ceiling of the number.
    ]]></planning>
    </problem>

    <algorithm><![CDATA[
    The algorithm that needs to be used to solve the original problem is the Brute-force algorithm.

    Tutorial:
    The Brute-force algorithm is a simple and straightforward approach to solving a problem. It involves trying all possible solutions until the correct one is found. This algorithm is often used when the problem size is small and the number of possible solutions is not too large.

    To use the Brute-force algorithm to solve the original problem, we can follow these steps:

    1. Decompose the given number into its integer and decimal parts.
    2. Return the decimal part of the number.

    This algorithm can be implemented in Python3 using the following code:

    def truncate_number(number: float) -> float:
        """ Given a positive floating point number, it can be decomposed into
        and integer part (largest integer smaller than given number) and decimals
        (leftover part always smaller than 1).

        Return the decimal part of the number.
        >>> truncate_number(3.5)
        0.5
        """

        integer_part = int(number)
        decimal_part = number - integer_part

        return decimal_part

    This code first decomposes the given number into its integer and decimal parts using the int and float functions, respectively. It then returns the decimal part of the number.

    The time complexity of this algorithm is O(1), as it only involves a constant number of operations. The space complexity is also O(1), as it only uses a constant amount of memory.
    ]]></algorithm>
    </root>
Error: 'algorithm' tag not found in the response
completed 3/164, Solved: False, number of success = 0/3, acc = 0.0
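
The thread does not record a resolution for this second failure, so the cause is not known. One way to narrow it down is to check whether the response parses once everything outside the `<root>...</root>` block (such as the leading line of dashes above) is cut away. The sketch below uses Python's standard `xml.etree.ElementTree`; `parse_root_block` and `algorithm_text` are hypothetical helper names, not MapCoder functions, and the slicing step is an assumption about what might help rather than a confirmed fix.

```python
import xml.etree.ElementTree as ET


def parse_root_block(response: str) -> ET.Element:
    """Parse only the first <root>...</root> span of a model response.

    Hypothetical diagnostic helper: text before <root> or after </root>
    is dropped before the string is handed to the XML parser.
    """
    start = response.find("<root>")
    if start == -1:
        raise ValueError("no <root> tag in response")
    end = response.find("</root>", start)
    if end == -1:
        raise ValueError("no closing </root> tag in response")
    return ET.fromstring(response[start:end + len("</root>")])


def algorithm_text(response: str):
    """Return the stripped text of the <algorithm> child, or None."""
    node = parse_root_block(response).find("algorithm")
    if node is None or node.text is None:
        return None
    return node.text.strip()


if __name__ == "__main__":
    sample = (
        "----------------\n\n"
        "    <root>\n"
        "    <problem>\n"
        "    <description><![CDATA[\n    Round a number.\n    ]]></description>\n"
        "    </problem>\n"
        "    <algorithm><![CDATA[\n    Brute-force, O(1).\n    ]]></algorithm>\n"
        "    </root>\n"
    )
    print(algorithm_text(sample))
```

If this returns the tutorial text for the response above, the problem lies in how the raw response is passed to the parser; if it raises, the exception message points to the line and column where the XML itself is malformed.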