Codium-ai / cover-agent

CodiumAI Cover-Agent: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
https://www.codium.ai/
GNU Affero General Public License v3.0

Error counting of relevant_line_number_to_insert_tests_after for java #128

Open Maxwell-FreeFOV opened 1 month ago

Maxwell-FreeFOV commented 1 month ago

Hello,

I am trying this tool on Java projects but facing a serious problem. I found that many LLMs (other than OpenAI's GPT models), especially open-source ones (including Llama 70B, Code Llama, etc.) and even some large-parameter commercial models, return an incorrect relevant_line_number_to_insert_tests_after value in the analyze_suite_test_insert_line step.

For example, in the case of CalculatorControllerTest within templated_tests\java_spring_calculator, many LLMs return the following results:

Streaming results from LLM model...
language: java
testing_framework: JUnit
number_of_tests: 2
relevant_line_number_to_insert_tests_after: 48
relevant_line_number_to_insert_imports_after: 14

The "relevant_line_number_to_insert_tests_after: 48" points at the end of the test file, outside the class declaration. As a result, cover-agent inserts the test functions at the end of the file, which causes compile errors and sends the process into a deadlock.
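A value like this can be caught before any insertion by bounds-checking it against the length of the test file. The sketch below is hypothetical Python, not cover-agent's actual code; `parse_analysis` and `validate_insert_line` are illustrative names:

```python
def parse_analysis(text: str) -> dict:
    """Parse the 'key: value' pairs from the model's streamed analysis."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def validate_insert_line(fields: dict, total_lines: int) -> int:
    """Reject an insertion line that cannot possibly be inside the class body.

    The line that closes the top-level class is at or before total_lines,
    so any proposed line >= total_lines is necessarily outside the class.
    """
    n = int(fields["relevant_line_number_to_insert_tests_after"])
    if not 1 <= n < total_lines:
        raise ValueError(f"insertion line {n} falls outside the file body")
    return n
```

With the streamed output above and a 48-line test file, `validate_insert_line` would raise instead of letting the tool insert past the class declaration.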

My questions are:

  1. Has anyone else noticed this problem?
  2. Can the prompt be improved to avoid this issue?
  3. Can the process be made more robust to detect this problem earlier and possibly recover from it?
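On question 3, one possible safeguard would be to clamp the proposed line so the insertion always lands inside the outermost class body. This is a sketch only, under the assumption that the last closing brace in a valid Java test file closes the top-level class; it is not cover-agent's implementation:

```python
def clamp_insertion_line(java_source: str, proposed_line: int) -> int:
    """Clamp a 1-indexed insertion line so new test methods land
    inside the outermost class body rather than after it."""
    lines = java_source.splitlines()
    # Find the last line containing a closing brace; in a well-formed
    # Java test file this brace closes the top-level class declaration.
    last_brace = max(
        (i for i, line in enumerate(lines, start=1) if "}" in line),
        default=len(lines),
    )
    # Any proposed line at or past the class-closing brace is invalid:
    # pull it back so the insertion happens inside the class body.
    return min(proposed_line, last_brace - 1)
```

For the 48-line CalculatorControllerTest case this would pull the model's out-of-range answer back to just before the class-closing brace rather than deadlocking on compile errors.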

Thanks,

EmbeddedDevops1 commented 1 month ago

@Maxwell-FreeFOV Thanks for filing the issue. Where are you running these models?

Maxwell-FreeFOV commented 1 month ago

I use Ollama to access the open-source models, and API connections for the closed-source ones.

EmbeddedDevops1 commented 1 month ago

So, to be totally honest, lower-end, smaller models don't seem to cut it for complex directions and high-quality code generation. @mrT23 has mentioned this a few times in other issues opened in the past, and he has plenty of benchmark data to back it up.

Perhaps you can tell us exactly which models you're using, the hardware you're running them on, and where you're running them, and we can see if it works on our end. Just as an FYI, lower-end models such as a locally run Mistral or CodeLlama on Ollama (using your run-of-the-mill GPU) are not going to cut it. Do you have an enterprise setup (e.g., running an H100)?

francois-ouellet commented 4 weeks ago

I am facing the exact same issue with Azure's GPT-4o model. Most of the time the tests are inserted after the class's closing '}', which obviously results in failing tests.

jarvischen666 commented 3 weeks ago

I have the same issue. I'm using Ollama with Llama 3.1 8B on an NVIDIA RTX 4080 graphics card.

EmbeddedDevops1 commented 3 weeks ago

Hi. We actually just issued a fix to the line-insertion routine. Can you give this a whirl on both your project and the templated projects and see if the issue has been resolved?
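Whether or not a given fix covers every case, a cheap regression check is possible after each insertion: verify that no code follows the brace that closes the top-level declaration. A minimal sketch (hypothetical, not cover-agent's actual fix; it ignores braces inside strings and comments):

```python
def code_after_class_close(java_source: str) -> bool:
    """Return True if any non-blank line appears after the brace that
    closes the top-level declaration -- a sign that tests were inserted
    outside the class body."""
    depth = 0
    closed = False
    for line in java_source.splitlines():
        if closed and line.strip():
            return True
        for ch in line:
            if ch == "{":
                depth += 1
                closed = False
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    closed = True
    return False
```

Running such a check on the modified test file before attempting compilation would let the tool reject a bad insertion immediately instead of looping on compile errors.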

francois-ouellet commented 2 weeks ago

@EmbeddedDevops1 - I would be happy to test the fix! But I can't find a newly released version that includes it. I am using the binary version of this tool on macOS.

francois-ouellet commented 2 weeks ago

I took some time to figure out how to rebuild the application locally and test the fix. I still see the same problem where the Java unit tests get added after the end of the test class definition.

I will look into cover-agent's code further today to see if I can figure out what the issue is.