dandansamax opened this issue 1 year ago
What is more, we should require all unit tests that call the LLM to provide canned text generated by the LLM to simulate its output, so that CI/CD can check whether the generated text is correctly formatted. Then, only before deciding to merge a PR, we run against a real LLM output once.
Do we really need the real output of the LLM? As I checked several well-known agent projects, all of them just use mock outputs. A possible reason is that most tests do not depend on the output contents, only on their format. Also, they do not have as many tests as we do. I wonder if our strict tests are limiting our development progress?
Can we remove some tests that are actually useless?
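To make the mock-output approach above concrete, here is a minimal sketch of a format-only test using Python's standard `unittest.mock`. The `run_agent` function and the JSON action format are hypothetical stand-ins for the project's real agent code; the point is that the canned string plays the role of a previously captured LLM response, so the test checks structure without any API call.

```python
import json
from unittest.mock import MagicMock

def run_agent(client, prompt):
    """Hypothetical agent step: ask the LLM client and parse its reply.

    In the real project this would wrap the OpenAI client; here it only
    illustrates the shape of the code under test.
    """
    response = client.complete(prompt)
    return json.loads(response)

# Mock the LLM with a canned response captured from a real run once.
mock_client = MagicMock()
mock_client.complete.return_value = '{"action": "search", "query": "weather"}'

result = run_agent(mock_client, "What is the weather?")

# Format check only: the test asserts which keys are present,
# not what the LLM actually said.
assert set(result) == {"action", "query"}
```

A test like this runs on every push for free; the expensive call to the real API can then be reserved for the pre-merge check.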
Required prerequisites
Problem description
We have identified a critical issue where the OpenAI APIs are being invoked on every push to the Git repository. This behavior is highly inefficient and potentially costly, as it results in unnecessary API calls regardless of whether they are needed for the particular code changes being pushed.
The expected behavior is that the OpenAI APIs should only be triggered under specific conditions, such as before merging a PR.
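One way to get that expected behavior, assuming the project uses GitHub Actions, is to move the tests that hit the real API into a workflow that only fires on pull-request events rather than on every push. The workflow name, paths, marker, and secret name below are all illustrative:

```yaml
# Illustrative GitHub Actions trigger: the real-LLM tests run only on
# pull requests, not on every push. All names here are hypothetical.
name: llm-integration-tests
on:
  pull_request:
jobs:
  real-llm-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests that call the real OpenAI API
        run: pytest tests/integration -m real_llm
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

The push-triggered workflow would then run only the mocked tests, keeping API usage limited to the PR review stage.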