Open KatoStevenMubiru opened 3 months ago
Moving on to the benchmark_models tool in evaluate_llm_tool.py, I have a few suggestions that might help improve it:
- Docstring Accuracy: The docstring currently mentions a prompt_set parameter, but it looks like that's not actually used in the function. Could we update the docstring to accurately reflect the parameters the function takes and what it does? This would help avoid any confusion for users.
- More Specific Error Handling: The try-except block catches a generic Exception. While this works, it might be better to catch more specific exceptions, like requests.exceptions.RequestException. This way, we can provide more informative error messages if something goes wrong with the API request (see the sketch at the end of this comment).
- API Key Handling: I noticed that the API key is currently hardcoded in the file and loaded using load_dotenv. PromptFlow seems to have some nice features for managing connections and API keys securely. Could we explore using those instead of hardcoding the key? This would make the tool more secure and easier to use within PromptFlow.
Overall, I think these changes could make the benchmark_models tool more robust and user-friendly. What do you think?
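To make the error-handling point concrete, here is a rough sketch of what I mean. The helper name, endpoint, and payload shape below are made up for illustration and not the actual code; the only point is catching requests.exceptions.RequestException instead of a bare Exception:

```python
import requests


def _call_benchmark_api(endpoint: str, payload: dict) -> dict:
    """Hypothetical helper showing the narrower except clause."""
    try:
        response = requests.post(endpoint, json=payload, timeout=30)
        response.raise_for_status()  # turn HTTP 4xx/5xx responses into exceptions too
        return response.json()
    except requests.exceptions.RequestException as exc:
        # Catching the requests-specific base class lets us report what actually
        # failed (timeout, connection error, bad status) instead of masking
        # unrelated bugs behind a generic `except Exception`.
        raise RuntimeError(f"Benchmark API request failed: {exc}") from exc
```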
Docstrings are a work in progress - but thanks for paying attention :D Error handling - likewise, I haven't checked which specific exception would be needed, but yours is likely the correct one in that case. As for the API - currently it's read from the .env file, so it's not really hardcoded. Promptflow does this https://github.com/microsoft/promptflow/blob/d72dae3aa721a99e3793892bd972f4de96a71aaf/docs/how-to-guides/manage-connections.md?plain=1#L3 so basically .yaml is used instead of .env :D I'll read some more to decide if the strong connection variant works better; right now it's CustomConnection only.
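For reference, the custom connection yamls from that doc look roughly like this (the name, endpoint, and keys below are placeholders, not our actual values) and get registered with `pf connection create -f <file>`:

```yaml
# custom_connection.yaml - placeholder sketch of a PromptFlow custom connection
name: unify_connection
type: custom
configs:
  api_base: "https://api.unify.ai/v0"   # non-secret settings (placeholder endpoint)
secrets:
  # "<user-input>" makes the CLI prompt for the value instead of storing it in the file
  api_key: "<user-input>"
```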
Thanks for the info, I see that PromptFlow uses YAML files for connection management, similar to how .env files are used. That makes sense.
It looks like the CustomConnection approach might be the way to go for now, but it could be worth exploring the "strong connection variant" in the future if that aligns better with PromptFlow's direction.
I'll keep the YAML-based connection management in mind when reviewing the rest of the code.
CustomConnection has a method to transform the client into the strong connection variant, so it'll likely use the yamls. Temporarily, it's set to take the configs as the argument, but they likely can be read from the .yaml files. So it's halfway done (that method is in use) already.
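On the tool side it stays simple either way - roughly like this (the prompt parameter and return value here are just for illustration, not the real signature), with the connection injected by PromptFlow instead of read from .env:

```python
from promptflow import tool
from promptflow.connections import CustomConnection


@tool
def benchmark_models(connection: CustomConnection, prompt: str) -> dict:
    # PromptFlow passes in the connection object, so the key never lives in the code
    api_key = connection.secrets["api_key"]
    # ... call the Unify API with api_key and return the benchmark results ...
    return {"prompt": prompt, "api_key_loaded": bool(api_key)}
```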
Good to know! 👍 Thanks for clarifying that CustomConnection can be transformed into a strong connection variant. That sounds like a flexible approach. I'm interested to see how that works with the YAML configurations.
I'll keep that in mind as I continue with more reviews and suggestions.
The first and second suggested changes were applied; for the third one I still need to double-check the set-up of the strong connection. Feel free to inspect the changes :)
Thanks for the updates to the benchmark_models function - both the docstring and error handling look great! I also noticed that the API key is now being loaded from the YAML file, which is a solid improvement.
Here's something else I was thinking about - just take it as a thought 😅 (may be wrong, may be right).
I wanted to touch base on the idea of exploring "strong connection variants" within PromptFlow. This could potentially offer better integration and security. Here are a few quick suggestions:
- Research PromptFlow's Connection Types: It might be worth looking into the different connection options in PromptFlow's documentation to see if there's a more integrated way to handle connections.
- Extend CustomConnection: The UnifyConnection class in single_sign_on_tool.py is a good start. We might explore extending or modifying it to function as a stronger connection variant.
- Prototype & Compare: If you have time, it could be helpful to prototype a version of the benchmark_models function using a stronger connection method. Comparing this with our current setup could provide insights into which approach is more secure and maintainable.
Here’s a quick conceptual example:
```python
from promptflow.connections import CustomConnection


class UnifyStrongConnection(CustomConnection):
    def __init__(self, api_key: str):
        self.api_key = api_key

    def get_client(self):
        # Return a configured Unify client
        pass
```
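As a possible follow-up: if I'm reading the PromptFlow docs right, the "strong" variant is usually declared by subclassing CustomStrongTypeConnection with typed fields rather than overriding __init__. A minimal sketch - the class name, field names, and default endpoint below are placeholders I made up, not anything from the repo:

```python
from promptflow.connections import CustomStrongTypeConnection
from promptflow.contracts.types import Secret


class UnifyStrongTypeConnection(CustomStrongTypeConnection):
    """Placeholder strong-typed connection: fields are declared up front."""

    api_key: Secret                               # marked as a secret so PromptFlow masks it
    api_base: str = "https://api.unify.ai/v0"     # placeholder default endpoint
```

That would keep api_key masked the same way CustomConnection secrets are, while giving the tool a typed connection parameter.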
Exploring these options could enhance our integration with PromptFlow. Let me know your thoughts!
Thanks again!