microsoft / TaskWeaver

A code-first agent framework for seamlessly planning and executing data analytics tasks.
https://microsoft.github.io/TaskWeaver/
MIT License
5.37k stars 688 forks

A plugin is automatically generated for each function at runtime if a plugin needed #397

Closed MRYingLEE closed 2 months ago

MRYingLEE commented 3 months ago

Is your feature request related to a problem? Please describe.

A plugin should be generated automatically for each function at runtime whenever one is needed. That would avoid "the discrepancy between the function signature in the Python implementation and the YAML file", as mentioned in https://microsoft.github.io/TaskWeaver/blog/plugin.

Describe the solution you'd like

List only the functions that will be used and generate each plugin definition at runtime. An even better solution would be to drop the plugin concept and treat all functions as plugins.

Describe alternatives you've considered

If the plugin concept is still needed and runtime generation is hard to implement, the definitions could be generated at design time instead, which would still reduce the burden on developers.

liqul commented 3 months ago

Could you please explain a little more about 'implementing plugins at runtime'?

If I understand correctly, you mean something like generating the YAML definition of the plugin given only the Python function. The problem is that we need a description of the function (sometimes with examples, plus descriptions of the arguments and return values), which the LLM consumes to understand when to call it. We could indeed ask developers to add more comments to the function and extract the description from there, but that is not a good way to express a mandatory requirement. Any thoughts on this?

MRYingLEE commented 3 months ago

At runtime you can easily get the information about a function, which should cover everything you require for the YAML.

For example, for pandas.read_csv:

image
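As a rough illustration of the kind of runtime introspection meant here, the sketch below uses only Python's standard inspect module. The function load_table is a hypothetical stand-in for a library function such as pandas.read_csv, so the example runs without pandas installed; it is not part of TaskWeaver.

```python
import inspect


def load_table(path: str, sep: str = ",", header: bool = True) -> list:
    """Read a delimited text file and return its rows.

    Hypothetical stand-in for a library function such as pandas.read_csv.
    """
    with open(path) as fh:
        rows = [line.rstrip("\n").split(sep) for line in fh]
    return rows[1:] if header else rows


# The signature exposes parameter names, annotations, and defaults.
sig = inspect.signature(load_table)
for name, param in sig.parameters.items():
    # param.default is inspect.Parameter.empty when the parameter is required.
    has_default = param.default is not inspect.Parameter.empty
    default = param.default if has_default else "<required>"
    print(name, param.annotation.__name__, default)

# The docstring, if the author wrote one, carries the description.
summary = inspect.getdoc(load_table).splitlines()[0]
print(summary)
```

What this recovers depends entirely on how carefully the function was annotated and documented.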

So maybe the YAML definition is not needed. Or if you really need it, you can generate it easily.

The definition only needs to be generated once per function.
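A minimal sketch of what such one-time generation might look like, assuming the definition is built purely from the signature and docstring. The helper names describe and definition_for, the example function top_n, and the dictionary keys are all hypothetical; the keys are only loosely modeled on a plugin definition and are not TaskWeaver's actual schema.

```python
import inspect
from functools import lru_cache


def describe(func) -> dict:
    """Build a plugin-like definition from a function's signature and docstring."""
    params = []
    for name, p in inspect.signature(func).parameters.items():
        annotated = p.annotation is not inspect.Parameter.empty
        params.append({
            "name": name,
            # Fall back to "Any" when the author gave no type annotation.
            "type": getattr(p.annotation, "__name__", str(p.annotation)) if annotated else "Any",
            "required": p.default is inspect.Parameter.empty,
        })
    return {
        "name": func.__name__,
        # Empty string when the function has no docstring.
        "description": inspect.getdoc(func) or "",
        "parameters": params,
    }


@lru_cache(maxsize=None)
def definition_for(func) -> dict:
    # Cached per function object, so each definition is generated once.
    return describe(func)


def top_n(values: list, n: int = 3) -> list:
    """Return the n largest values in descending order."""
    return sorted(values, reverse=True)[:n]


definition = definition_for(top_n)
print(definition)
```

The cache returns the same dictionary object on repeated calls, so callers should treat it as read-only.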

The cost of a longer prompt should be considered, but it matters less and less as token prices drop quickly.

liqul commented 3 months ago

I don't think this could replace the YAML file of plugins. It only contains argument types and default values; the description of what the function does is missing. Check the following example:

image
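The same point can be reproduced with the standard library. In this sketch, detect is a hypothetical developer-written function with no docstring: introspection returns parameter names and defaults, but nothing an LLM could use to decide when to call it.

```python
import inspect


def detect(values, window=7, threshold=3.0):
    # Hypothetical plugin body written without a docstring or type annotations.
    return [v for v in values if abs(v) > threshold]


sig = inspect.signature(detect)
doc = inspect.getdoc(detect)

print(sig)  # parameter names and default values only
print(doc)  # None, because there is no docstring to extract
```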

MRYingLEE commented 3 months ago

The signature of "pandas.read_csv" has 22k characters, which covers all you need.

image

liqul commented 3 months ago

You cannot use one example to show that all functions are well documented. In addition, plugins are not library functions; they are implemented by the developers of the agent. Library functions such as pd.read_csv can be used directly by the code interpreter.