Closed fivegrant closed 5 months ago
Example of the current performance:
Note that the example above shows that the prompting could use some improvement. Additionally, it might be nice to send the executed code to the Dev UI
@mattprintz @brandomr Here's an explanation of the state of tool toggling in this PR.
As the code currently exists, there are two approaches to disabling the tools in askem-beaker.
The first approach is to include the disabling lines in each context. For example:
class MimiContext(BaseContext):
TOOL_ENABLE_RUN_CODE = False
TOOL_ENABLE_ASK_USER = False
...
The other approach is to make a new context:
class TerariumContext(BaseContext):
TOOL_ENABLE_RUN_CODE = False
TOOL_ENABLE_ASK_USER = False
and have all of the BaseContexts inherit from it:
class MimiContext(TerariumContext):
...
If our policy is to avoid making any changes in askem-beaker, we can add in defaults specified by environment variables. Our .env
would contain lines like
DEFAULT_ENABLE_RUN_CODE=true
DEFAULT_ENABLE_ASK_USER=false
If there is no environment variable specified, it would default to true. As long as askem-beaker changed its environment variables, it wouldn't need to make any code changes to the contexts. A context could still use a tool despite the default like this:
class MimiContext(BaseContext):
TOOL_ENABLE_ASK_USER = True # overrides the default specified in the environment
...
Any of these solutions will require some kind of code change. The approaches in the With the current code in this PR section expect a change in askem-beaker and With a change to the code in this PR section expects a code change in beaker-kernel.
Personally, I prefer creating the new TerariumContext
. I think the first approach I outlined is messy an error prone. The code is very repetitive for behavior we want to generally apply to each context. As for the default environment change, this new toggling mechanism was created out of a desire to avoid environment variables and complicated, non-discoverable disabling behavior.
I think we should encourage developers to use a pattern where they create a class ProjectBaseContext(BaseContext)
that all other contexts inherit from. It's very clear whether a tool is enabled. If the child doesn't explicitly enable/disable, look at the parent. If the parent doesn't have it set either, then it's going to be enabled. I think this is an ideal pattern as opposed to using environment defaults. Environment defaults only seem like a good idea if we're trying to accommodate askem-beaker in the short term.
Another reason to make a TerariumContext
is we've discussed creating it before for other features (e.g. summarization). We might as well start moving towards growing it out incrementally and this PR seems like a good first step.
Ultimately though I'm fine with any direction we decide to go in
a desire to avoid environment variables
Can you expand on this a bit? Is this a desire to avoid all environment variables, or to avoid an explosion of tool/context specific environment variables, etc?
Why is it desirable to avoid environment variables? Particularly since when working with containers, env variables is usually a primary way of setting up configuration options and enabling/disabling behavior in different containers and/or environments (prod, staging, testing, etc)
@mattprintz If we want to vary the defaults based on deployment using the current code, we could always just
class TerariumContext(BaseContext):
TOOL_ENABLE_RUN_CODE = os.environ.get("DEFAULT_ENABLE_RUN_CODE".lower() == "true", False)
...
I'm not super against environment variables. It did seem like we were trying to avoid them if possible in previous discussions but I could have misinterpreted.
If we used environment variables, we have something like this
class TerariumContext(BaseContext):
# Uncomment the line below if you want to override `DEFAULT_ENABLE_RUN_CODE`
# TOOL_ENABLE_RUN_CODE = True
...
My preference is still (1) where you make your own environment default mechanism if you want one although I could be nudged to do something like (2). I'm inclined to keep the core functionality bare bones as possible and non-prescriptive as possible.
It did seem like we were trying to avoid them if possible in previous discussions but I could have misinterpreted.
What I was getting at with the earlier comments is that environment variables are server wide, so a single ENABLE_USER_PROMPT
environment variable doesn't have enough granularity for what we were looking for. I think I had thrown the idea of out per-context env variables as an option, so we could still use env variables and be granular enough, but that solution does have some flaws as well, so isn't perfect by any stretch.
In short, I never meant to move away from using env variables, just that we shouldn't use a single global env variable that would apply to all contexts.
I think we are very close here, I just think we need to clean up some of the semantics. Reading through, I was having trouble figuring out when the tools are available to run and when they aren't, I think I have figured it out now, but I think by renaming things we could make this clearer.
Mostly, I think the confusion is around TOOL_ENABLE_
as a prefix as to me this seems to imply that the tools referenced are disabled by default and by adding that you are enabling them, but in fact it's the opposite.
TOOL_ENABLE is a verb, implying that by using it, you are doing something, but TOOL_ENABLE_ASK_USER = True
is (usually) essentially a no-op. As the default is that it is already available. Expecting users to understand that disable a tool they should use TOOL_ENABLE_foo = False
feels non-intuitive.
There are a couple ways that I see that we can fix this, but I think the most intuitive option is to just switch it from describing an action that should be preformed to describing a state that represents the desired internal state of the context. I.e. to rename from TOOL_ENABLE_
to TOOL_ENABLED_
, or something semantically equivalent.
TOOL_ENABLED_ASK_USER = False
does not read like an English sentence and is awkward, but the semantic meaning is clearer.
TOOL_ASK_USER_ENABLED = False
reads better, but has an awkward structure with both a prefix and suffix.
Using advanced type hinting we could go with something like:
TOOL_ASK_USER = 'enabled'
where 'enabled' is defined as typing.Literal choice or even have the values be an enum such as perhaps TOOL_ASK_USER = ToolOverrdes.ENABLED
or somethng.
I'm open to your input on this.
@mattprintz what do you think of something like this?
>>> from enum import Enum
>>> class ToolToggle(Enum):
... ENABLE = True
... DISABLE = False
... def __bool__(self):
... return self.value
...
>>> # then inside the beaker-kernel lib we check something like
>>> is_enabled_by_dev = ToolToggle.DISABLE
>>> if not is_enabled_by_dev:
... print("tool disabled")
...
tool disabled
>>>
Also, do you think there's a better name than ToolOverrides
or ToolToggle
?
@mattprintz what do you think of something like this?> ... Also, do you think there's a better name than
ToolOverrides
orToolToggle
?
I think the tense is still wrong, and it should be "disabled" and "enabled" as those speak to the state that one want's it to be set which is the appropriate framework for talking about configuration options, which these are. The bool is redundant, but we shouldn't be checking by value anyway, We would want to refer to/compare against the enums always, not expect it to be a boolean option.
Your code is semantically no different than what you had before, really, just more complex.
Looks good to me. @fivegrant before we merge it can you just add a note somewhere in the docs about how the enabling/disabling works?
Sample environment variable changes in darpa-askem are shown here: https://github.com/DARPA-ASKEM/beaker-kernel/tree/feat/upgrade-beaker-kernel-lib. ENABLE_USER_PROMPT
needs to be replaced by TOOL_ENABLE_ASK_USER
. On the other hand, run_code
isn't enabled by what we currently have anyway (because the checkpointing env variable ENABLE_CHECKPOINTS
defaults to false) but I went ahead and put the env variables to be explicit
This PR provides a
run_code
tool to any agent using a checkpointable kernel. The specific changes made by this PR:toggeable_tool
. _NOTE: This is a breaking change since a few contexts usetoggleable_tool
in askem-beaker. However, the contexts using this are erroneously reimplementingask_user
anyway._run_code
tool implemented as a subkernel tool on CheckpointableSubkernelPRE-MERGE CHECKLIST
main
branchpyproject.toml
to use the new version