Open DannyWeitekamp opened 3 years ago
I have also wanted this for a long time. Can we get it to work python's built in logging library that we've started add in places? Would love a solution we don't have to maintain the internals to. Also would be good to document some specifically useful tags throughout like at request or train or before returning an action etc.
This has been largely addressed by the recent merged pull request (#78 ). All un-commented print statements should now be removed and replaced with calls to the built in logging library. Please make sure to use the logging library structure in any future changes. @DannyWeitekamp's original comment is not technically addressed as that will probably require a modification to altrain to accept and forward the proper command line args, but I think this is sufficiently addressed here and I'll open an issue on the other repo.
I am re-opening this issue to discuss a more general solution to using the logging module throughout the library.
We have 2 different idioms for logging that are used throughout AL_Core:
al-agent
, al-performance
, and al-django
.__name__
, which given them names corresponding to their position in the packaging namespace.I assert that neither of these idioms are perfect for us. While idiom 2 is the recommended style from the python docs, it is very software engineer-centric. It requires you to know the structure of our packaging namespace, which is primarily designed around modularity rather than tracing execution, to meaningfully turn loggers on an off. If you are an end user and you're trying to debug why the actions your agent are firing are wrong you don't much care where in some package tree your exact WhenLearner is located, you want to follow the agent's reasoning. Idiom 1, is closer to this end-user case where logging statements correspond to major reasoning stages in an agent learning or performing rather than namespace structure. The downside here is that we are not very consistent in making these names, and currently lose out on leveraging some of the logging modules features like hierarchical log propagation (using .
delimited names).
My proposal is that we actually maintain both idioms but bring idiom 1 into line with the standard way of doing hierarchical logging (i.e., using .
delimited names) and be more intentional about what each of them is for. So in effect there would be two different hierarchical logging trees: apprentice
whose structure follows the packaging namespace and would be a way to target a specific class for software engineering level debugging, and al
whose structure would be based on tracing an agent's high-level execution. For example, I could see creating the following loggers:
al.agent.train
(or al.agent.learn
)- prints out stages of partial reasoning while an agent is learning (currently partly covered by al-agent
), likely includes initial state representation, each how/where/when learning mechanism, whether existing skills fit a training example, etc.al.agent.request
(or al.agent.act
) - prints out stages of partial reasoning related to an agent responding to a request to act (currently partly covered by al-agent
), likely includes initial state representation, the conflict set of skills to fire, etc.al.api
- printing related to processing on the django server (currently covered by al-django
).al.performance
- printing related to performance logging (currently covered by al-performance
)In addition to consistently using the two logging idioms above I would make the following standard changes across the library:
apprentice
would generally use debug()
to log, while loggers under al
would generally use info()
apprentice
and al
would be set to the WARN levelThis allows for control at different levels of the hierarchy. For example, in the proposal above you could activate both al.agent.train
and al.agent.request
by setting al.agent
to the INFO level.
A first pass at this is implemented in #79 and AL_Train#19 but I would like to see it structured differently. Rather than having to know a bunch of individual tags there should be params for the main optional logger levels: --debug
, --info
, and --warn
(implicitly you wouldn't be allowed to silence ERROR or CRITICAL) that each take a list of delimited logger names (either comma or space whatever works best for the argparser) that would be set to the given level on execution. This would give a caller the ability to fine tune what does and doesn't turn on. I could see just having 1 level parameter (debug is the lowest by default) but you get the idea.
+1 to all of the above. In a previous comment I had suggested that the convention be more like
altrain --debug-logging al.agent:debug al.performance:info
But your way may be more readable. However, I think it would be nice if altrain also took config files of along these lines similar to .gitignore in git where it will look for the config up the directory tree. The format of that could be as complicated as .yml files or as simple as a bunch of lines of var = value
I do think a single param per level rather than some kind of doubly delimited list is probably easier for end user control.
Agreed on continuing to use some kind of .yml file or other fine grain config, which I mainly envision is the use case for running things on a dedicated server, where you might want to not only more finely control which loggers are on but also where they are being piped.
Another dimension that came to mind, which you might call an implicit idiom 3, is the red/green/blue printing that altrain does by default to record agent action results. I'm inclined to say keep that as is but maybe we want to think about how this impacts logging out of the AL_Train server.
It is a pain to add and remove print statements, thus of course we should use logging. We've got a bit of logging stuff figured out but I would really like to be able to have a bunch of logging types and turn them on or off via some flag in altrain. For example: