Closed technillogue closed 1 year ago
How about a compromise: a context manager that yields a callable that accumulates debugging information (including timestamps, perhaps?) and emits it all on exit.
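A minimal sketch of what that suggestion could look like, assuming the messages go to stderr on exit. The names `delayed_debug` and `debug` are hypothetical and are not taken from the PR or from cog itself:

```python
import sys
import time
from contextlib import contextmanager
from typing import Callable, Iterator, List, TextIO, Tuple


@contextmanager
def delayed_debug(out: TextIO = sys.stderr) -> Iterator[Callable[[str], None]]:
    """Yield a callable that records messages and emits them all on exit."""
    records: List[Tuple[float, str]] = []

    def debug(message: str) -> None:
        # Store the message with the time it was recorded; nothing is written yet.
        records.append((time.time(), message))

    try:
        yield debug
    finally:
        # Runs on normal exit and when the body raises, so the accumulated
        # messages are still visible if there is an error.
        for timestamp, message in records:
            print(f"[{timestamp:.3f}] {message}", file=out)
```

Usage would be along the lines of `with delayed_debug() as debug: debug("loading weights")`. Because the callable is passed explicitly, this only captures messages from code that calls it, not plain `print` calls made elsewhere.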
I like the basic idea, but overwriting builtins.print is rather too clever for my liking.
maybe this is okay and still works with prints inside the inference engine etc?
I think I'm going to go ahead with this and we can revisit this later
cog throttles output to send webhooks at most once every 50ms, and prints count as output, so we don't want to print before yielding the first token. this is a relatively unobtrusive way to do that, and it ensures we can still see any debugging information if there are errors
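To illustrate the ordering described above, here is a rough sketch, not the PR's actual code, of a generator that holds back its log lines until the first token has been handed to the consumer and still flushes them if generation fails. The function name `stream_with_deferred_logs` and the log text are hypothetical:

```python
import sys
from typing import Iterable, Iterator, List, TextIO


def stream_with_deferred_logs(
    tokens: Iterable[str], out: TextIO = sys.stderr
) -> Iterator[str]:
    """Yield tokens, writing buffered log lines only after the first one."""
    pending: List[str] = []

    def flush() -> None:
        # Write everything buffered so far, then empty the buffer.
        for line in pending:
            print(line, file=out)
        pending.clear()

    # Buffered instead of printed, so no output precedes the first token.
    pending.append("setup finished, starting generation")
    try:
        for token in tokens:
            yield token
            # Reached once the consumer asks for the next token, i.e. after
            # the first token has gone out. Later calls find an empty buffer.
            flush()
    finally:
        # Also runs when iterating `tokens` raises, so the buffered lines
        # are not lost on errors.
        flush()
```

The trade-off is that the log lines appear slightly late in the normal case, in exchange for the first token not being held up behind earlier output.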