Closed desmarchris closed 3 years ago
@TyDunn I wasn't sure what squad this would belong under, lmk if it's not enable :)
@desmarchris I'd check out this thread on Slack that discusses some related things
@kedz No, this will not affect RulePolicy, since I only touch the AugmentedMemoizationPolicy class and RulePolicy is derived from MemoizationPolicy.
I'd rephrase this issue to "make RulePolicy faster with large trackers". Working towards a solution would entail:
changed it slightly bc it's not just about time but also memory/cpu consumption
@desmarchris I'm trying to reproduce this right now. To be sure: the problem was with inference only, right? Not with training. Do we have an example tracker?
correct, just at inference. I'm sending you an example for a single prediction that's 2500 lines long on Slack.
Oh, I think the linear time increase is actually the issue. Rule policy prediction time increases roughly linearly with the tracker length.
But MappingPolicy
didn't check the tracker at all (except last message), so it'd be constant time for that one. Its this difference between MappingPolicy
and RulePolicy
that led to this issue. But we're generally talking about 1 s lag times. Can you confirm this, @desmarchris ?
the specific issue at hand here was literally printing 2500 lines to make a prediction caused a CPU spike that brought the pod close to its max capacity
Printing? So this has nothing to do with the policy, but only with the fact that debug was on?
I think so? but debug was needed to help find what went wrong with the bot. I don't know anyone running a bot in prod without debug turned on
I don't know anyone running a bot in prod without debug turned on
That is very concerning. It's not meant to be used this way. I suppose we can have RulePolicy just not print the state in the short term, and in the long term... find a way to store debug logs more quickly? @koaning had this idea of a streamlined debug log (--insights
?), but I suppose customers would just always go for printing everything. So we'd have to add a "secret debug flag" that is really only to be used for us inside Rasa.
I see we actually have a Issue https://github.com/RasaHQ/rasa/issues/7295 for that already.
I'll close the present issue now, noting that the abovementioned issue would fix the slow debug output, which caused the present issue in the first place.
If we wanted to actually accelerate RulePolicy
, we could speed up the lookup of rules. What makes this slightly complicated is that in contrast to "MappingPolicy rules", the "RulePolicy rules" can be nested. I.e. you can have a short rule inside a long rule. Even if the short rule applies, you still need to check if you can find a long rule that is more specific. Surely we can create a custom algorithm that is faster than the linear time one that we have right now. Again, given that this doesn't actually cause an issue with the customer, and the present issue is about scoping solutions, I'll close the present issue.
Description of Problem: Large trackers can cause a slowdown / memory spike when making a prediction with RulePolicy.
Definition of Done: