Closed Neo032020 closed 4 years ago
Not certain what that means, "not performing Session Termination properly". Is there an error?
Hi. If I start a new session with an agent through the Articulate API, how can the session be terminated? Is there any limit to the number of sessions that can be started with an agent? How is garbage collection handled for abandoned sessions? And when do you consider a session abandoned?
When I make a call using the Articulate API, how do I make sure that the session that was started gets removed from memory? For example, a user will chat with my bot and then close it. How do I make sure that the session that was being used is released from memory? There is a button in the UI with which you can delete a session; I am just looking for a way to delete a session from the API. Is that available? If a session is abandoned, when does it get released?
What session issue? Is there an error in the console? Terminal? How do I reproduce the error?
Please provide a response to user @prabhakar29may. That is my main concern.
@wrathagom We are all from the same company, working on using Articulate. We have found that the system crashes over time because we are not able to terminate sessions and remove them. @prabhakar29may's description above explains that this is an issue when using the API. This is a very high-transaction solution; we are working to deploy it at 50,000 to 100,000 a day.
Articulate stores everything in memory in Redis. Sessions do not need to be closed or deleted.
If you feel the need to delete sessions, you can use the DELETE /context/{sessionID} endpoint.
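As a minimal sketch of calling that endpoint from a client, the Python below builds and sends the DELETE request with the standard library only. The thread confirms only the `DELETE /context/{sessionID}` path; the base URL `http://localhost:7500`, the session ID, and the helper names are assumptions for illustration, so adjust the host, port, and any authentication to your own deployment.

```python
import urllib.parse
import urllib.request


def build_delete_session_request(base_url: str, session_id: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /context/{sessionID} request."""
    # Percent-encode the session ID so characters such as "/" cannot alter the path.
    quoted_id = urllib.parse.quote(session_id, safe="")
    url = f"{base_url.rstrip('/')}/context/{quoted_id}"
    return urllib.request.Request(url, method="DELETE")


def delete_session(base_url: str, session_id: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_delete_session_request(base_url, session_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical base URL and session ID; this only prints the request,
    # it does not contact a server.
    req = build_delete_session_request("http://localhost:7500", "call-1234")
    print(req.get_method(), req.full_url)
```

A client could call `delete_session(...)` when the user closes the chat or the phone call ends, which is the point at which the session is known to be finished.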
We've never had an agent with your throughput running on Articulate. Certainly could need some adjustments to better manage that volume. Let us know how we can help.
Thank you for the quick response. Have you performed any load testing, or do you have any information on the load metrics we should expect?
With that volume I expect Rasa to actually be the bottleneck. On our larger deploys we place Rasa behind an nginx reverse proxy with shared drives and have 3-5 Rasa instances running.
Yes, we are deploying multiple instances with load balancing. We were just wondering how many transactions you think one instance can support and what you would expect the response time to be under that full load. Any data you have would be helpful in our load testing. Thanks again for your help.
@stevemr03 Can you describe your production configuration? The docker-compose setup provides an easy way to get a test instance up and running, but for a larger agent or more traffic you will need to begin splitting the services off: Elasticsearch running on its own VM, Redis running on its own, etc.
@wrathagom What is your recommendation? How many simultaneous chats can the framework support before we start splitting out the system? What's the optimal sizing and partitioning when we split it out? How many resources should be allocated where?
@ngoel17 @stevemr03 we've been experimenting today for different deployments to find what's best. What can you all share about your previous deployment? Sounds like you all have been using Dialogflow? What were your monthly costs and usage stats there? When you say 5-10k users is that per hour? Simultaneous? How many turns does the conversation have? What's the average conversation length? The two most useful stats for us to gauge on would be simultaneous users and requests per second.
Also, what did you see that made you want to delete sessions? We have agents that have hit 100K + sessions over their lifetime and we've never needed to delete them before. Also, I think you've mentioned memory problems, but we've always found serving a large number of users to be CPU bound.
We've already identified a few places where we can bump up the RPS, but also need to know what we're targeting.
We'd be up for scheduling a conference call for Monday if you all are available.
Caleb,
Thanks so much for the response. This application is specifically assessing the coronavirus and routing for a virtual medical visit. Most of the traffic will be calls of about 2.5 minutes. The 5,000 to 10,000 is a busy hour number, so we expect about 200-300 sessions max in a minute. The dialog is not complex, but does discern if the patient is a family member and sends a text with a televisit meeting. There are 9 questions with two that are open ended. We handle the dialog in code using the web hook. We have several other agents we are porting over that handle other tasks and our approach is to isolate each agent for their specific function to keep them small and accurate. Lots of great opportunities to improve patient care, office efficiency, and wellness.
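As a rough cross-check of these figures, Little's law (average concurrency = arrival rate x time in the system) turns the busy-hour numbers into simultaneous sessions and requests per second. This is a sketch under two assumptions that are not stated in the thread: sessions arrive evenly across the hour, and each of the 9 questions costs exactly one API request. Real traffic is burstier, so treat the results as averages rather than peaks.

```python
def concurrent_sessions(sessions_per_hour: float, minutes_per_session: float) -> float:
    """Little's law: average sessions in progress at any instant."""
    arrivals_per_minute = sessions_per_hour / 60.0
    return arrivals_per_minute * minutes_per_session


def requests_per_second(sessions_per_hour: float, requests_per_session: int) -> float:
    """Average request rate, assuming a fixed number of requests per session."""
    return sessions_per_hour * requests_per_session / 3600.0


if __name__ == "__main__":
    for busy_hour in (5_000, 10_000):
        print(
            busy_hour,
            round(concurrent_sessions(busy_hour, 2.5)),  # 208 and 417
            requests_per_second(busy_hour, 9),           # 12.5 and 25.0
        )
```

Under these assumptions the low end of the busy hour averages about 208 sessions in progress and the high end about 417, at roughly 12.5 to 25 requests per second.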
Delete Sessions: In our testing, if we delete the prior session, it allows us to support more sessions simultaneously. If we do not delete the sessions, then the system slows down linearly. We did find that the Redis max memory was not set, so that was causing problems.
We have deployed this with mobile and web interfaces, but this is the first time we are making it work over a telephony UI. We have our own speech technology team (Govavice).
As we began rolling out with Dialogflow we became concerned about HIPAA compliance and untested telephony capability, and we were not sure it would really perform. So, we made a quick decision to switch to Articulate.
Please let me know if I have answered all of your questions. We like Articulate and are determined to make it work. Lots of opportunities.
Best Regards,
Steve
-- Stephen Rothschild CEO, Enabledoc LLC w: 877.540.0933 ext 100 m: 540.588.0016
f: 703.459.9642enablemyhealth.com enabledoc.com This e-mail is from EnableDoc LLC and may contain information that is confidential or privileged. If you are not the intended recipient, do not read, copy or distribute the e-mail or any attachments. Instead, please notify the sender and delete the e-mail and any attachments. Thank you.
Yes, what time?
Sorry for being silent, we haven't forgotten about you. We've done a round of performance enhancements. We should be able to publish all of those as a new version on Monday/Tuesday.
That being said, you still can't serve 10k customers an hour off an $80 a month VM.
The next change after some of the performance enhancements will be to add in multiple instances of Rasa as it's our current bottleneck.
If you were spending $400+ a month on Dialogflow I would recommend spinning up 3 VMs, experimenting on size of each until you get the performance where you want it.
On the first VM run:
- api
- ui
- Duckling
- Redis
On the second VM run:
- Elasticsearch
On the third VM run:
- Rasa
Thanks for all the responses. We have plans for multiple EC2 instances. If we set up auto scaling, should all the applications operate on self-contained instances? We could design it as you suggest, but there are points of failure for the Elasticsearch and NLU instances. We could put them in clusters as well. Our application server stores all the dialog data as well and can control across multiple instances.
ES is designed to operate in a cluster, so you should be good there. And we've split between multiple Rasa instances before. Rasa isn't stateless (it stores the trained models), so you need to set up the Rasa instances to share drive space, use S3 to sync files, or use an advanced configuration where only one instance is the trainer and rsync/etc. copies the model files out to all the parsers.
We've never tried to run the other services separately, so can't help there.
You can import/export agents so you could also duplicate the entire instance of Articulate. This would only work if you could guarantee that incoming sessions could be connected back to the same instance they were initially served with.
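One way to get that guarantee is to route on the session ID itself, so the same session always lands on the same instance. The sketch below is illustrative only: the instance URLs and the function name are hypothetical, and it is not something Articulate ships.

```python
import hashlib
from typing import Sequence


def pick_instance(session_id: str, instances: Sequence[str]) -> str:
    """Map a session ID to one instance, deterministically."""
    if not instances:
        raise ValueError("at least one instance is required")
    # A stable hash (unlike Python's built-in hash()) gives the same answer
    # in every process and after every restart.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


if __name__ == "__main__":
    # Hypothetical duplicated Articulate instances.
    pool = ["http://articulate-a:7500", "http://articulate-b:7500"]
    print(pick_instance("call-1234", pool))
```

Plain modulo hashing remaps most sessions whenever the instance list changes, so the pool would need to stay fixed for the life of any open session; a load balancer with sticky sessions would be another way to meet the same requirement.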
Would you provide details of the changes made in the last releases? We see Rasa NLP set to 2 concurrent users. How often is Rasa called during a session? We are trying to understand what the settings mean.
Last commits? We haven't made any new releases.
Rasa is called 3 times for every call to converse. The setting isn't concurrent users, but rather concurrent requests to Rasa. We noticed that if Rasa receives too many requests at once they all go slow. So this will help feed them to Rasa one (or N) at a time.
From our testing the setting should be equal to the number of CPUs dedicated to Rasa, but we're still fine tuning.
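To illustrate what a "concurrent requests" cap does, here is a minimal Python sketch of the general technique using a semaphore. It is not Articulate's implementation (which the thread does not show); `RasaGate`, `fake_parse`, and the timings are hypothetical stand-ins.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


class RasaGate:
    """Let at most `max_concurrent` parse requests run at once; queue the rest."""

    def __init__(self, max_concurrent: int) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, for inspection

    async def call(self, parse: Callable[[str], Awaitable[str]], text: str) -> str:
        async with self._semaphore:  # waits here while the cap is reached
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await parse(text)
            finally:
                self.in_flight -= 1


async def fake_parse(text: str) -> str:
    """Stand-in for an HTTP call to a Rasa instance."""
    await asyncio.sleep(0.01)
    return text.upper()


async def demo(limit: int = 2, requests: int = 10) -> Tuple[int, List[str]]:
    gate = RasaGate(limit)
    calls = (gate.call(fake_parse, f"msg {i}") for i in range(requests))
    results = await asyncio.gather(*calls)
    return gate.peak, list(results)


if __name__ == "__main__":
    peak, results = asyncio.run(demo())
    print(peak, len(results))
```

With a cap of 2, the ten simulated requests all complete, but never more than two are in flight at once; the others wait their turn instead of all hitting the parser together.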
We opened another bug for this on the forum, but we are crashing with a converse error every day. Your team changed the setting to 2 concurrent Rasa requests and it appears to be causing this. We have 8 processors. Should we up this to 8 or is more work needed? The system is now unstable, please advise.
I've had a very simple instance of Articulate running for more than a week now at a continuous simulated 250 active users (30 requests per second) and haven't seen any crashing or odd behavior.
Not saying we have it perfect, just that we're having trouble replicating any problems. For further help you'll have to try and capture the logs when the errors occur so we have more to go on.
We also released the multiple-rasa instances to master. With this you can have more than one Rasa instance serving. Rasa is the bottleneck, so scaling it out is required to get much past 50 simultaneous users.
Feel free to open new issues for anything else that comes up.
Issues
Please do the needful. With such problems and issues, Articulate is not usable for the user.
We were not expecting to face such an issue.
Termination of a session and cleanup via the API is a basic function that is required of any and all applications. How quickly can this function be added? We cannot use Articulate without it and will be forced to switch to Rasa.
- Articulate v0.31.1 Release
- Chrome