samtecspg / articulate

A platform for building conversational interfaces with intelligent agents (chatbots)
http://spg.ai/projects/articulate/
Apache License 2.0

Session Termination #998

Closed Neo032020 closed 4 years ago

Neo032020 commented 4 years ago

Issues

We noticed that Articulate is not terminating sessions properly, and because of this it crashed for all the agents we created.

Please address this. With this problem, Articulate is not usable for us.
We were not expecting to face such an issue.

Termination of a session and cleanup via the API is a basic function that is required of any and all applications. How quickly can this function be added? We cannot use Articulate without it and will be forced to switch to Rasa.

- Articulate v0.31.1 release
- Chrome

wrathagom commented 4 years ago

Not certain what that means, "not performing Session Termination properly". Is there an error?

prabhakar29may commented 4 years ago

Hi. If I start a new session with an agent through the Articulate API, how can the session be terminated? Is there any limit to the number of sessions that can be started with an agent? How is garbage collection handled for abandoned sessions? And when do you consider a session abandoned?

prabhakar29may commented 4 years ago

When I make a call using the Articulate API, how do I make sure that the session that was started gets removed from memory? For example, a user chats with my bot and then closes it. How do I make sure that the session which was being used is released from memory? There is a button in the UI with which you can delete a session; I am just looking for a way to delete a session from the API. Is that available? If a session is abandoned, when does it get released?

wrathagom commented 4 years ago

What session issue? Is there an error in the console? In the terminal? How do I reproduce the error?

Neo032020 commented 4 years ago

Please respond to the questions from @prabhakar29may. That is my main concern.

stevemr03 commented 4 years ago

@wrathagom We are all from the same company, working on using Articulate. We have found that the system crashes over time because we are not able to terminate sessions and remove them. The description from @prabhakar29may above explains that this is an issue when using the API. This is a very high-transaction solution; we are working to deploy at 50,000 to 100,000 a day.

wrathagom commented 4 years ago

Articulate stores everything in memory in Redis. Sessions do not need to be closed or deleted.

If you feel the need to delete sessions you can use the DELETE /context/{sessionID} endpoint
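As an illustration of calling that endpoint from client code, here is a minimal Python sketch using only the standard library. The `DELETE /context/{sessionID}` path comes from the comment above; the base URL, the function names, and the assumption that no authentication header is needed are hypothetical and should be adjusted to the actual deployment.

```python
import urllib.request
from urllib.parse import quote


def build_delete_session_request(base_url: str, session_id: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for /context/{sessionID}.

    base_url is a hypothetical address for the Articulate API,
    for example "http://localhost:7500"; use your own host and port.
    """
    # Percent-encode the session ID so characters such as spaces or
    # slashes cannot change the request path.
    url = f"{base_url.rstrip('/')}/context/{quote(session_id, safe='')}"
    return urllib.request.Request(url, method="DELETE")


def delete_session(base_url: str, session_id: str, timeout: float = 10.0) -> int:
    """Send the DELETE request and return the HTTP status code.

    Raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    request = build_delete_session_request(base_url, session_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A client could call `delete_session(base_url, session_id)` when a user closes the chat window or a call hangs up, so that abandoned session contexts do not accumulate.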

wrathagom commented 4 years ago

We've never had an agent with your throughput running on Articulate. Certainly could need some adjustments to better manage that volume. Let us know how we can help.

stevemr03 commented 4 years ago

Thank you for the quick response. Have you performed any load testing, or do you have any information on the load metrics we should expect?

wrathagom commented 4 years ago

With that volume I expect Rasa to actually be the bottleneck. On our larger deploys we place Rasa behind an nginx reverse proxy with shared drives and have 3-5 Rasa instances running.

stevemr03 commented 4 years ago

Yes, we are deploying multiple instances with load balancing. We were just wondering how many transactions you think one instance can support and what you would expect the response time to be under that full load. Any data you have would be helpful in our load testing. Thanks again for your help.

wrathagom commented 4 years ago

@stevemr03 can you describe your production configuration? The docker-compose file provides an easy way to get a test instance up and running, but for a larger agent or more traffic you will need to begin to split the services off: Elasticsearch running on its own VM, Redis running on its own, etc.

ngoel17 commented 4 years ago

@wrathagom What is your recommendation? How many simultaneous chats can the framework support before we start splitting out the system? What is the optimal sizing and partitioning when we split it out? How many resources should be allocated where?

wrathagom commented 4 years ago

@ngoel17 @stevemr03 we've been experimenting today for different deployments to find what's best. What can you all share about your previous deployment? Sounds like you all have been using Dialogflow? What were your monthly costs and usage stats there? When you say 5-10k users is that per hour? Simultaneous? How many turns does the conversation have? What's the average conversation length? The two most useful stats for us to gauge on would be simultaneous users and requests per second.

Also, what did you see that made you want to delete sessions? We have agents that have hit 100K + sessions over their lifetime and we've never needed to delete them before. Also, I think you've mentioned memory problems, but we've always found serving a large number of users to be CPU bound.

We've already identified a few places where we can bump up the RPS, but also need to know what we're targeting.

wrathagom commented 4 years ago

We'd be up for scheduling a conference call for Monday if you all are available.

stevemr03 commented 4 years ago

Caleb,

Thanks so much for the response. This application is specifically assessing the coronavirus and routing for a virtual medical visit. Most of the traffic will be calls of about 2.5 minutes. The 5,000 to 10,000 is a busy hour number, so we expect about 200-300 sessions max in a minute. The dialog is not complex, but does discern if the patient is a family member and sends a text with a televisit meeting. There are 9 questions with two that are open ended. We handle the dialog in code using the web hook. We have several other agents we are porting over that handle other tasks and our approach is to isolate each agent for their specific function to keep them small and accurate. Lots of great opportunities to improve patient care, office efficiency, and wellness.
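The figures above can be turned into rough sizing numbers. The sketch below is a back-of-the-envelope calculation, not a measurement: it applies Little's law (average concurrency = arrival rate × average duration) to the busy-hour range and 2.5-minute call length quoted above, and it assumes, purely for illustration, that each of the 9 questions results in one request to Articulate. The function names are hypothetical.

```python
def arrivals_per_minute(sessions_per_hour: float) -> float:
    """New sessions starting per minute, assuming an even spread over the hour."""
    return sessions_per_hour / 60.0


def concurrent_sessions(sessions_per_hour: float, minutes_per_session: float) -> float:
    """Little's law: average sessions in progress at any moment."""
    return arrivals_per_minute(sessions_per_hour) * minutes_per_session


def requests_per_second(sessions_per_hour: float, requests_per_session: float) -> float:
    """Average request rate, assuming a fixed number of requests per session."""
    return sessions_per_hour * requests_per_session / 3600.0


if __name__ == "__main__":
    # Busy-hour range and call length taken from the comment above;
    # 9 requests per session is an assumption (one per question).
    for busy_hour in (5_000, 10_000):
        print(
            f"{busy_hour} sessions/hour: "
            f"{arrivals_per_minute(busy_hour):.0f} new sessions/min, "
            f"{concurrent_sessions(busy_hour, 2.5):.0f} concurrent, "
            f"{requests_per_second(busy_hour, 9):.1f} requests/s"
        )
```

Real traffic is bursty, so peaks will exceed these averages; the numbers are only a starting point for choosing load-test targets such as simultaneous users and requests per second.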

Delete Sessions: In our testing, if we delete the prior session, it allows us to support more sessions simultaneously. If we do not delete the sessions, then the system slows down linearly. We did find that the Redis max memory was not set, so that was causing problems.

We have deployed this with mobile and web interfaces, but this is the first time we are making it work over a telephony UI. We have our own speech technology team (Govavice)

As we began rolling out with DialogFlow we became concerned about HIPAA compliance, untested telephony capability, and not sure it would really perform. So, we made a quick decision to switch to Articulate.

Please let me know if I have answered all of your questions. We like Articulate and are determined to make it work. Lots of opportunities.

Best Regards,

Steve

stevemr03 commented 4 years ago

Yes, what time?

wrathagom commented 4 years ago

Sorry for being silent, we haven't forgotten about you. We've done a round of performance enhancements. We should be able to publish all of those as a new version on Monday/Tuesday.

That being said, you still can't serve 10k customers an hour off an $80 a month VM.

The next change after some of the performance enhancements will be to add in multiple instances of Rasa as it's our current bottleneck.

If you were spending $400+ a month on Dialogflow I would recommend spinning up 3 VMs, experimenting on size of each until you get the performance where you want it.

On the first VM run:

  • api
  • ui
  • Duckling
  • Redis

On the second VM run:

  • Elasticsearch

On the third VM run:

  • Rasa

stevemr03 commented 4 years ago

Thanks for all the responses. We have plans for multiple EC2 instances. We were wondering: if we set up auto scaling, should all the applications operate on self-contained instances? We could design it as you suggest, but there are points of failure for the Elasticsearch and NLU instances. We could put them in clusters as well. Our application server stores all the dialog data as well and can control across multiple instances.

wrathagom commented 4 years ago

ES is designed to operate in a cluster, so you should be good there. And we've split between multiple Rasa instances before. Rasa isn't stateless (it stores the trained models), so you need to set up the Rasa instances to share drive space, use S3 to sync files, or use an advanced configuration where only one instance is the trainer and rsync (or similar) copies the model files out to all the parsers.

We've never tried to run the other services separately, so can't help there.

You can import/export agents so you could also duplicate the entire instance of Articulate. This would only work if you could guarantee that incoming sessions could be connected back to the same instance they were initially served with.
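One common way to meet that guarantee is to route deterministically on the session ID, so the same session always lands on the same instance. The sketch below illustrates the idea in Python with a stable hash; it is not part of Articulate, and the function name and instance URLs are hypothetical. In practice this logic would usually live in the load balancer or in the calling application.

```python
import hashlib
from typing import Sequence


def pick_instance(session_id: str, instances: Sequence[str]) -> str:
    """Map a session ID to one instance, always the same one for a given list.

    Uses SHA-256 rather than Python's built-in hash(), which is
    randomized per process and would break stickiness across restarts.
    """
    if not instances:
        raise ValueError("at least one instance is required")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


if __name__ == "__main__":
    # Hypothetical addresses for duplicated Articulate instances.
    pool = ["http://articulate-a:7500", "http://articulate-b:7500"]
    print(pick_instance("call-0001", pool))
    print(pick_instance("call-0001", pool))  # same instance as the line above
```

Note that with simple modulo hashing, adding or removing an instance remaps most sessions, so the pool should only be resized when no sessions are in flight, or a consistent-hashing scheme should be used instead.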

stevemr03 commented 4 years ago

Would you provide details of the changes made in the last releases? We see Rasa NLP set to 2 concurrent users. How often is Rasa called during a session? We are trying to understand what the settings mean.

wrathagom commented 4 years ago

Do you mean the last commits? We haven't made any new releases.

Rasa is called 3 times for every call to converse. The setting isn't concurrent users, but rather concurrent requests to Rasa. We noticed that if Rasa receives too many requests at once they all go slow. So this will help feed them to Rasa one (or N) at a time.

From our testing the setting should be equal to the number of CPUs dedicated to Rasa, but we're still fine tuning.
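The idea of capping concurrent requests to Rasa can be illustrated with a semaphore. The Python sketch below is only an illustration of the technique, not Articulate's actual implementation (which is not shown in this thread); `parse_with_limit`, `demo`, and the fake parser are hypothetical names, and the fake parser stands in for an HTTP call to Rasa.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, Tuple


async def parse_with_limit(
    texts: Sequence[str],
    parse: Callable[[str], Awaitable[str]],
    max_concurrent: int,
) -> List[str]:
    """Run parse() on every text, with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(text: str) -> str:
        # Waits here whenever max_concurrent calls are already running.
        async with semaphore:
            return await parse(text)

    # gather() preserves input order in its results.
    return list(await asyncio.gather(*(guarded(t) for t in texts)))


async def demo(max_concurrent: int = 2) -> Tuple[List[str], int]:
    """Drive the limiter with a fake parser and record peak concurrency."""
    in_flight = 0
    peak = 0

    async def fake_parse(text: str) -> str:
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for the time Rasa takes to answer
        in_flight -= 1
        return text.upper()

    results = await parse_with_limit(["hi", "fever", "cough", "bye"], fake_parse, max_concurrent)
    return results, peak


if __name__ == "__main__":
    print(asyncio.run(demo(2)))
```

With a limit of N, extra requests queue instead of all hitting Rasa at once, which matches the guidance above to set N close to the number of CPUs dedicated to Rasa.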

stevemr03 commented 4 years ago

We opened another bug for this on the forum, but we are crashing with a converse error every day. Your team changed the setting to 2 concurrent Rasa requests and it appears to be causing this. We have 8 processors. Should we up this to 8 or is more work needed? The system is now unstable, please advise.

wrathagom commented 4 years ago

I've had a very simple instance of Articulate running for more than a week now at a continuous simulated 250 active users (30 requests per second) and haven't seen any crashing or odd behavior.

Not saying we have it perfect, just that we're having trouble replicating any problems. For further help you'll have to try and capture the logs when the errors occur so we have more to go on.

We also released the multiple-rasa instances to master. With this you can have more than one Rasa instance serving. Rasa is the bottleneck, so scaling it out is required to get much past 50 simultaneous users.

wrathagom commented 4 years ago

Feel free to open new issues for anything else that comes up.