microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

Python: The Semantic Kernel Azure Chat Completion has an abnormal response time about 1.3% of the time #2133

Closed rm2631 closed 9 months ago

rm2631 commented 1 year ago

Describe the bug

The Semantic Kernel Azure Chat Completion has an abnormal response time about 1.3% of the time. During testing, I noticed that approximately 4 out of 300 requests had a response time above 2 minutes, while the rest were between 2 and 8 seconds.

To Reproduce

Steps to reproduce the behavior:

  1. Set up a Flask server using Semantic Kernel with the following configuration:

    • Flask version: 2.3.2
    • Semantic Kernel version: 0.3.4.dev0
  2. Make around 300 requests to the server with the specified prompt over 10 minutes with 3 virtual users using Postman.

  3. Observe the response times for each request.
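
The steps above can be approximated without Postman. The following is a minimal sketch, not the reporter's actual setup: the function names are hypothetical, the URL assumes Flask's default development address, and the 120-second threshold mirrors the "above 2 minutes" figure in the report. Unlike the Postman run, it sends requests back to back instead of pacing them over 10 minutes.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_call(fn):
    """Run fn() once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def run_load_test(fn, total_requests=300, virtual_users=3):
    """Call fn total_requests times using virtual_users worker threads."""
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        futures = [pool.submit(timed_call, fn) for _ in range(total_requests)]
        return [f.result() for f in futures]


def summarize_latencies(latencies, slow_threshold_s=120.0):
    """Return the count, the maximum, and the share of slow calls."""
    slow = [t for t in latencies if t > slow_threshold_s]
    return {
        "count": len(latencies),
        "max_s": max(latencies),
        "slow_count": len(slow),
        "slow_fraction": len(slow) / len(latencies),
    }


def hit_flask_root(url="http://127.0.0.1:5000/"):
    """Hypothetical request function: GET the Flask route and read the body."""
    with urllib.request.urlopen(url, timeout=300) as resp:
        resp.read()
```

With the Flask server running, `summarize_latencies(run_load_test(hit_flask_root))` returns the statistics for one run. Four slow calls out of 300 would show up as a `slow_fraction` of roughly 0.013.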

Expected behavior

The response time for the Semantic Kernel Azure Chat Completion should be consistent and reasonably fast for the given prompt.

Screenshots

N/A (since the issue is related to response times, screenshots may not be helpful)

Platform

Additional context

The issue occurred when running the following code using the Flask server:

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion
from flask import Flask
import time
import json

deployment = "<DEPLOYMENT>"
endpoint = "<ENDPOINT>"
api_key = "<API_KEY>"

prompt = """
    You are a venture capital analyst. You've been given the following information:

    <I do not want to paste the context here because it's long.>

    Here is the question:
    {{$input}}
    """

app = Flask(__name__)

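@app.route("/")
def direct():
    # A new kernel and Azure chat service are created on every request.
    kernel = sk.Kernel()
    kernel.add_chat_service("dv", AzureChatCompletion(deployment, endpoint, api_key))
    # Register the prompt template as a semantic function.
    summarize = kernel.create_semantic_function(
        prompt, max_tokens=2000, temperature=0.2, top_p=0.5
    )
    # Synchronous call that sends the request to Azure OpenAI.
    summary = summarize("Summarize Robert Steels Inc.'s activities?")

    return summary.result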

if __name__ == "__main__":
    app.run(threaded=True)

Has anyone else run into this?

rm2631 commented 1 year ago

Here's the actual context in the prompt, all AI-generated, of course:

Robert Steels Inc. Achieves Record Production Milestone: The company celebrates surpassing two million tons of steel production, reinforcing its position as an industry leader.
New Leadership at Robert Steels Inc.: The company appoints a new CEO, bringing in a seasoned industry veteran to drive innovation and growth.
Community Outreach Initiative: Robert Steels Inc. launches a scholarship program for local students pursuing careers in engineering and metallurgy.
Advanced Research Center Inauguration: The company inaugurates a cutting-edge research center focused on developing groundbreaking steel technologies and materials.
Strategic Merger Announced: Robert Steels Inc. reveals plans to merge with a major mining company to secure a more stable raw material supply chain.
Industry Recognition: The American Steel Association honors Robert Steels Inc. with the "Steel Excellence Award" for exceptional contributions to the industry.
International Partnership: Robert Steels Inc. forms a joint venture with a prominent overseas firm to expand its global reach and market presence.
Employee Wellness Program Launch: The company introduces a comprehensive wellness program, including fitness classes and mental health support, to prioritize employee well-being.
Innovative Steel Bridge Project: Robert Steels Inc. partners with a renowned architectural firm to design a futuristic steel bridge concept for a major city.
Expansion into Renewable Energy Sector: The company diversifies its portfolio by investing in a solar panel manufacturing facility, exploring sustainable energy solutions.
Zero-Waste Initiative: Robert Steels Inc. commits to a zero-waste production goal, implementing recycling programs to reduce material waste.
AI-Driven Quality Control System: The company deploys an AI-powered quality control system to ensure consistent product excellence and minimize defects.
Charity Fundraiser Success: Robert Steels Inc. raises a substantial amount for a local charity supporting communities affected by natural disasters.
Employee Skill Development Program: The company introduces a comprehensive training program to enhance employee skills and expertise in specialized steel manufacturing processes.
Steel Art Exhibition Sponsorship: Robert Steels Inc. sponsors a steel-themed art exhibition, showcasing the versatility and beauty of steel as an artistic medium.
Industry Symposium Hosting: The company hosts an international steel symposium, inviting experts to share insights and advancements in the steel sector.
ISO Certification Achievement: Robert Steels Inc. receives ISO 9001:2023 certification for its outstanding quality management systems.
Humanitarian Aid to Developing Nations: The company donates steel materials for constructing essential infrastructure in developing countries.
Employee Stock Ownership Plan: Robert Steels Inc. introduces an employee stock ownership plan, allowing workers to have a stake in the company's success.
Steel Recycling Awareness Campaign: The company launches a public awareness campaign to promote the benefits of steel recycling and its positive impact on the environment.
Robert Steels Inc. Expands R&D Division: The company invests in a new state-of-the-art research and development facility to foster innovation and product diversification.
COVID-19 Relief Efforts: Robert Steels Inc. donates personal protective equipment and medical supplies to frontline healthcare workers during the pandemic.
International Steel Conference Participation: The company's experts present pioneering research at a global steel conference, showcasing the company's technical prowess.
Collaboration with Aerospace Giants: Robert Steels Inc. partners with leading aerospace companies to supply specialized steel components for next-gen aircraft.
Innovative Steel Coating Technology: The company unveils a breakthrough coating technology that enhances steel durability and corrosion resistance.
Employee Volunteer Program: Robert Steels Inc. launches a volunteer program, encouraging employees to contribute their time and skills to charitable causes.
Expanding Distribution Network: The company opens multiple new distribution centers to cater to increasing regional and international demand.
Robert Steels Inc. Launches E-commerce Platform: Customers can now purchase steel products directly from the company's user-friendly online store.
Industry-leading Workplace Safety Standards: The company receives recognition for maintaining top-tier safety protocols and low accident rates.
Steel Education Initiative: Robert Steels Inc. partners with local schools to provide educational resources and workshops about the steel manufacturing process.
Research on High-Strength Steel Alloys: The company collaborates with leading universities to develop high-strength steel alloys for demanding applications.
Robert Steels Inc. Celebrates 50th Anniversary: The company marks half a century of excellence in steel manufacturing and industry leadership.
Investment in Clean Energy: The company installs solar panels and adopts energy-efficient practices in its facilities to reduce its carbon footprint.
National Steel Day Celebrations: Robert Steels Inc. hosts an event for the public to showcase the significance of steel in everyday life and its manufacturing process.
Continuous Improvement Program: The company implements a continuous improvement initiative, streamlining processes to boost productivity and efficiency.
Award-Winning Ad Campaign: Robert Steels Inc. receives accolades for its creative and impactful advertising campaign promoting steel's versatility and importance.
Collaboration with Automotive Companies: The company collaborates with major automakers to supply steel for lightweight and fuel-efficient vehicle designs.
Robert Steels Inc. Launches Corporate Podcast: The podcast series delves into the world of steel and highlights industry trends and insights.
Mentorship Program for Young Engineers: The company introduces a mentorship initiative to guide aspiring engineers in the steel industry.
Smart Factory Implementation: Robert Steels Inc. invests in Industry 4.0 technologies to create a smart factory, optimizing production and supply chain operations.
Innovative Steel Packaging Solutions: The company introduces eco-friendly and efficient steel packaging alternatives for various industries.
Community Sustainability Project: Robert Steels Inc. partners with local communities to initiate sustainable development projects in the regions it operates.
Expansion of Steel Recycling Centers: The company opens additional steel recycling centers, encouraging responsible steel disposal and reuse.
Robert Steels Inc. Sponsors Steel Design Competition: The company supports young designers in developing creative and functional steel-based structures.
Technological Partnership with Silicon Valley: Robert Steels Inc. joins forces with Silicon Valley startups to integrate cutting-edge technologies into steel manufacturing.
Certification for Social Responsibility: The company achieves a prestigious social responsibility certification for its ethical business practices and community engagement.

evchaki commented 1 year ago

@rm2631 - thanks for bringing this up. We will see what we can find. Are you seeing any errors or events when the slow requests occur?

rm2631 commented 1 year ago

@evchaki in these instances, the LLM ends up giving a valid answer. So it works, but it's much slower.

awharrison-28 commented 1 year ago

@rm2631 in the code block you posted, the only non-deterministic step is summary = summarize(...), which sends the request to Azure OpenAI. My first thought was that you might be rate-limited, but in that case I would have expected 429 errors rather than delayed responses. I'm also not sure how Flask handles raised exceptions when using threads. It sounds like you've verified that you got 300 responses for 300 requests, though.

My second thought is that this may be a Flask limitation. Although you have set threaded=True, a sizable request queue could build up given the number of requests you are sending and how long each one takes to serve. Even 2-8 seconds per request seems quite long to me; we see response times in milliseconds (note that internally we do not route through Flask on any project).
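
One way to check whether time is lost before the handler runs is to time the handler inside the server and compare that with what the client observes. The sketch below is an illustration, not something used in this thread: `TimingMiddleware` and its `sink` list are hypothetical names, and the class relies only on the standard WSGI calling convention that Flask applications follow.

```python
import time


class TimingMiddleware:
    """WSGI middleware that records how long the wrapped app takes per request."""

    def __init__(self, wsgi_app, sink=None):
        self.wsgi_app = wsgi_app
        # sink collects (path, seconds) tuples; a plain list is enough for a sketch.
        self.sink = sink if sink is not None else []

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        try:
            return self.wsgi_app(environ, start_response)
        finally:
            # Runs whether the wrapped app returned normally or raised.
            elapsed = time.perf_counter() - start
            self.sink.append((environ.get("PATH_INFO", ""), elapsed))
```

In a Flask application this would be attached with `app.wsgi_app = TimingMiddleware(app.wsgi_app)`. If the recorded server-side durations stay within a few seconds while the client still sees multi-minute responses, the delay is more likely in front of the handler than in the call to Azure OpenAI.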

Have you tried monitoring the response times from Azure OpenAI directly, without Flask? We will investigate on our side as well.
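
A minimal sketch of such monitoring, assuming only the standard library: a decorator that logs the duration of each call and raises the log level for slow ones. The names `log_latency` and `call_model` are hypothetical, the 10-second threshold is an arbitrary choice above the 2-8 second range reported earlier, and `call_model` is a placeholder for the direct model call (for example, the `summarize(...)` call from the repro) run outside Flask.

```python
import functools
import logging
import time

logger = logging.getLogger("latency")


def log_latency(slow_threshold_s=10.0):
    """Decorator: log each call's duration, at WARNING level when it is slow."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Log even when the wrapped call raises.
                elapsed = time.perf_counter() - start
                level = logging.WARNING if elapsed > slow_threshold_s else logging.INFO
                logger.log(level, "%s took %.3f s", fn.__name__, elapsed)
        return wrapper
    return decorator


@log_latency(slow_threshold_s=10.0)
def call_model(question):
    """Placeholder for the direct model call made without Flask."""
    time.sleep(0.01)  # stand-in for the network round trip
    return f"answer to: {question}"
```

Calling `logging.basicConfig(level=logging.INFO)` before the test loop makes every duration visible. If the multi-minute outliers still appear here, Flask can be ruled out as the cause.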

matthewbolanos commented 9 months ago

Closing this issue since it was created several months ago.