Closed. andrewldesousa closed this issue 4 months ago.
Hi @andrewldesousa, thank you for the detailed issue. At first glance, it looks like role_information is using snake case instead of camel case (roleInformation), as defined in the AzureDataSourceParameters class. Can you try using camel case and see if that resolves your issue? Otherwise, please let us know.
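To illustrate the casing point, here is a minimal sketch (the payload shape and helper are assumptions for illustration, not the actual Semantic Kernel API): a snake_case key is simply not read where the schema expects camelCase, so the prompt is silently dropped.

```python
# Hypothetical data-source parameter dicts; only the camelCase key
# ("roleInformation") matches the AzureDataSourceParameters schema.
wrong = {"role_information": "You are a funny assistant."}  # silently ignored
right = {"roleInformation": "You are a funny assistant."}   # picked up

def get_role_information(params):
    # Sketch of how a camelCase-only schema would read the field.
    return params.get("roleInformation")

assert get_role_information(wrong) is None
assert get_role_information(right) == "You are a funny assistant."
```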
@moonbox3 thanks for pointing that out. Going forward I will use camel case.
It seems I am getting similar behavior with camel case. Using the "respond with humor" system prompt, I get the answer "Europe is a continent that consists of many countries. Some of the countries in Europe include ." If I use the system prompt "You are an AI assistant that helps people find information." with camel case, I get a similar answer to what I screenshotted above.
@moonbox3 any update on this?
Hi @andrewldesousa , I currently work on the On Your Data team and have been tasked with trying to find a repro for your issue. :) I have been using a slightly edited version of the code snippet you shared above, and was not able to repro the issue by changing the system prompt in the way that you described. I am observing that the response using the "humor and jokes" prompt doesn't appear to be actually using humor, but for me the response is, at least, given in complete sentences.
I wanted to collect a little more info about your case above to see if there are any other configs / code impacting the output: is there any processing happening on the response after the "# message processing starts here" comment? The reason I ask is to see if anything happening while parsing the response could cause it to become truncated in the output.

Here is the processing code:
async for message in chat_completion.complete_chat_stream_async(chat_messages, settings):
    tool_message = await message.get_tool_message()
    response = {
        "id": str(uuid.uuid4()),
        "model": AZURE_OPENAI_MODEL_NAME,
        "created": int(time.time()),
        "object": "extensions.chat.completion.chunk",
        "choices": [{
            "messages": [{
                "role": "tool",
                "content": tool_message
            }]
        }],
        "apim-request-id": headers.get("apim-request-id"),
        "history_metadata": history_metadata
    }
    yield format_as_ndjson(response)

    async for deltaText in message:
        response = {
            "id": str(uuid.uuid4()),
            "model": AZURE_OPENAI_MODEL_NAME,
            "created": int(time.time()),
            "object": "extensions.chat.completion.chunk",
            "choices": [{
                "messages": [{
                    "role": "assistant",
                    "content": deltaText
                }]
            }],
            "apim-request-id": headers.get("apim-request-id"),
            "history_metadata": history_metadata
        }
        yield format_as_ndjson(response)
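As an aside, the two branches above build the same response envelope twice; a small helper (a sketch reusing the names from the snippet above, such as history_metadata and the apim-request-id header, which come from that code) would remove the duplication and make any truncation bug easier to isolate:

```python
import time
import uuid

def make_chunk(history_metadata, model_name, apim_request_id, role, content):
    """Build one streaming response chunk in the envelope used above."""
    return {
        "id": str(uuid.uuid4()),
        "model": model_name,
        "created": int(time.time()),
        "object": "extensions.chat.completion.chunk",
        "choices": [{"messages": [{"role": role, "content": content}]}],
        "apim-request-id": apim_request_id,
        "history_metadata": history_metadata,
    }

# Example: one assistant delta wrapped into a chunk.
chunk = make_chunk({}, "gpt-35-turbo-16k", "req-123", "assistant", "hello")
```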
My .env looks like this; sensitive values are omitted:

AZURE_SEARCH_SERVICE=byc-search
AZURE_SEARCH_INDEX=index name
AZURE_SEARCH_KEY=
AZURE_SEARCH_USE_SEMANTIC_SEARCH=False
AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG=default
AZURE_SEARCH_INDEX_IS_PRECHUNKED=False
AZURE_SEARCH_TOP_K=5
AZURE_SEARCH_ENABLE_IN_DOMAIN=False
AZURE_SEARCH_CONTENT_COLUMNS=content
AZURE_SEARCH_FILENAME_COLUMN=title
AZURE_SEARCH_TITLE_COLUMN=title
AZURE_SEARCH_URL_COLUMN=title
AZURE_SEARCH_VECTOR_COLUMNS=
AZURE_SEARCH_QUERY_TYPE=simple
AZURE_SEARCH_PERMITTED_GROUPS_COLUMN=
AZURE_SEARCH_STRICTNESS=3
AZURE_OPENAI_RESOURCE=
AZURE_OPENAI_MODEL=gpt-35-turbo-16k
AZURE_OPENAI_KEY=
AZURE_OPENAI_MODEL_NAME=gpt-35-turbo-16k
AZURE_OPENAI_TEMPERATURE=0
AZURE_OPENAI_TOP_P=1.0
AZURE_OPENAI_MAX_TOKENS=1000
AZURE_OPENAI_STOP_SEQUENCE=
AZURE_OPENAI_SYSTEM_MESSAGE=You are an AI assistant that helps people find information.
AZURE_OPENAI_PREVIEW_API_VERSION=2023-12-01-preview
AZURE_OPENAI_STREAM=True
AZURE_OPENAI_ENDPOINT=https://byc-aoai.openai.azure.com/
AZURE_OPENAI_EMBEDDING_NAME=
AZURE_COSMOSDB_ACCOUNT=
AZURE_COSMOSDB_DATABASE=db_conversation_history
AZURE_COSMOSDB_CONVERSATIONS_CONTAINER=conversations
AZURE_COSMOSDB_ACCOUNT_KEY=
@abhahn thanks for the timely response, please let me know if you have further questions.
No problem! I made a few changes to my settings using the .env contents above, and am still not able to repro the issue, but I have a few more follow up questions to hopefully continue to narrow things down.
In the processing code that you recently shared, I see there is a function called format_as_ndjson being used, but I can't see its definition. I'm not sure if there is more going on with processing the response that might impact the format. One thing I notice from playing around with the ndjson package is that printing the contents doesn't appear to be recursive, so complex objects within a dict aren't printing for me in a loop. Again, I'm not sure what is really happening in that function or whether you are using ndjson, so maybe you can clarify that here.
One thing I am wondering if you can try is to add the following lines to the top of the data processing part as a way to debug the assistant message directly to see what is coming back. This is how I originally set up my repro script, and how I am able to see the full assistant message. Could you let me know if you are still seeing a truncated response printed here?
async for message in chat_completion.complete_chat_stream_async(chat_messages, settings):
    tokens = [assistant_message async for assistant_message in message]
    print("".join(tokens))
Thanks
By running
async for message in chat_completion.complete_chat_stream_async(chat_messages, settings):
    tokens = [assistant_message async for assistant_message in message]
    print("".join(tokens))
It prints "Some countries in Europe include ."
After experimenting more and asking the question two or three times in a row, I do get a better response, but it's non-deterministic given the current settings I provided: "Please note that this is not an exhaustive list, and there are more countries in Europe. Some countries in Europe include:
Please note that this is not an exhaustive list, and there are more countries in Europe ."

So at the very least it does seem like an issue with the system prompt not taking effect. It's hard to tell why the answer is being cut short (maybe a bad LLM response), but I am still getting that issue with the updated version of my code, which is similar to the snippet you provided above.
Ok, I want to check to see if this is possibly related to another error we have observed recently with streaming requests, where the response payload is not completing properly.
Could you let me know the following additional pieces of info?
1) What region is your AOAI resource in? It may also help if you can share the full resource ID for your AOAI resource so I can see if I can find the requests in our logs.
2) Are you noticing any non-200 responses with debug logging enabled for Semantic Kernel? To see the logs I think you can just import logging and set logging.basicConfig(level=logging.DEBUG) at the top of app.py.
After trying for about ~5 minutes, it seems hard to replicate. Maybe a new deployment of the Azure OpenAI service? I can try again tomorrow morning to see if I can recreate the early cut-off issue that I was able to produce ~4 hours ago, but I am leaning towards thinking a new version was deployed for my Azure OpenAI service.
Region is East US. The instance is named byc-aoai, under the BYCRG resource group in Microsoft's "Microsoft Azure Sponsorship 2" subscription.
@abhahn ok, after trying again I am no longer able to replicate the cut-off issue that I was able to produce yesterday. I suppose something on the backend changed.
If this issue is solved, then the only other outstanding issue would be the system prompt not taking effect.
This issue is stale because it has been open for 90 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Describe the bug
I am trying to use different system prompts for AzureChatCompletion.
I have tried passing different system prompts to AzureChatRequestSettings and to AzureAISearchDataSources's role information parameter. Only the prompt "You are an AI assistant that helps people find information." seems to work well; if I change the prompt, the response is cut short, is incorrect in some way, or doesn't have the intended effect of a system prompt.
For example, "What are some countries in Europe?" will give the answer "Some countries in Europe include ." if I give the system prompt to be a funny assistant that responds with jokes. By passing the prompt as roleInformation in the datasources dictionary without using semantic kernel (and directly calling AzureOpenAIService via self-made python functions), it seems to handle the prompt well and does respond with jokes.
After testing, I do not think this is a prompt engineering issue; it seems that the role_information parameter in AzureChatRequestSettings is potentially not working correctly. I have also tried appending the system message at the start of the chat messages, but this did not help.
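For clarity, "appending the system message at the start of the chat messages" means something like the following (a sketch using the plain role/content message shape, not the exact Semantic Kernel call):

```python
system_prompt = ("You are an AI assistant that responds with humor "
                 "and uses jokes in your answers.")

# The system prompt goes first, before any user turns.
chat_messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What are some countries in Europe?"},
]
```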
To Reproduce
Screenshots
Using the system prompt "You are an AI assistant that helps people find information."
Using the system prompt "You are an AI assistant that responds with humor and uses jokes in your answers."
Platform