Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License
4.56k stars 2.78k forks source link

Azure ServiceBus error - The format code '0x6E' at frame buffer offset '65772' is invalid or unexpected #37556

Open chandrakiranv opened 5 days ago

chandrakiranv commented 5 days ago

Library name and version

azure-servicebus 7.12.3

Query/Question

We have our Azure ServiceBus streaming failing in the middle of data streaming process with the below error

azure.servicebus._pyamqp.error.AMQPLinkError: Error condition: amqp:decode-error Error Description: The format code '0x6E' at frame buffer offset '65772' is invalid or unexpected.

When we worked with Microsoft, they suggested issue could be with payload but we haven't made any changes to payload. Only change we did is before we used to have 3 individual jobs streaming 3 types of data(All 3 are similar structure wise). But since this is consuming 3X of RUs from our Cosmos, we merged all these 3 jobs into a single streaming job(Which now reads only once from Cosmos) which sends the data into 3 different ServiceBus topics

Issue behavior - Suppose if we have 10M data volume that needs to be sent into ServiceBus topics, this job which is failing runs for few hours, streams 4-6M and then fails with above error.

Any troubleshooting tips or experience with handling similar issue would be greatly appreciated

Environment

Azure ServiceBus

github-actions[bot] commented 5 days ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

kashifkhan commented 5 days ago

Hi @chandrakiranv ,

When did this issue start happening for you, was it after upgrading to the newest version of azure-servicebus or was this happening before too?

Would you be able to provide debug logs and a way to reproduce this error by following the steps below https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/servicebus/azure-servicebus#troubleshooting

Which version of python are you using?

A way to reproduce the error and/or logs would be the best way for us to be able to help troubleshoot this problem.

github-actions[bot] commented 5 days ago

Hi @chandrakiranv. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

chandrakiranv commented 4 days ago

Hi @kashifkhan

Yes we have observed this error after upgrading to the latest version of azure-servicebus but when we tested with the previous version also the same error is happening. I have attached the complete trace of the log related to this error AMQP_error_Trace.txt

We have enabled the logging already as suggested above and the python version is 3.10.12

kashifkhan commented 3 days ago

Hi @chandrakiranv ,

Given that this error is happening on both versions, I would lean towards something wrong with the body that is being sent across.

2024-09-24 15:30:20,214 | INFO | MainThread | azure.identity.aio._credentials.environment | No environment configuration found.
2024-09-24 15:30:20,215 | INFO | MainThread | azure.identity.aio._credentials.managed_identity | ManagedIdentityCredential will use IMDS
2024-09-24 15:30:20,636 | INFO | MainThread | azure.servicebus._pyamqp.aio._connection_async | Connection state changed: None -> <ConnectionState.START: 0>
2024-09-24 15:30:20,682 | DEBUG | MainThread | azure.servicebus._pyamqp.aio._connection_async | -> Header(b'AMQP\x00\x01\x00\x00')
2024-09-24 15:30:20,683 | INFO | MainThread | azure.servicebus._pyamqp.aio._connection_async | Connection state changed: <ConnectionState.START: 0> -> <ConnectionState.HDR_SENT: 2>
2024-09-24 15:30:20,684 | INFO | MainThread | azure.servicebus._pyamqp.aio._connection_async | Connection state changed: <ConnectionState.HDR_SENT: 2> -> <ConnectionState.HDR_SENT: 2>
2024-09-24 15:30:20,684 | DEBUG | MainThread | azure.servicebus._pyamqp.aio._connection_async | -> OpenFrame(container_id='SBSender-836965da-e1f0-4b0f-b376-0a8724f439b0', hostname='kashifksbprem.servicebus.windows.net', max_frame_size=1048576, channel_max=65535, idle_timeout=None, outgoing_locales=None, incoming_locales=None, offered_capabilities=None, desired_capabilities=None, properties={'product': 'azsdk-python-servicebus', 'version': '7.12.3', 'framework': 'Python/3.8.20', 'platform': 'Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35', 'user-agent': 'azsdk-python-servicebus/7.12.3 pyamqp/2.0.0a1 Python/3.8.20 (Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35)'})
2024-09-24 15:30:20,685 | INFO | MainThread | azure.servicebus._pyamqp.aio._connection_async | Connection state changed: <ConnectionState.HDR_SENT: 2> -> <ConnectionState.OPEN_PIPE: 4>
2024-09-24 15:30:20,686 | DEBUG | MainThread | azure.servicebus._pyamqp.aio._session_async | -> BeginFrame(remote_channel=None, next_outgoing_id=0, incoming_window=65536, outgoing_window=65536, handle_max=4294967295, offered_capabilities=None, desired_capabilities=None, properties=None)
2024-09-24 15:30:20,705 | INFO | MainThread | azure.servicebus._pyamqp.aio._session_async | Session state changed: <SessionState.UNMAPPED: 0> -> <SessionState.BEGIN_SENT: 1>
2024-09-24 15:30:20,705 | DEBUG | MainThread | azure.servicebus._pyamqp.aio._link_async | -> AttachFrame(name='0e4daa4f-abf5-41f6-a397-6caf80113aa6', handle=2, role=True, send_settle_mode=0, rcv_settle_mode=0, source=Source(address='$cbs', durable=None, expiry_policy=None, timeout=None, dynamic=None, dynamic_node_properties=None, distribution_mode=None, filters=None, default_outcome=None, outcomes=None, capabilities=None), target=Target(address='$cbs', durable=None, expiry_policy=None, timeout=None, dynamic=None, dynamic_node_properties=None, capabilities=None), unsettled=None, incomplete_unsettled=None, initial_delivery_count=None, max_message_size=None, offered_capabilities=None, desired_capabilities=None, properties=None)
2024-09-24 15:30:20,706 | INFO | MainThread | azure.servicebus._pyamqp.aio._link_async | Link state changed: <LinkState.DETACHED: 0> -> <LinkState.ATTACH_SENT: 1>
2024-09-24 15:30:20,706 | INFO | MainThread | azure.servicebus._pyamqp.aio._management_link_async | Management link receiver state changed: <LinkState.DETACHED: 0> -> <LinkState.ATTACH_SENT: 1>
2024-09-24 15:30:20,706 | DEBUG | MainThread | azure.servicebus._pyamqp.aio._link_async | -> AttachFrame(name='c91f4eda-758c-4cd5-86d6-c9c0c0a9e9ac', handle=1, role=False, send_settle_mode=0, rcv_settle_mode=0, source=Source(address='$cbs', durable=None, expiry_policy=None, timeout=None, dynamic=None, dynamic_node_properties=None, distribution_mode=None, filters=None, default_outcome=None, outcomes=None, capabilities=None), target=Target(address='$cbs', durable=None, expiry_policy=None, timeout=None, dynamic=None, dynamic_node_properties=None, capabilities=None), unsettled=None, incomplete_unsettled=None, initial_delivery_count=0, max_message_size=None, offered_capabilities=None, desired_capabilities=None, properties=None)
2024-09-24 15:30:20,707 | INFO | MainThread | azure.servicebus._pyamqp.aio._link_async | Link state changed: <LinkState.DETACHED: 0> -> <LinkState.ATTACH_SENT: 1>
2024-09-24 15:30:20,707 | INFO | MainThread | azure.servicebus._pyamqp.aio._management_link_async | Management link sender state changed: <LinkState.DETACHED: 0> -> <LinkState.ATTACH_SENT: 1>
2024-09-24 15:30:20,707 | DEBUG | MainThread | azure.servicebus._pyamqp.aio._connection_async | <- Header(b'AMQP\x00\x01\x00\x00')
2024-09-24 15:30:20,708 | INFO | MainThread | azure.servicebus._pyamqp.aio._connection_async | Connection state changed: <ConnectionState.OPEN_PIPE: 4> -> <ConnectionState.OPEN_SENT: 7>
2024-09-24 15:30:20,758 | DEBUG | MainThread | azure.servicebus._pyamqp.aio._connection_async | <- OpenFrame(container_id=b'be7bbb10e6164151aee6141b1ffcf95c_G34', hostname=None, max_frame_size=65536, channel_max=4999, idle_timeout=120000, outgoing_locales=None, incoming_locales=None, offered_capabilities=None, desired_capabilities=None, properties=None)
2024-09-24 15:30:20,759 | INFO | MainThread | azure.servicebus._pyamqp.aio._connection_async | Connection state changed: <ConnectionState.OPEN_SENT: 7> -> <ConnectionState.OPENED: 9>
2024-09-24 15:30:20,810 | DEBUG | MainThread | azure.servicebus._pyamqp.aio._session_async | <- BeginFrame(remote_channel=1, next_outgoing_id=1, incoming_window=5000, outgoing_window=65536, handle_max=255, offered_capabilities=None, desired_capabilities=None, properties=None)
2024-09-24 15:30:20,810 | INFO | MainThread | azure.servicebus._pyamqp.aio._session_async | Session state changed: <SessionState.BEGIN_SENT: 1> -> <SessionState.MAPPED: 3>

Steps to enable debug logging and frame level logging

import logging
import sys

handler = logging.StreamHandler(stream=sys.stdout)
logger = logging.getLogger('azure.servicebus')
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

...

from azure.servicebus import ServiceBusClient

client = ServiceBusClient(..., logging_enable=True)
github-actions[bot] commented 3 days ago

Hi @chandrakiranv. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

chandrakiranv commented 3 days ago

Hi @kashifkhan The reason we believe there is no issue with the body is we tested the same data with our old jobs and it processed the messages into ServiceBus topic without any issues. We have enabled the logging and looking into our Log Analytics and Dynatrace logs to see if anything is captured through the ServiceBus logger when the error happened.

github-actions[bot] commented 3 days ago

Hi @chandrakiranv. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.