Azure / azure-amqp

AMQP C# library
Other
94 stars 70 forks source link

Service Bus Session ID implementation in receiver link's `Attach` does not comply with AMQP 1.0 protocol #231

Closed minghuaw closed 1 year ago

minghuaw commented 1 year ago

Problem

The way a ServiceBusSessionReceiver is created in Azure.Messaging.ServiceBus sdk is not compliant with the AMQP 1.0 protocol. Currently, the way it is implemented in Azure.Messaging.ServiceBus is like following (Line 681)

// even if supplied sessionId is null, we need to add the Session filter if it is a session receiver
if (isSessionReceiver)
{
    filters.Add(AmqpClientConstants.SessionFilterName, sessionId);
}

var linkSettings = new AmqpLinkSettings
{
    Source = new Source { Address = endpoint.AbsolutePath, FilterSet = filters }, // Line 690
    // ... other setting
};

where filters is a FilterSet and sessionId is just a string. The sessionId is added as one entry to the filter-set field of the receiver link's Source

However, regarding the filter-set type, the AMQP 1.0 core spec states that (http://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-messaging-v1.0-os.html#type-filter-set)

A set of named filters. Every key in the map MUST be of type symbol, every value MUST be either null or of a described type which provides the archetype filter.

The FilterSet type is defined as a RestrictedMap<AmqpSymbol> that only restricts the type of key but leaves the value type as object. This leaves the room for the user to mistakenly use the library in a way that is not compliant with the protocol.

Related issue

Another issue has been opened in the Azure/azure-sdk-for-net repo. Azure/azure-sdk-for-net#32243

Further Investigation

To confirm that the string value in the filter-set is not actually serialized as a "described type" on the wire, I have created an minimal example with Azure/azure-amqp, where the receiver link is created similarly withthe filter-set field set to contain an entry of a string

// Only the link creation part is copied over
AmqpLinkSettings settings = new AmqpLinkSettings();
FilterSet filterSet = new FilterSet();
filterSet.Add("com.microsoft:session-filter", "test-filter");
settings.LinkName = $"receiver-{DateTime.UtcNow.Ticks}";
settings.Role = true;
settings.Source = new Source() { Address = "q1", FilterSet = filterSet };
settings.Target = new Target();
settings.TotalLinkCredit = 300;
ReceivingAmqpLink receiver = new ReceivingAmqpLink(session, settings);
await receiver.OpenAsync(TimeSpan.FromSeconds(10));

And I captured the bytes with wireshark and found out that it is NOT compliant to the AMQP protocol.

I am going to open an issue on Azure/azure-amqp, but for the sake of completeness, I will paste the captured bytes and explain why it is not compliant below

The captured bytes for the receiver link's Attach frame are:

0040   00 00 00 84 02 00 00 00 00 53 12 c0 77 0e a1 1b
0050   72 65 63 65 69 76 65 72 2d 36 33 38 30 33 30 36
0060   32 38 38 30 39 34 32 38 32 36 39 43 41 40 40 00
0070   53 28 c0 3c 0b a1 02 71 31 40 40 40 40 40 40 c1
0080   2c 02 a3 1c 63 6f 6d 2e 6d 69 63 72 6f 73 6f 66
0090   74 3a 73 65 73 73 69 6f 6e 2d 66 69 6c 74 65 72
00a0   a1 0b 74 65 73 74 2d 66 69 6c 74 65 72 40 40 40
00b0   00 53 29 c0 08 07 40 40 40 40 40 40 40 40 40 40
00c0   40 40 40 40

And the part that we are interested in is the FilterSet part of the Source field, which contains the following bytes

0070                                                c1
0080   2c 02 a3 1c 63 6f 6d 2e 6d 69 63 72 6f 73 6f 66
0090   74 3a 73 65 73 73 69 6f 6e 2d 66 69 6c 74 65 72
00a0   a1 0b 74 65 73 74 2d 66 69 6c 74 65 72

Breaking down the captured bytes, the FilterSet field is a map as indicated by the format code c1 at the beginning. And it contains only one key and one value.

A set of named filters. Every key in the map MUST be of type symbol, every value MUST be either null or of a described type which provides the archetype filter.

And on the wire, a described type should start with 0x00 and look like the Figure 1.2 (http://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-types-v1.0-os.html#doc-idp38080), which is pasted below. The value bytes pasted above clearly doesn't look anything close to a "described type"


            constructor                       untyped bytes
                 |                                 |
     +-----------+-----------+   +-----------------+-----------------+
     |                       |   |                                   |
...  0x00 0xA1 0x03 "URL" 0xA1   0x1E "http://example.org/hello-world"  ...
          |             |  |     |                                   |
          +------+------+  |     |                                   |
                 |         |     |                                   |
            descriptor     |     +------------------+----------------+
                           |                        |
                           |         string value encoded according
                           |           to the str8-utf8 encoding
                           |
                  primitive format code
                for the str8-utf8 encoding

     (Note: this example shows a string-typed descriptor, which is
      considered reserved)
jsquire commented 1 year ago

@xinchen10, @vinaysurya :

The client pattern appears in the Microsoft.Azure.ServiceBus receiver and carried forward to the current generation. I don't have access to confirm what was used in the T0 package, but I suspect that it also builds its filter set the same way.

My assumption from the historical client patterns is that AmqpCodec held responsibility for encoding the map and is translating to symbol/described as needed for spec compliance. However, based on @minghuaw's findings, that does not appear to be the case.

We know that the session filter is being applied and the client features work correctly, so it does not appear that we have a functional problem. However, I'm not sure if:

Would appreciate thoughts from Xin and Vinay on whether the client pattern should be adjusted, or you'd advise leaving as-is.

//fyi: @JoshLove-msft

xinchen10 commented 1 year ago

@minghuaw Please use the following to send a session-filter to the SB service.

<type name="com.microsoft:session-filter" class="restricted" source="string" provides="filter">
    <descriptor name="com.microsoft:session-filter" code="0x00000137:000000C"/>
</type>

The service supports a described string as the session-id. The plain string is a legacy from the AMQP protocol development time. The SDK started with that and was not updated later.

It would be nice to require or enforce described types in FilterSet but we will consider that in the future. For now the user is responsible for putting a correct type in the set.

minghuaw commented 1 year ago

Thanks a lot!

minghuaw commented 1 year ago

@xinchen10 FYI, it seems like only the descriptor name works for now. An error (shown below) is returned if the descriptor code is sent

invalid filter type. System only support Jms or Apache selector filter type but we found type 'Microsoft.ServiceBus.Messaging.Amqp.Encoding.DescribedType' associated with key 'com.microsoft:session-filter'

xinchen10 commented 1 year ago

@minghuaw is the descriptor code encoded as ulong?

minghuaw commented 1 year ago

@xinchen10 Thanks for your quick response.

@minghuaw is the descriptor code encoded as ulong?

It is. I will paste the serialized bytes of a SessionFilter("test".to_string()) below.

  1. With description code, which was rejected
0x0, 0x80, 0x0, 0x0, 0x1, 0x37, 0x0, 0x0, 0x0, 0xc, 0xa1, 0x4, 0x74, 0x65, 0x73, 0x74 
  1. With description name, which works fine
0x0, 0xa3, 0x1c, 0x63, 0x6f, 0x6d, 0x2e, 0x6d, 0x69, 0x63, 0x72, 0x6f, 0x73, 0x6f, 
0x66, 0x74, 0x3a, 0x73, 0x65, 0x73, 0x73, 0x69, 0x6f, 0x6e, 0x2d, 0x66, 0x69, 0x6c, 
0x74, 0x65, 0x72, 0xa1, 0x4, 0x74, 0x65, 0x73, 0x74
minghuaw commented 1 year ago

I found the problem. The provided descriptor code is not strictly compliant to the spec.

0x00000137:000000C

is actually interpreted as 0x0000_0013_7000_000C. However, this is not strictly compliant to the protocol, according to the protocol (http://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-types-v1.0-os.html#section-descriptor-values),

numeric descriptors: (domain-id << 32) | descriptor-id

A strictly-compliant descriptor code should be 0x0000_0137 << 32 | 0x000_000c, but the actual descriptor code is 0x0000_0013 << 32 | 0x7000_000c.

I guess this is fine as the types won't be used outside Service Bus or other microsoft products. I am re-opening this issue simply to see whether this will be changed in the service itself to be strictly compliant or not. Thanks in advance :)

xinchen10 commented 1 year ago

I am pretty sure the ulong code was intended to be 0x0000_0137 << 32 | 0x000_000c and the current working one is probably a typo. It is not easy to fix the existing clients but I will notify the service team to add support for the correct codes on the service side.