using stomp.send_frame fails after a couple of times

ghost commented 8 years ago

After 12 xmits using stomp.send_frame(), I get a stomp.connection.NotConnectedException. If I send using "stomp.send()", it runs forever. It could be I'm using stomp wrong, but it does seem pretty straightforward. I'm using stomp.py 4.1.8, ActiveMQ 5.12.1 and python python 2.7.5 on a RHEL 7 platform.

#!/usr/bin/python
import random
import stomp
import socket
from time import sleep

if __name__ == '__main__':

   hdrs = {'expires': '0', 'timestamp': '1452527028845', 'destination': '/topic/FOO', 'persistent': 'true', 'priority': '4', 'message-id': 'ID:520969-foobar-52253-1451938805275-0:0:2:392022:1', 'subscription': '3'}

   nbody = """
<?xml version="1.0" encoding="UTF-8"?>
<table xmlns="http://query.yahooapis.com/v1/schema/table.xsd" https="false">
    <meta>
        <author>Brian Cantoni</author>
        <description>Generate random Lorem Ipsum text from lipsum.com. Parameters: amount=number of things to fetch; what: paras, words, bytes, or lists; start: set to "no" to not start all strings with Lorem Ipsum.</description>
        <documentationURL>http://www.lipsum.com/</documentationURL>
        <sampleQuery>select * from {table} where amount='5' and what='paras';</sampleQuery>
    </meta>
    <bindings>
        <select itemPath="" produces="XML">
            <urls>
                <url></url>
            </urls>
            <inputs>
                <key id="%f" type="xs:string" paramType="query" />
                <key id="%f" type="xs:string" paramType="query" />
                <key id="%f" type="xs:string" paramType="query" default="yes" />
            </inputs>
            <execute>
                <![CDATA[
                var q = y.query('select * from html where url="http://www.lipsum.com/feed/http?amount=' + amount + '&what=' + what + '&start=' + start + '" and xpath=\'//div[@id="lipsum"]//p\'');
                if (q.results) {
                    response.object = q.results;
                }
                ]]>
            </execute>
        </select>
    </bindings>
</table>
"""

   a = 0.72
   b = -0.95
   c = 0.01
   count = 0
   addr = ('localhost',61613)
   conx = stomp.Connection([addr])
   conx.start()
   conx.connect(wait=True)

   while True:
      try:
         # this fails after 12 reps:
         conx.send_frame(cmd="SEND",body=nbody % (a,b,c), headers=hdrs)

         # This runs for a long time
         # conx.send(body=nbody % (a,b,c), destination='/topic/FOO')
      except socket.error, e:
         print 'socket error: %s' % e.strerror
         sleep(0.5)
         conx.stop()
         conx.disconnect()
         sleep(0.5)
         conx = stomp.Connection([addr])
         conx.start()
         conx.connect(wait=True)
         continue
      except stomp.exception.NotConnectedException, e:
         print 'stomp error: % s' % e
         sleep(1)
         conx = stomp.Connection([addr])
         conx.start()
         conx.connect(wait=True)
         continue
      # conx.disconnect()
      print 'sent %d' % count
      count += 1
      sleep(1)

jasonrbriggs commented 8 years ago

Have you checked the Activemq logs? When I try your code on Apollo, I see the following output (after the 12th send):

2015-08-28 07:15:00,223 Shutting connection '/192.168.1.90:50614'  down due to: java.net.ProtocolException: The maximum header length was exceeded

Probably something to do with using the same message-id over and over again. If I amend your code, and add the counter to the message-id, it seems to work okay.

ghost commented 8 years ago

Thank you. It's queer: I resorted to using "send_frame" because the JMS agent(3rd party) I'm working with would not accept the stomp.send()message due to the first header field being "content-length". I had no idea that JMS/ActiveMQ and agents were so particular/sensitive to header fields. I'll check the logs, and try updating the message-id: That's a veryoblique error message you got!

Also, sorry to have posted this as an issue: I had tried sending youan email @ mail@jasonbriggs.com but it bounced. Thanks for your suggestions/help. -thor-

  From: jasonrbriggs <notifications@github.com>

To: jasonrbriggs/stomp.py stomp.py@noreply.github.com Cc: uncle4 thorfarrish@yahoo.com Sent: Tuesday, January 12, 2016 6:20 PM Subject: Re: [stomp.py] using stomp.send_frame fails after a couple of times (#82)

Have you checked the Activemq logs? When I try your code on Apollo, I see the following output (after the 12th send):2015-08-28 07:15:00,223 Shutting connection '/192.168.1.90:50614' down due to: java.net.ProtocolException: The maximum header length was exceeded Probably something to do with using the same message-id over and over again. If I amend your code, and add the counter to the message-id, it seems to work okay.— Reply to this email directly or view it on GitHub.

ghost commented 8 years ago

It looks like the call to "stomp.send_frame()" modifies the headersdictionary that's passed in. Here are a couple of runs displayingthe headers dictionary after each run: {'expires': '0', 'timestamp': '1452527028845', 'destination': '/topic/FOO', 'persistent': 'true', 'priority': '4', 'message-id': 'ID:520969-foo-52253-1451938805275-0:0:2:392022:1', 'subscription': '3'} sent 0 {'expires': '0', 'timestamp': '1452527028845', 'destination': '/topic/FOO', 'persistent': 'true', 'priority': '4', 'message-id': 'ID\c520969-foo-52253-1451938805275-0\c0\c2\c392022\c1', 'subscription': '3'} sent 1 {'expires': '0', 'timestamp': '1452527028845', 'destination': '/topic/FOO', 'persistent': 'true', 'priority': '4', 'message-id': 'ID\c520969-foo-52253-1451938805275-0\c0\c2\c392022\c1', 'subscription': '3'} sent 2 {'expires': '0', 'timestamp': '1452527028845', 'destination': '/topic/FOO', 'persistent': 'true', 'priority': '4', 'message-id': 'ID\\c520969-foo-52253-1451938805275-0\\c0\\c2\\c392022\\c1', 'subscription': '3'} sent 3 {'expires': '0', 'timestamp': '1452527028845', 'destination': '/topic/FOO', 'persistent': 'true', 'priority': '4', 'message-id': 'ID\\\\c520969-foo-52253-1451938805275-0\\\\c0\\\\c2\\\\c392022\\\\c1', 'subscription': '3'} sent 4 {'expires': '0', 'timestamp': '1452527028845', 'destination': '/topic/FOO', 'persistent': 'true', 'priority': '4', 'message-id': 'ID\\\\\\\\c520969-foo-52253-1451938805275-0\\\\\\\\c0\\\\\\\\c2\\\\\\\\c392022\\\\\\\\c1', 'subscription': '3'} sent 5 {'expires': '0', 'timestamp': '1452527028845', 'destination': '/topic/FOO', 'persistent': 'true', 'priority': '4', 'message-id': 'ID\\\\\\\\\\\\\\\\c520969-foo-52253-1451938805275-0\\\\\\\\\\\\\\\\c0\\\\\\\\\\\\\\\\c2\\\\\\\\\\\\\\\\c392022\\\\\\\\\\\\\\\\c1', 'subscription': '3'} Somewhere stomp is trying to escape the colons, and subsequent runs causesthe backslashes to be escaped themselves: the header explodes in size. From: jasonrbriggs notifications@github.com To: jasonrbriggs/stomp.py stomp.py@noreply.github.com Cc: uncle4 thorfarrish@yahoo.com Sent: Tuesday, January 12, 2016 6:20 PM Subject: Re: [stomp.py] using stomp.send_frame fails after a couple of times (#82)

Have you checked the Activemq logs? When I try your code on Apollo, I see the following output (after the 12th send):2015-08-28 07:15:00,223 Shutting connection '/192.168.1.90:50614' down due to: java.net.ProtocolException: The maximum header length was exceeded Probably something to do with using the same message-id over and over again. If I amend your code, and add the counter to the message-id, it seems to work okay.— Reply to this email directly or view it on GitHub.

jasonrbriggs commented 8 years ago

That's definitely a bug. For the moment if you recreate your headers dict every time, that should fix the problem. But I'll put a bug fix in for this shortly.

(BTW, your mail bounced because you missed an 'R' in the domain name....)

ghost commented 8 years ago

I added "import copy" and did a "theaders = copy.deepcopy(headers)"using theaders when calling escape_headers and utils.Frame: letme know if you want a diff.

Also, I've taken the colons out of my message ID to doublyprotect. Tks!

  From: jasonrbriggs <notifications@github.com>

To: jasonrbriggs/stomp.py stomp.py@noreply.github.com Cc: uncle4 thorfarrish@yahoo.com Sent: Wednesday, January 13, 2016 7:00 PM Subject: Re: [stomp.py] using stomp.send_frame fails after a couple of times (#82)

That's definitely a bug. For the moment if you recreate your headers dict every time, that should fix the problem. But I'll put a bug fix in for this shortly.(BTW, your mail bounced because you missed an 'R' in the domain name....)— Reply to this email directly or view it on GitHub.

scop commented 8 years ago

I don't think this is the correct fix for the problem at hand. Trying to encode only once when asked to encode multiple times simply sounds wrong, and besides, I don't think it can be made to work properly with the current implementation, that's a battle that cannot be won. It has already in its current git master form caused a regression, for example this on a 1.2 connection:

 conn.send("/queue/test", "test", headers={"test": r'\c\n'})

...ends up sending test:\c\n on the wire which is incorrect; both backslashes should be doubled.

Apart from modifying the passed in headers, I think the previous code worked just fine. In any case, I suggest reverting the changes that attempt to avoid encoding when already encoded, and instead either: 1) Simply documenting the fact that send_frame() modifies the passed in headers, or 2) Making either send_frame() or utils.Frame() copy the headers and operate on the copy

By the way, all internal callers of send_frame() in stomp.py seem to pass it a copy of the headers they receive, copied through utils.merge_headers(). My vote would go to alternative 1) above in order to avoid some header copies in the vast majority of cases that do not need it. send_frame() is not documented to be an API method by the way, so maybe it should be explicitly made private while at it (both in docs and by adding an underscore prefix).

scop commented 8 years ago

@uncle4 Note also that you can get rid of the automatically added content-length by specifying suppress_content_length=True with stomp.py 4.1.9

jasonrbriggs commented 8 years ago

conn.send("/queue/test", "test", headers={"test": r'\c\n'}) ends up sending test:\c\n on the wire which is incorrect; both backslashes should be doubled.

That's not what I see in unit testing. I see this output...

pass 1 {'test': '\\c\\n'}
pass 2 {'test': '\\c\\n'}
pass 3 {'test': '\\c\\n'}

So, at least to me, it seems to be working, and seems idempotent... (admittedly I haven't checked the actual send itself).

Agreed that send_frame should probably be explicitly private though...

ghost commented 8 years ago

Thanks! Where do I specify that "suppress.." parameter? I noticed that the stomp specification says that a message that HASa "content-length" header parameter is sent as a BytesMessage andif there's NO "content-length" header field, the message is sent asa TextMessage. Is there a way to change this and "manually"specify if a message should be "ByteMessage" or "CharMessage"? (and if so, where's the documentation so I don't have to ask sillyquestions to busy people like you!) (and where is the documentationfor the "suppress_content_length" parameter?)

Thanks for the library and support!

  From: Ville Skyttä <notifications@github.com>

To: jasonrbriggs/stomp.py stomp.py@noreply.github.com Cc: uncle4 thorfarrish@yahoo.com Sent: Friday, January 15, 2016 2:49 PM Subject: Re: [stomp.py] using stomp.send_frame fails after a couple of times (#82)

@uncle4 Note also that you can get rid of the automatically added content-length by specifying suppress_content_length=True with stomp.py 4.1.9— Reply to this email directly or view it on GitHub.

jasonrbriggs commented 8 years ago

Not currently documented. You can pass auto_content_length = False on the Connection constructor to suppress.

ghost commented 8 years ago

Making "send_frame" private is fine. But without documentinga way to stifle the "Content-Length" header field, userswon't be able to select messages of type "BytesMessage"or "TextMessage" (unless there's more undocumentedfunctionality out there. I didn't see it in the sources: but Ionly glanced through 'em). Thanks!

  From: jasonrbriggs <notifications@github.com>

To: jasonrbriggs/stomp.py stomp.py@noreply.github.com Cc: uncle4 thorfarrish@yahoo.com Sent: Friday, January 15, 2016 6:22 PM Subject: Re: [stomp.py] using stomp.send_frame fails after a couple of times (#82)

Not currently documented. You can pass auto_content_length = False on the Connection constructor to suppress.— Reply to this email directly or view it on GitHub.

jasonrbriggs commented 8 years ago

Not sure what you mean by "BytesMessage" versus "TextMessage". I don't recall anything from the spec which details that. Content-length can be used on all types of stomp frames -- certainly not a mechanism for controlling text vs binary...

ghost commented 8 years ago

From the Apache/ActiveMQ web page:

Working with JMS Text/Bytes Messages and Stomp Stomp is a very simple protocol - that's part of the beauty of it! As such, it does not have knowledge of JMS messages such as TextMessages or BytesMessages. The protocol does however support a content-length header. To provide more robust interaction between Stomp and JMS clients, ActiveMQ keys off of the inclusion of this header to determine what message type to create when sending from Stomp to JMS. The logic is simple:

This same logic can be followed when going from JMS to Stomp, as well. A Stomp client could be written to key off of the inclusion of the content-length header to determine what type of message structure to provide to the user.

  From: jasonrbriggs <notifications@github.com>

To: jasonrbriggs/stomp.py stomp.py@noreply.github.com Cc: uncle4 thorfarrish@yahoo.com Sent: Saturday, January 16, 2016 9:39 AM Subject: Re: [stomp.py] using stomp.send_frame fails after a couple of times (#82)

Not sure what you mean by "BytesMessage" versus "TextMessage". I don't recall anything from the spec which details that. Content-length can be used on all types of stomp frames -- certainly not a mechanism for controlling text vs binary...— Reply to this email directly or view it on GitHub.

jasonrbriggs commented 8 years ago

Well that explains why I haven't seen it then -- it's not part of the stomp spec.

Documentation has been updated now, btw -- http://jasonrbriggs.github.io/stomp.py/api.html#establishing-a-connection

scop commented 8 years ago

I did check the actual send with Wireshark. To make sure, I just re-tested and what I said still holds. 4.1.9 does not have this problem.

It's a bit hard to speculate on your unit test output because a unit test producing output like your comment above doesn't seem to exist in the tree. But looks like it is actually showing the problem. \\c in your output is probably actually two characters, \ and c, not a double backslash followed by a c. Ditto the \\n case:

$ python3
Python 3.4.3 (default, Jun 29 2015, 12:16:01) 
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> r'\c'
'\\c'

scop commented 8 years ago

More to-the-point "proof" for my guess about your unit test case:

$ python3
Python 3.4.3 (default, Jun 29 2015, 12:16:01) 
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print({'test': r'\c\n'})
{'test': '\\c\\n'}

scop commented 8 years ago

Also, FWIW, in my opinion the ActiveMQ behavior is just bad. I'd say choosing between bytes/text based on existence of content-length (which is a SHOULD for all SEND/MESSAGE/ERROR frames that have a payload in the STOMP spec, and a MUST if it contains null octets) is a fragile heuristic on a header that's not intended for that purpose, and nothing that helps with "robust interaction". STOMP 1.1+ has the content-type header which is intended for specifying type of the payload, and it even describes how receivers SHOULD treat frames without content-type as binary blobs (~ bytesmessage...)

jasonrbriggs commented 8 years ago

Really wish github had properly threaded discussions sometimes. So just to make it more clear, replying to your comments in turn:

1) I'll add another test and see if I can prove it out. I'm not sure \c is a good example to look at, because that's not part of the standard C escape character set. So \c doesn't get escaped and you'll always see \c (that's actually one backslash, followed by a C character) when you print it. As opposed to \n where the behaviour would be different if you print a raw string r'\n' versus a normal string '\n'.

2) Here's the unit test (haven't pushed yet). I've tested both raw and normal strings:

    def test_pre_encoded1(self):
        hdrs = {"test": '\c\n'}

        self.protocol._escape_headers(hdrs)

        self.assertEqual('\\c\\n', hdrs['test'])

        print('pass 1 %s' % hdrs)
        self.protocol._escape_headers(hdrs)
        print('pass 2 %s' % hdrs)
        self.protocol._escape_headers(hdrs)
        print('pass 3 %s' % hdrs)

        print(hdrs)

    def test_pre_encoded2(self):
        hdrs = {"test": r'\c\n'}

        self.protocol._escape_headers(hdrs)

        self.assertEqual('\\c\\n', hdrs['test'])

        print('pass 1 %s' % hdrs)
        self.protocol._escape_headers(hdrs)
        print('pass 2 %s' % hdrs)
        self.protocol._escape_headers(hdrs)
        print('pass 3 %s' % hdrs)

        print(hers)

3) Agreed about the ActiveMQ behaviour. Awful design decision. I assume it's some stomp1.0 legacy thing that's never been fixed, but given that even in stomp1.0 you could send arbitrary headers, I'm still not sure why anyone would come to the decision that overloading content-length behaviour was a good idea.

scop commented 8 years ago

I think both your assertEquals are incorrect. They should be testing that hdrs['test'] contains the sequence "backslash" "backslash" "c" "backslash" "backslash" "n", i.e. r'\\c\\n', not \\c\\n. The latter is "backslash" "c" "backslash" "n" which is the original string and thus no escaping was done by _escape_headers compared to the input (it was left as is), which is the bug introduced by this change I'm trying to communicate...

scop commented 8 years ago

Aaargh. Actually assertEquals in test_pre_encoded1 should test for r'\\c\n' ("backslash" "backslash" "c" "backslash" "n"), and assertEquals in test_pre_encoded2 should test for r'\\c\\n' ("backslash" "backslash" "c" "backslash" "backslash" "n").

jasonrbriggs commented 8 years ago

Perhaps we should use the same convention as the stomp spec because escaping rules are an absolute pain (to discuss). I don't believe r'\\c\n' is correct:

>>> r'\\c\n'
'\\\\c\\n'
>>> len(r'\\c\n')
5

That looks wrong to me. Consider my original header:

hdrs = {"test": '\c\n'}

That's 3 chars (octet 92, octet 99, octet 10)

Run through the escaping that should convert to octet 92, octet 99, octet 92, octet 110 (a backslash, a c, a backslash, an n).

Adding this to my unit-test:

for c in hdrs['test']:
            print(ord(c))

I get exactly that:

{'test': '\\c\\n'}
92
99
92
110

scop commented 8 years ago

Huh, len(r'\c\n') == 5? It is always 4 here. But maybe you meant len(r'\\c\n') == 5?

Anyway, per STOMP 1.2 rules, I don't think (92, 99, 10) should escape to (92, 99, 92, 110), because when unescaping that back, it becomes (58, 10): 92+99 transforms to 58, and 92+110 to 10. So an encode-decode roundtrip would turn r'\c\n' into ':\n'.

Instead, (92, 99, 10) should escape to (92, 92, 99, 92, 110) which unescapes back to the original.

I think there's a bit of a bug in the STOMP spec's value encoding chapter: it says carriage return, line feed, or colon should be escaped, but backslash should also be in that list.

Also, I now also believe that stomp.utils.parse_headers() is broken: decoding a value should be done in one pass, instead of running the replacement operations all over the entire string in succession. (It also doesn't pay proper attention to the protocol version in use. Neither do the places where PREAMBLE_END_RE and LINE_END_RE are used, but that's another issue.) I'll open separate PRs or issues for these later, let's not delve into them in this one :)

jasonrbriggs commented 8 years ago

I think I'll back out my changes for the moment. And agreed, the escaping of \c\n (as per my description) is wrong considering the reverse (un-escaping) would then not result in the same output. Still of 2 minds regarding escaping idempotency, but when I think about it, I don't think it matters a lot, as long as I stick in some documentation that describes the behaviour.

jasonrbriggs commented 7 years ago

Closing as I don't think this is something that I'll ever get around to fixing. Unless someone submits a PR because it bugs them enough, this falls into the "won't fix" bucket...

jasonrbriggs / stomp.py

using stomp.send_frame fails after a couple of times #82