slimta / python-slimta-cloudstorage

Adds a queue storage backend interfacing with common service providers.
MIT License

TypeError and some doubts about cloudstorage #1

Closed: rafaelnovello closed this issue 10 years ago

rafaelnovello commented 10 years ago

Hi Ian!

First of all, slimta cloudstorage is awesome! It will help me a lot!

It works correctly, but sometimes I get the following error:

Traceback (most recent call last):
  File "/home/rafael/prj/mailerweb/disparador/local/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
    result = self._run(*self.args, **self.kwargs)
  File "/home/rafael/prj/mailerweb/disparador/local/lib/python2.7/site-packages/slimta/queue/__init__.py", line 283, in _load_all
    for entry in self.store.load():
  File "/home/rafael/prj/mailerweb/disparador/local/lib/python2.7/site-packages/slimta/cloudstorage/aws.py", line 146, in list_messages
    timestamp, attempts = self.get_message_meta(id)
  File "/home/rafael/prj/mailerweb/disparador/local/lib/python2.7/site-packages/slimta/cloudstorage/aws.py", line 138, in get_message_meta
    timestamp = json.loads(timestamp_raw)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
<Greenlet at 0x1a4bb90: <bound method Queue._load_all of <Queue at 0x1a4bc30>>> failed with TypeError

I also would like to ask a few things:

  1. Does cloudstorage take care of deleting messages from the S3 bucket?
  2. I'm running slimta on many EC2 spot servers and I would like to ensure delivery of messages even if some server powers off. Can I trust CloudStorage for that?

Thank you so much!

icgood commented 10 years ago

Hey Rafael, I'm impressed by how quickly you've gotten onto that new extension!

I just pushed out a new version that I'm hoping corrects the issue you were seeing. It seemed as though object metadata did not get written correctly when it was written after the message contents. I'm not sure why; I'm a Rackspace developer, not an Amazon one :smiley:. Odd, though, I thought I had tested that successfully beforehand.
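
For anyone who hits this before upgrading: the failure mode is json.loads being handed None because the object metadata was never written. Here is a minimal sketch of the kind of guard involved, assuming the metadata arrives as a plain dict (the function and key names are illustrative, not the actual aws.py internals):

```python
import json

def load_message_meta(metadata):
    # `metadata` is assumed to be the S3 object metadata dict for one
    # stored message; these key names are made up for illustration.
    timestamp_raw = metadata.get('timestamp')
    attempts_raw = metadata.get('attempts')
    if timestamp_raw is None or attempts_raw is None:
        # json.loads(None) is exactly what raised "TypeError: expected
        # string or buffer" in the traceback above; raise a clearer
        # error instead when the metadata is missing.
        raise KeyError('message metadata missing or incomplete')
    return json.loads(timestamp_raw), json.loads(attempts_raw)
```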

As for your other questions:

Regarding question 1: Yes, once a message has been successfully relayed or failed, it is removed from the S3 bucket.

Regarding question 2: Good question. It's a brand new extension, so let me share the big concern I have: I'm unsure about multiple, simultaneous instances of slimta connected to the same S3 bucket. If an instance goes down, you can start a new instance connected to the same bucket and it will pick up where the previous one left off. But attempting to run two concurrent instances on one bucket will likely cause strange problems, such as duplicate deliveries.

rafaelnovello commented 10 years ago

Hi Ian! Thanks for replying so quickly!

I'll test the new version and report to you!

About my questions:

  1. Apparently the messages are not being deleted from S3. Could I have done something wrong?
  2. I don't know much about S3, but concurrency control is not the goal of that service; SQS exists for that. In the documentation you mention SQS for when the reception of messages is separated from the relaying. How can I separate these two? I believe that this way, using SQS, there would be no concurrency problems. What do you think? (See the sketch below for the kind of setup I have in mind.)
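
From my reading of the extension's docs, I imagine a two-sided setup something like this. The class names come from the traceback and the docs, but the exact constructor arguments and the bucket/queue names are my assumption:

```python
import boto
import boto.sqs

from slimta.cloudstorage import CloudStorage
from slimta.cloudstorage.aws import SimpleStorageService, SimpleMessageQueue

# Reception side: message contents go to S3, and a reference to each
# new message goes to SQS.
s3_conn = boto.connect_s3()
bucket = s3_conn.get_bucket('slimta-queue')  # bucket name is made up
s3 = SimpleStorageService(bucket)

sqs_conn = boto.sqs.connect_to_region('us-east-1')
queue = sqs_conn.create_queue('slimta-queue')  # queue name is made up
sqs = SimpleMessageQueue(queue)

# Passing both to CloudStorage should let relaying processes poll SQS
# for new messages instead of scanning the bucket.
storage = CloudStorage(s3, sqs)
```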

Ian, thanks so much for the help!

icgood commented 10 years ago

Cool, definitely check out the new version. It should resolve your first question, which I believe was related to the TypeError exception you were seeing.

I think I see what you're saying now. Let me clarify: there is only one case in which I believe messages may be delivered twice. Consider the following:

  1. One thousand messages are received into the queue and written to S3.
  2. Two separate slimta processes (P1 and P2) are listening on the SQS queue, and each receives 500 messages.
  3. P1 is killed and restarted.
  4. P1 scans the S3 bucket and now thinks it "owns" all one thousand messages.
  5. P1 delivers all one thousand messages.
  6. P2 delivers all of its 500 messages.

See the problem? The slimta processes will scan the S3 bucket on startup as a recovery mechanism.

Of course, all of this is only a problem if you want to have more than one slimta process running against the same bucket. I can think of a couple of workarounds: using separate buckets for each process, or using different key prefixes for each process. Either one negates the issue and lets slimta queues function properly even if they are killed and restarted. There's a minimal sketch of the prefix approach below.
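
A rough sketch of the prefix idea, assuming SimpleStorageService accepts a prefix keyword (that parameter name is my assumption, so check the extension's docs before relying on it):

```python
import socket

import boto

from slimta.cloudstorage import CloudStorage
from slimta.cloudstorage.aws import SimpleStorageService

s3_conn = boto.connect_s3()
bucket = s3_conn.get_bucket('slimta-queue')  # bucket name is made up

# One prefix per host keeps each process's recovery scan disjoint, so a
# restarted process only ever "owns" the messages it wrote itself.
prefix = 'queue/{0}/'.format(socket.gethostname())

s3 = SimpleStorageService(bucket, prefix=prefix)
storage = CloudStorage(s3)
```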

I should have some more time to think about this next week. Let me know what you see with the new version!

Ian


rafaelnovello commented 10 years ago

Hi Ian!

Thanks for the great explanation! Now I understand how it works!

I think it would be easier to have different buckets for each Slimta process. I'll think more about it too.

I'll test the new version soon and report to you!

Thanks!!

rafaelnovello commented 10 years ago

Ian, sorry for the delay.

I tested the new version and it worked fine! But I'll try a simpler approach with only the slimta core, and I'll probably ask you a few questions by email, okay?