Closed xaratt closed 11 years ago
@xaratt this is really odd.
What happens when you try to connect to the AWS cache servers from "my.proxy.server" directly? So, in one try you use
printf "get foo\r\n" | nc myserver.0001.use1.cache.amazonaws.com 11211
And in the next try you do
printf "get foo\r\n" | nc ec2-xx-xx-xx-xx.compute-1.amazonaws.com 11211
Do both of them work for you?
Also, does pinging "ec2-xx-xx-xx-xx.compute-1.amazonaws.com" and "myserver.0001.use1.cache.amazonaws.com" resolve to different addresses?
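One way to answer that last question without reading ping output by eye is to compare what the system resolver returns for each name. The sketch below is only an illustration: the helper names `resolve_ipv4` and `same_addresses` are made up here, and it assumes that IPv4 lookups through Python's standard `socket` module are representative of what twemproxy sees on the same host.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


def same_addresses(host_a, host_b, resolve=resolve_ipv4):
    """True if both names currently resolve to the same set of addresses."""
    return resolve(host_a) == resolve(host_b)
```

Calling `same_addresses("myserver.0001.use1.cache.amazonaws.com", "ec2-xx-xx-xx-xx.compute-1.amazonaws.com")` from the proxy host, with the real names substituted, would return `True` if both names currently resolve to the same addresses.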
Thank you for the response.
Yes, when I try to connect directly to the cache servers, both of them work fine:
user@my.proxy.server:~$ printf "get foo\r\n" | nc myserver.0001.use1.cache.amazonaws.com 11211
END
user@my.proxy.server:~$ printf "get foo\r\n" | nc ec2-xx-xx-xx-xx.compute-1.amazonaws.com 11211
END
And pinging both server names resolves to the same address. I can give you the real IP of "ec2-xx-xx-xx-xx", but Amazon's FAQ says that "Currently, all clients to an ElastiCache Cluster must be within the Amazon EC2 network".
I want to try running twemproxy with memcached on my non-AWS server, using a CNAME and an IP for the connection. I'll post the results here.
I made a few new attempts and found a configuration that allows me to use twemproxy with AWS ElastiCache. I created my own CNAMEs that point to Amazon's myserver.000x.use1.cache.amazonaws.com servers, and twemproxy works fine with this strange scheme:
cache1.example.com -> myserver.0001.use1.cache.amazonaws.com -> ec2-xx-xx-xx-xx.compute-1.amazonaws.com
Is this a bug (feature?) in Amazon's DNS system? I don't know.
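To see how the local resolver reports a chain like the one above, something along these lines may help. It is a sketch, not a diagnosis: `describe_host` is a hypothetical helper, and whether intermediate CNAMEs show up in the alias list depends on the platform's resolver.

```python
import socket


def describe_host(hostname, lookup=socket.gethostbyname_ex):
    """Summarise what the resolver reports for hostname."""
    # gethostbyname_ex returns (primary name, alias list, IPv4 address list).
    canonical, aliases, addresses = lookup(hostname)
    return {
        "queried": hostname,
        "canonical": canonical,
        "aliases": list(aliases),
        "addresses": sorted(addresses),
    }
```

Running `describe_host("cache1.example.com")` and `describe_host("myserver.0001.use1.cache.amazonaws.com")` on the twemproxy host and comparing the two results would show whether the extra CNAME changes anything the resolver returns.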
Xaratt, did you ever get to the bottom of this?
We recently saw a similar-sounding issue (twemproxy appears to connect, but then commands result in "ERR Connection timed out") with an Elasticache Redis write endpoint. It was intermittent, but I will attempt to gather more info if it happens again.
@tom-dalton-fanduel, sorry for the delay in responding, but we didn't find the source of this problem or a solution for it. We only added CNAMEs for each of our memcache nodes.
No problem - it looks like this was unrelated to an issue we were looking at!
Confused. Isn't the EC Cluster endpoint a proxy, like twemproxy? Why deal with two proxies rather than setting your app to connect directly to the EC Cluster endpoint?
Twemproxy is more than just a proxy; it provides transparent sharding too. In my case (and I'm guessing @xaratt's too?) twemproxy is used to shard across multiple EC [write] endpoints.
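For readers unfamiliar with what sharding across endpoints means in practice, here is a deliberately simplified consistent-hash ring. It is not twemproxy's implementation: the class name `HashRing`, the use of MD5, the 160 points per server, and the `cache2`/`cache3` hostnames are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping keys to backend servers."""

    def __init__(self, servers, points_per_server=160):
        ring = []
        for server in servers:
            # Place several points per server so keys spread more evenly.
            for i in range(points_per_server):
                ring.append((self._hash(f"{server}-{i}"), server))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "little")

    def server_for(self, key):
        # First point at or after the key's hash, wrapping around the ring.
        idx = bisect.bisect_left(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


servers = [
    "cache1.example.com:11211",
    "cache2.example.com:11211",  # hypothetical
    "cache3.example.com:11211",  # hypothetical
]
ring = HashRing(servers)
print(ring.server_for("foo"))
```

The point of the ring layout is that removing one server only remaps the keys that lived on it; keys owned by the remaining servers stay where they were.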
@tom-dalton-fanduel I am having a similar issue to yours. Did you find out what the problem was? Thanks.
I'm afraid I don't even remember the context for my comment, let alone if we ever solved it. We've since moved away from Twemproxy.
Same... we ended up using aws elasticache instead.
We use AWS elasticache as well, but intermittently we get "ERR Connection timed out". We could not see anything in the logs.
Did you try installing the AWS-specific module? It's supposed to support ejections.
I have not. Can you point me to it? Thanks!
I can't remember specifically, it's been a few years, but a google search should find something. Or start here: https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/Appendix.PHPAutoDiscoverySetup.html
I wonder if https://github.com/twitter/twemproxy/pull/567 was related - until that's merged, ketama_max_hostlen is 86. (if it works for short names but not long names)
But I don't see how that'd possibly be the issue, it'd just hash requests to the wrong host. (snprintf still appends null characters)
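To make the buffer-size argument concrete: a C `snprintf` into an 86-byte buffer keeps at most 85 characters plus the terminating NUL. The Python sketch below only imitates that truncation for ASCII names; the `"<name>-<index>"` label format and the helper names are assumptions for illustration, not taken from twemproxy's source.

```python
KETAMA_MAX_HOSTLEN = 86  # value quoted in the comment above


def truncated_label(server_name, index, max_hostlen=KETAMA_MAX_HOSTLEN):
    """Imitate snprintf(buf, max_hostlen, ...): keep max_hostlen - 1 characters."""
    label = f"{server_name}-{index}"  # assumed label format, illustration only
    return label[: max_hostlen - 1]


def is_truncated(server_name, index, max_hostlen=KETAMA_MAX_HOSTLEN):
    """True if the label would not fit in the buffer without being cut."""
    return len(f"{server_name}-{index}") > max_hostlen - 1
```

The hostnames quoted in this thread are around 40 characters long, so `is_truncated("myserver.0001.use1.cache.amazonaws.com", 0)` is `False`; only much longer names would be cut, which is consistent with the doubt expressed above.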
(looking at this issue while looking into whether timeouts are more likely with elasticache in general - doesn't seem like it)
Leaving a note on this to refer back to later in case anyone else has issues with elasticache memcached - the issue I'm looking into is unrelated
https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/ParameterGroups.Memcached.html suggests there's no timeout, so I'm confused
Elasticache itself doesn't have an idle timeout according to recent documentation for 1.4, not sure if old versions were different
idle_timeout | Default: 0 (disabled). Type: integer. Modifiable: Yes. Changes Take Effect: At Launch. | The minimum number of seconds a client will be allowed to idle before being asked to close. Range of values: 0 to 86400.
I think https://github.com/twitter/twemproxy/pull/324/files#diff-01600ca8f8e542768f785de1842f38b3aeeb315531c63b9d2ce8730a21f72a80 may help (related to the redis sentinel support proposal), but I still get occasional timeouts to elasticache when there's low traffic anyway
Found a strange bug while trying to run twemproxy with a cluster of ElastiCache (Amazon cloud memcached) servers. Amazon uses CNAMEs as entry points for ElastiCache servers, and twemproxy could connect to the backend memcached servers on start but couldn't send any requests to them. If I use the "direct" hostnames for the backend servers, all requests succeed.
twemproxy config:
twemproxy was running as
Here is part of twemproxy log: http://pastebin.com/DTE8gAva
When I modified servers section:
I received response:
And, of course, *.cache.amazonaws.com could be resolved from instance where twemproxy is running:
P.S. Oct 26 code snapshot was used; Ubuntu 12.04.1 x86_64