Closed: GoogleCodeExporter closed this issue 9 years ago.
Added: node 192.168.1.10 is still running but is not receiving any load.
When I call http://192.168.1.10:8091/pools, I get back:
{"pools":[{"name":"default","uri":"/pools/default","streamingUri":"/poolsStreaming/default"}],"isAdminCreds":false,"uuid":"581cefce-bea5-4001-15e8-ad67000000ea","implementationVersion":"1.7.0","componentsVersion":{"os_mon":"2.2.5","mnesia":"4.4.17","inets":"5.5.2","kernel":"2.14.3","sasl":"2.1.9.3","ns_server":"1.7.0","stdlib":"1.17.3"}}
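For reference, a top-level string field can be pulled out of that response without a JSON library; a minimal, dependency-free sketch (the `extractField` helper is illustrative only, not part of any client API, and the constant below is a trimmed copy of the response above):

```java
// Sketch: extract a top-level string field from the /pools response shown
// above, using plain string searching. A real client would use a JSON parser.
public class PoolsResponse {
    static final String POOLS_JSON =
        "{\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default\","
      + "\"streamingUri\":\"/poolsStreaming/default\"}],"
      + "\"isAdminCreds\":false,"
      + "\"implementationVersion\":\"1.7.0\"}";

    // Returns the value of a top-level string field, or null if absent.
    static String extractField(String json, String field) {
        String needle = "\"" + field + "\":\"";
        int start = json.indexOf(needle);
        if (start < 0) return null;
        start += needle.length();
        int end = json.indexOf('"', start);
        return end < 0 ? null : json.substring(start, end);
    }

    public static void main(String[] args) {
        // Prints 1.7.0, matching the implementationVersion in the response.
        System.out.println(extractField(POOLS_JSON, "implementationVersion"));
    }
}
```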
Original comment by bouri...@gmail.com
on 16 Jun 2011 at 6:06
My issue is similar to Issue #108 and Issue #180.
Original comment by bouri...@gmail.com
on 16 Jun 2011 at 6:16
The way Membase works, the failure of a node will cause errors at the client
level. Are you saying you cannot get any operations through to any other node
of the cluster?
Do things recover when you hit the "failover" button in Membase's web UI?
If you believe there is an issue here, please post a test that demonstrates
what you think is wrong.
Original comment by ingen...@gmail.com
on 23 Jun 2011 at 5:45
"Failover" in the Membase UI works as expected, and I handle it fine on the
client side.
Shutting a node down manually from the terminal (e.g. $ kill or $ sudo
/etc/init.d/membase-server stop) crashes the client.
My cluster setup is described in the issue body above. To reproduce the
failure you can run code similar to this:
import net.spy.memcached.AddrUtil;
import net.spy.memcached.BinaryConnectionFactory;
import net.spy.memcached.MemcachedClient;

import java.io.IOException;

public class memcache_test2 {
    public static void main(String[] args) throws IOException {
        MemcachedClient c = new MemcachedClient(
                new BinaryConnectionFactory(),
                AddrUtil.getAddresses("192.168.1.9:11211 192.168.1.10:11211"));
        String result;
        for (int j = 0; j < 100; j++) {
            for (int i = 0; i < 100000; i++) {
                c.set("hello" + i, 0, "world" + i);
                result = (String) c.get("hello" + i);
            }
        }
    }
}
While the code is running, shut down any node from the terminal ($ sudo
/etc/init.d/membase-server stop) and you will get an exception that ruins
everything. If you instead do a "failover" via the Membase UI, spymemcached
detects the node failure properly and acts as expected (it keeps trying the
failed node for a while and then switches to the live node).
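One client-side mitigation for the "exception that ruins everything" is to wrap each operation in a small retry guard instead of letting a single node-failure exception abort the whole loop; a minimal sketch (the `withRetries` helper and retry count are illustrative, not part of spymemcached's API):

```java
import java.util.function.Supplier;

// Sketch: retry an operation a few times before giving up, so one
// node-failure exception does not abort the whole test run.
public class RetryGuard {
    // Runs op up to maxAttempts times (maxAttempts >= 1); rethrows the
    // last failure if every attempt fails.
    static <T> T withRetries(Supplier<T> op, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e; // node may be down; real code would back off here
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        // Simulated flaky operation: fails twice, then succeeds.
        final int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("node down");
            return "world42";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the test loop above, the set/get pair would go inside such a guard so a dying node costs retries rather than the whole run.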
p.s. Discussion about the same issue:
http://www.couchbase.org/forums/thread/any-good-example-java-code-handles-node-fault#comment-1003508
Original comment by bouri...@gmail.com
on 23 Jun 2011 at 5:57
>The way Membase works, the failure of a node will cause errors at the client
level. Are you saying you cannot get any operations through to any other node
of the cluster?
Yes. After manually shutting down a random node, the client cannot access any
other nodes.
>Do things recover when you hit the "failover" button in Membase's web UI?
Via the UI everything is OK; failover via the UI works fine.
>If you believe there is an issue here, please post a test that demonstrates
what you think is wrong.
Already posted.
Original comment by bouri...@gmail.com
on 23 Jun 2011 at 7:12
Any suggestions for this issue?
Original comment by bouri...@gmail.com
on 27 Jun 2011 at 4:31
I'm reopening this for further investigation so it doesn't get lost.
Original comment by dsalli...@gmail.com
on 27 Jun 2011 at 8:14
This issue has been addressed in 2.7.2. The problem was that the client
wouldn't get an updated configuration.
Original comment by ingen...@gmail.com
on 14 Oct 2011 at 7:19
And where is the example?
Original comment by bouri...@gmail.com
on 14 Oct 2011 at 7:42
Well, this is an issue tracking system, not a FAQ system. :)
In 2.7.2, I've added a test and verified that if the list of URIs has
down/dead nodes in it, the client will still find a live node and configure
itself to do the right thing. If the cluster topology changes, it then adjusts
to the new topology and does the right thing.
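The bootstrap behaviour described above can be sketched as walking the URI list until one node answers; a minimal illustration (the liveness check is injected to keep the example self-contained; a real client would attempt an HTTP GET of /pools on each URI, and the addresses are just the ones from this thread):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Sketch: given a bootstrap URI list that may contain dead nodes, pick the
// first one that responds, as the 2.7.2 fix is described to do.
public class BootstrapPicker {
    // Returns the first URI the liveness check accepts, or null if none do.
    static String firstLiveNode(List<String> uris, Predicate<String> isAlive) {
        for (String uri : uris) {
            if (isAlive.test(uri)) {
                return uri; // configure the client from this node
            }
        }
        return null; // no node reachable
    }

    public static void main(String[] args) {
        List<String> uris = Arrays.asList(
            "http://192.168.1.9:8091/pools",   // pretend this node is down
            "http://192.168.1.10:8091/pools"); // pretend this node is up
        String chosen = firstLiveNode(uris, uri -> uri.contains("1.10"));
        System.out.println("bootstrapping from " + chosen);
    }
}
```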
That said, I just found that that commit was forgotten. I'll need to fix that.
Here comes 2.7.3.
You can see the change here:
http://review.couchbase.org/#change,10026
Original comment by ingen...@gmail.com
on 14 Oct 2011 at 7:53
That change has now been committed and is in the 2.7.3 release.
Original comment by ingen...@gmail.com
on 15 Oct 2011 at 3:07
Original issue reported on code.google.com by
bouri...@gmail.com
on 16 Jun 2011 at 6:04