Closed taazza closed 15 years ago
When did you update/install your Nanite gem? The current version on gemcutter.org is 0.4.12, and I've never seen that happen.
On a second note, I'll see how I go with the AMQP 0.6.5 gem today, but still, I'd encourage you to update your Nanite installation.
We use gem bundler, and the current version of nanite on gemcutter is 0.4.1.10: http://gemcutter.org/gems/nanite Where are you seeing 0.4.12? Am I missing something here?
The 0.4.1.2 version is right there in the list. Version 0.4.1.10 is not the official Nanite gem. I'm afraid it's the RightScale fork, and it's full of custom patches for the RightScale product and not properly tested from my point of view. Please install 0.4.1.2, and I'll talk to Ezra about how that version ended up on Gemcutter.
Aah... 0.4.1.2! I was looking for 0.4.1 [12] as you had mentioned earlier.
When someone installs nanite, 0.4.1.[10] gets selected by default. No worries, I will give this a shot and hopefully the problem disappears!
I'll try to push an updated gem later today.
Thanks! Pls try and get the logging issue in as well ;) Your help and prompt responses have been very helpful! Thanks a bunch! Pls close both issues once you are done with the build & push.
I'm assuming the updated gem will be posted on gemcutter. Thanks again!
The gem on gemcutter has been updated. Let me know if there are any problems.
No such luck. Tested it out with nanite-0.4.1.13, and after running for a few hours it runs into the same problem. Exception attached below:
/home/test/v_0.1/vendor/gems/gems/amqp-0.6.5/lib/amqp/buffer.rb:252:in `min': comparison of Array with Array failed (ArgumentError)
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.13/lib/nanite/cluster.rb:132:in `each'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.13/lib/nanite/cluster.rb:132:in `min'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.13/lib/nanite/cluster.rb:132:in `least_loaded'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.13/lib/nanite/cluster.rb:22:in `__send__'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.13/lib/nanite/cluster.rb:22:in `targets_for'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.13/lib/nanite/mapper.rb:193:in `send_request'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.13/lib/nanite/mapper.rb:186:in `request'
from tester.rb:58:in `start'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/em/timers.rb:51:in `call'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/em/timers.rb:51:in `fire'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `call'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
from tester.rb:41:in `start'
from tester.rb:70
Had to reboot the machine.
As for logging: the mapper is all set, the INFO issue has disappeared. But the agent still logs each request at INFO level:
[Sat, 21 Nov 2009 03:35:44 -0500] INFO: SEND [result] <9119b16dc7d01d87ea61e42753b6c0be>
[Sat, 21 Nov 2009 03:35:44 -0500] INFO: RECV [result] <9119b16dc7d01d87ea61e42753b6c0be>
which leads to big log files. Pls reopen this issue. Thx
Are you using Redis as state storage?
Somehow the status of an agent comes out as an array from the state storage. It would help me find out what's going on if you could patch cluster.rb at line 132 to output a[1] and b[1]. Otherwise it will be hard for me to debug. I'll have a hard look at the data coming into the state store, but it'd be easier to figure out with that output.
I'll look into the agent logging as well, I thought I got them all.
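A minimal sketch of the kind of debug patch requested above. The actual code at cluster.rb line 132 is not shown in this thread, so the method name `least_loaded_debug`, the `[name, state]` pair layout, and the `:status` key are all assumptions inferred from the a[1]/b[1] hint and the stack trace. A plain IO object stands in for Nanite's logger.

```ruby
# Hypothetical stand-in for the comparison at cluster.rb:132; NOT Nanite's actual source.
# Each candidate is assumed to be a [name, state] pair where state[:status] holds the load.
def least_loaded_debug(candidates, log = $stderr)
  candidates.min do |a, b|
    # Print both states before comparing, so a non-numeric status shows up in the log.
    log.puts "[ARGUMENT_ERROR_PATCH] a[1]=#{a[1].inspect} b[1]=#{b[1].inspect}"
    a[1][:status] <=> b[1][:status]
  end
end

# Illustrative data only; the agent names are taken from the thread, the values are made up.
sample = [
  ['nanite-SMEBARUTHI', { :status => 0.46 }],
  ['nanite-ROJA',       { :status => 0.0 }]
]
least_loaded_debug(sample)
```

With numeric statuses the call returns the pair with the lowest status; the log line is what would reveal a bad value right before the comparison fails.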
Nope, not using Redis. Let me patch and rebuild the gem and test it out.
I will send you the logs soon. I don't understand why I have to restart the machine for the problem to disappear. Anyway, thanks for taking a look at the issue; we are out of bandwidth to contribute at the moment.
We will pitch in soon. Thanks for all your effort/help. Cheers!
I printed the candidates variable.
When you start the mapper and everything is fine, here is what gets printed:
INFO: [ARGUMENT_ERROR_PATCH] candidates -> nanite-SMEBARUTHI timestamp1258956831 tags status0.0 services/masala/process/thadka/process/lao/process/test/execute/vayudooth/process/khale/process/thadayam/process nanite-ROJA timestamp1258956827 tags status0.0 services/masala/process/thadka/process/lao/process/test/execute/vayudooth/process/khale/process/thadayam/process
And when things go wrong and the "comparison of Array with Array failed" error pops up, this is what gets printed:
INFO: [ARGUMENT_ERROR_PATCH] candidates -> nanite-SMEBARUTHI timestamp1259006907 tags statusno status services/masala/process/thadka/process/lao/process/test/execute/vayudooth/process/khale/process/thadayam/process nanite-ROJA timestamp1259006915 tags status0.46 services/masala/process/thadka/process/lao/process/test/execute/vayudooth/process/khale/process/thadayam/process
[THIS SEEMS TO BE THE ISSUE: for nanite-SMEBARUTHI, "no status" gets printed instead of a numeric value]
Hope this helps.
Thanks, that does help. I'll look into it.
Sorry for the delay on this one. The problem seems to be that your agent is incapable of executing the command `uptime` on the machine it's running on. What operating system is it, and what happens when you fire up a small Ruby script and just put `uptime` in it? Either way, the mapper needs to be fixed to not use the status value when it's just "no status".
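To illustrate why a "no status" value produces this particular error, here is a small self-contained sketch. It assumes candidates are `[name, state]` pairs with a `:status` key (inferred from the thread, not confirmed against Nanite's source), and `least_loaded_numeric` is a hypothetical guard, not the fix that went into the gem.

```ruby
# Assumed data layout: [name, state] pairs, as suggested by the a[1]/b[1] hint in the thread.
candidates = [
  ['nanite-SMEBARUTHI', { :status => 'no status' }],
  ['nanite-ROJA',       { :status => 0.46 }]
]

# Comparing a String with a Float via <=> yields nil, so min cannot order
# the two pairs and raises ArgumentError, naming the class of the pairs (Array).
error_message = nil
begin
  candidates.min { |a, b| a[1][:status] <=> b[1][:status] }
rescue ArgumentError => e
  error_message = e.message
end

# Hypothetical guard: only consider agents that reported a numeric status.
# Returns nil when no candidate has a usable status.
def least_loaded_numeric(candidates)
  usable = candidates.select { |_name, state| state[:status].is_a?(Numeric) }
  usable.min_by { |_name, state| state[:status] }
end

picked = least_loaded_numeric(candidates)
```

Filtering before comparing is one option; another would be to treat a missing status as a very high load so the agent is still selectable as a last resort. Which behaviour the mapper should have is a design decision this sketch does not settle.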
Matt, we are on the Ubuntu 8.04 Hardy release. When we re-fire the mapper, it runs for a while before it runs into the problem again.
This repeats till we reboot the system.
Could you try overwriting the default status proc with one that logs a debug message, so I can see what the problem might be? It would be nice to fix the root cause of this. You need to change this in the agent's init.rb file, and then watch the log file when it happens again.
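status_proc = lambda do
  begin
    # Backticks run the shell command and return its output.
    parse_uptime(`uptime`)
  rescue
    # Log the underlying exception so the root cause shows up in the agent log.
    Nanite::Log.error($!)
    'no status'
  end
end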
Thanks!
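For checking `uptime` outside of Nanite, as suggested earlier in the thread, a standalone script along these lines may help. `load_from_uptime` and `debug_status` are hypothetical helpers written for this sketch (they are not Nanite's `parse_uptime`), and the regular expression is an assumption about typical `uptime` output.

```ruby
# Hypothetical helper, NOT Nanite's parse_uptime: extracts the 1-minute load average
# from uptime-style output, or returns nil when the text does not match.
def load_from_uptime(text)
  match = text.to_s.match(/load averages?:\s*([\d.]+)/)
  match && match[1].to_f
end

# Runs `uptime` and reports why no status could be produced instead of
# silently falling back to 'no status'.
def debug_status(log = $stderr)
  output = `uptime`
  load_from_uptime(output) || begin
    log.puts "uptime returned unparsable output: #{output.inspect}"
    'no status'
  end
rescue StandardError => e
  # For example Errno::ENOENT when the command cannot be found or spawned.
  log.puts "uptime failed: #{e.class}: #{e.message}"
  'no status'
end

puts debug_status.inspect
```

If this prints a number when run by hand but the agent still reports "no status" after a few hours, the logged exception class should narrow down whether the process can no longer spawn the command.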
Not sure if this is an amqp error or a nanite error, I have posted it on amqp as well.
/vendor/gems/gems/amqp-0.6.5/lib/amqp/buffer.rb:252:in `min': comparison of Array with Array failed (ArgumentError)
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.10/lib/nanite/cluster.rb:137:in `each'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.10/lib/nanite/cluster.rb:137:in `min'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.10/lib/nanite/cluster.rb:137:in `least_loaded'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.10/lib/nanite/cluster.rb:23:in `__send__'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.10/lib/nanite/cluster.rb:23:in `targets_for'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.10/lib/nanite/mapper.rb:198:in `send_request'
from /home/test/v_0.1/vendor/gems/gems/nanite-0.4.1.10/lib/nanite/mapper.rb:191:in `request'
from base_prog.rb:58:in `start'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/em/timers.rb:51:in `call'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/em/timers.rb:51:in `fire'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `call'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
from /home/test/v_0.1/vendor/gems/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
from base_prog.rb:41:in `start'
from base_prog.rb:70
This happens a lot. And when it happens, it continues to happen repeatedly every couple of minutes till a restart is done. Wondering if this has to do with rabbitmq/amqp or the state of the nanite.
Any thoughts would be greatly appreciated. Thanks!