Open GoogleCodeExporter opened 9 years ago
I tried reproducing the scenario:
I started Asterix using Managix, opened the web interface, and ran a query. The
instance was ACTIVE and the web interface was open when I did the following:
Attempt 1: Manually put the MacBook to sleep.
Wake it up the usual way.
Attempt 2: Set the sleep timer to the minimum (1 min) and wait for the MacBook
to go to sleep.
Wake it up as before.
In neither case was I able to reproduce the scenario. The web interface remained
accessible and I could continue executing queries.
Mike, is there something I am missing? Otherwise, perhaps I can get access to
your system when we meet next and understand the difference in behavior.
Original comment by RamanGro...@gmail.com
on 21 Apr 2013 at 5:31
Here are the logs in the case where (1) worked but it was hung w/o that.
Original comment by dtab...@gmail.com
on 21 Apr 2013 at 9:30
Attachments:
Got the logs this time we hope - here is the new CC log
Original comment by dtab...@gmail.com
on 22 Apr 2013 at 7:53
Attachments:
Original comment by vinay...@gmail.com
on 24 May 2013 at 8:38
Here is the status AFAMIC when the system has changed networks:
==> managix describe -n my_asterix
INFO: Name:my_asterix
Created:Fri May 31 13:37:21 PDT 2013
Web-Url:http://127.0.0.1:19001
State:ACTIVE (Fri May 31 16:01:47 PDT 2013)
So - UNUSABLE is not the norm. :-)
Original comment by dtab...@gmail.com
on 31 May 2013 at 11:27
Original comment by vinay...@gmail.com
on 31 May 2013 at 11:40
As discussed, marking this as 'Won't fix'; we will re-open it if the issue is
seen again.
Original comment by RamanGro...@gmail.com
on 16 Nov 2013 at 2:54
I've seen this "UNUSABLE state" problem after rebooting my machine. The CC log
does not have anything useful: the latest entry is from January 20th, which is
also my last reboot date, and there is nothing from the following 7 days.
Can you check whether you can reproduce this problem by rebooting your
machine while the web interface is accessible?
Btw, running "managix shutdown" changed the status to "INACTIVE"; after that I
could use "managix start" to restart my instance. It might be a good idea to add
this troubleshooting tip to the FAQ.
Original comment by icetin...@gmail.com
on 27 Jan 2014 at 11:44
managix shutdown simply shuts down the back-end ZooKeeper service gracefully and
is useful when one is not using AsterixDB and does not want any daemon
management process to continue running in the background. It does not touch an
AsterixDB instance.
After the shutdown command, the start command internally translates to (a)
starting the back-end ZooKeeper, as it is not running, and (b) starting the
Asterix instance. Step (a) allows ZooKeeper to run local recovery based on its
own logs and ensures that all data is consistent. Once the ZooKeeper state has
been recovered and an Asterix instance is started, no issues are observed in
updating and reading the state maintained in ZooKeeper.
I will attempt to replicate what you observed.
Original comment by ram...@uci.edu
on 28 Jan 2014 at 5:16
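For readers who want to script the workaround discussed in the two comments above (shut Managix down, then start the instance again), here is a minimal Python sketch. It assumes `managix` is on the PATH and that `start` takes `-n <instance>` in the same way as the `describe`, `create` and `stop` invocations quoted in this thread. The helper names `recovery_commands` and `run_recovery` are hypothetical and are not part of Managix.

```python
import subprocess


def recovery_commands(instance):
    """Return the two workaround commands, in the order they should run."""
    return [
        # Stops the back-end ZooKeeper service (per the explanation above).
        ["managix", "shutdown"],
        # Assumption: `start` accepts -n <instance name> like the other subcommands.
        ["managix", "start", "-n", instance],
    ]


def run_recovery(instance, runner=subprocess.run):
    """Run each command in order; with subprocess.run, stop at the first failure."""
    for cmd in recovery_commands(instance):
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        runner(cmd, check=True)
```

Calling `run_recovery("my_asterix")` would execute both commands; the `runner` parameter exists only so the sequence can be exercised without Managix installed.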
Hello, I tried to follow the steps on
http://asterix.ics.uci.edu/documentation/install.html and after the step
managix create -n my_asterix -c $MANAGIX_HOME/clusters/local/local.xml
I get:
INFO: Name:my_asterix
Created:Sat Apr 26 23:06:08 EDT 2014
Web-Url:http://127.0.0.1:19001
State:UNUSABLE
WARNING!:Cluster Controller not running at master
Original comment by getajo...@gmail.com
on 27 Apr 2014 at 3:14
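When checking an instance repeatedly, it can help to pull the state out of the describe output programmatically. The sketch below assumes the output looks like the samples pasted in this thread (a `State:` line, optionally followed by a timestamp in parentheses); the function name `instance_state` is hypothetical, and the real output format may differ between Managix versions.

```python
def instance_state(describe_output):
    """Return the state name from describe-style output, or None if there is no State line."""
    for line in describe_output.splitlines():
        line = line.strip()
        if line.startswith("State:"):
            rest = line[len("State:"):].strip()
            # ACTIVE lines in this thread carry a trailing timestamp; keep only the first token.
            return rest.split()[0] if rest else None
    return None


sample = """INFO: Name:my_asterix
Created:Sat Apr 26 23:06:08 EDT 2014
Web-Url:http://127.0.0.1:19001
State:UNUSABLE
WARNING!:Cluster Controller not running at master"""
print(instance_state(sample))  # prints UNUSABLE
```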
Original comment by dtab...@gmail.com
on 27 Apr 2014 at 4:15
I assume the validation step (managix validate -c <path to cluster
configuration xml>) was successful.
I am missing some critical information needed to ascertain the reason for the
failure. Can you please provide the logs?
If you have not changed the local.xml file that is auto-generated, these logs
would be found at
$MANAGIX_HOME/clusters/local/working_dir/logs/
It would be helpful if we can set up a Skype session (skype id: raman-grover).
It should not take long to look at the environment and figure out the cause.
Please let me know your availability (morning or evening slots are preferred as
I am currently in the Indian Time Zone (PDT + 12:30)).
Original comment by ram...@uci.edu
on 27 Apr 2014 at 5:02
I feel like the answer may be trivial (hopefully). Unfortunately I'm a lowly
CS student who doesn't understand much about the problem or AsterixDB in
general.
Full context: I was first having some issues with my Java version (as things
were defaulting to the Apple Java 1.6 version). After I got that fixed I was
getting this error. I ended up deleting everything in my asterix_mgmt
directory and retrying with a fresh download, so the logs will not give the
full story.
Anyway, I'm available most times tomorrow for a Skype call. I'm in the Eastern
Time Zone (PDT + 3:00).
Original comment by getajo...@gmail.com
on 27 Apr 2014 at 3:30
Attachments:
From your CC logs:
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:414)
at sun.nio.ch.Net.bind(Net.java:406)
The above error suggests that a process is already occupying port 1098.
From your NC logs:
INFO: Completed sharp checkpoint.
java.lang.Exception: Node with this name already registered.
at edu.uci.ics.hyracks.control.cc.work.RegisterNodeWork.doRun(RegisterNodeWork.java:58)
at edu.uci.ics.hyracks.control.common.work.SynchronizableWork.run(SynchronizableWork.java:32)
at edu.uci.ics.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:116)
The NC logs show that the NC was able to connect to a CC process, but the CC
had already received a ping from an NC with the same name.
Looking at the CC and NC logs, it seems that you already have processes
running. These were daemons started as part of an initial attempt to create an
Asterix instance. Did you stop the previous instance before wiping out
MANAGIX_HOME and attempting a re-install?
Executing jps at the command prompt lists the running Java processes for your
account.
The list should not include CCDriver or NCDriver before you attempt to create
an instance. Please confirm.
Original comment by RamanGro...@gmail.com
on 27 Apr 2014 at 4:02
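A quick way to confirm the "Address already in use" diagnosis from the comment above is to try binding the port yourself. This is a minimal Python sketch, not something Managix provides: the function name `port_in_use` is hypothetical, and the default host `127.0.0.1` is an assumption based on the Web-Url shown in this thread (the CC may bind a different interface on your setup).

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding to (host, port) fails, which usually means it is taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Any bind failure is treated as "in use"; this is a coarse check.
            return True
        return False
```

For example, `port_in_use(1098)` returning True before you run the create command would be consistent with a leftover daemon still holding the port mentioned above.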
[deleted comment]
My memory may be failing me. If I remember correctly, I attempted to managix
stop my_asterix before reinstalling, but the system wouldn't allow me to
because it was UNUSABLE. However, that may have been the message when I tried
to start? Anyway...
Now that I'm at the command prompt again, I can officially managix stop
my_asterix. After doing so, jps lists CCDriver (but not NCDriver).
Original comment by getajo...@gmail.com
on 27 Apr 2014 at 4:35
managix stop -n <name of instance> is a valid command in the UNUSABLE state.
It moves the instance to the INACTIVE state after terminating whichever
daemons are alive.
Since you reported that jps is showing a CCDriver process, I would ask you to
terminate it using kill -9 <process id>,
where the process id is shown in the output of the jps command.
Once you have a clean system (jps does not show CCDriver/NCDriver), please
go ahead and retry.
Also, if possible, ping me on Skype (raman-grover) and we can have a live
(support) session.
Original comment by RamanGro...@gmail.com
on 27 Apr 2014 at 4:52
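The "clean system" check described above (jps shows neither CCDriver nor NCDriver) can be sketched in Python as follows. It assumes jps prints one `<pid> <main class>` pair per line, as in the output pasted in this thread; the function name `leftover_drivers` is hypothetical, and the sketch only reports processes, it does not kill them.

```python
def leftover_drivers(jps_output, names=("CCDriver", "NCDriver")):
    """Return (pid, name) pairs for driver processes found in jps-style output."""
    found = []
    for line in jps_output.splitlines():
        parts = line.split()
        # Skip blank lines and lines where jps printed a pid without a class name.
        if len(parts) >= 2 and parts[0].isdigit() and parts[1] in names:
            found.append((int(parts[0]), parts[1]))
    return found
```

An empty list corresponds to the clean state; any pid it returns is a candidate for the kill -9 step mentioned above, after you have confirmed it really is a stale daemon.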
I can get on Skype in about an hour. In the meantime:
working_dir$ jps
2488 Jps
working_dir$ managix create -n my_asterix -c /Users/cameronbasham/s/databases/o/asterix-mgmt/clusters/local/local.xml
INFO: Name:my_asterix
Created:Sun Apr 27 13:37:04 EDT 2014
Web-Url:http://127.0.0.1:19001
State:UNUSABLE
WARNING!:Cluster Controller not running at master
Node Controller not running at the following nodes
127.0.0.1
Original comment by getajo...@gmail.com
on 27 Apr 2014 at 5:39
Attachments:
From the logs, I probably know what's happening here; I will wait for you to be
online.
Original comment by RamanGro...@gmail.com
on 27 Apr 2014 at 6:49
I thought I added you, but I've yet to get a reply. I'm getajob92 on Skype.
Original comment by getajo...@gmail.com
on 27 Apr 2014 at 6:52
Original issue reported on code.google.com by
dtab...@gmail.com
on 21 Apr 2013 at 2:56