Open rmoff opened 7 years ago
We ran into the same issue before, and the only way we recovered was to manually kill the processes listed by ps -ef and start the stack again.
When you run confluent status, does confluent current or echo $CONFLUENT_CURRENT (if it's set) point to the runtime directory of the deployment that is currently running?
If you've set CONFLUENT_CURRENT but ran confluent status from a terminal that doesn't have this env var set, the CLI has no way to find the descriptors for the currently running services. You might want to use lsof to figure out which directory the running services are using.
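A minimal sketch of one way to recover that directory, assuming the service command lines contain the runtime path (the grep for confluent. in ps -ef output later in this thread suggests they do). The function name extract_runtime_dirs is made up for illustration, and the pattern is a guess at what the directory names look like:

```shell
# Hypothetical helper: read a process listing on stdin and print every
# distinct path that ends in a "confluent.<suffix>" component.
extract_runtime_dirs() {
  grep -oE '/[^ =:,]*confluent\.[A-Za-z0-9]+' | sort -u
}
```

Usage would be ps -ef | extract_runtime_dirs. Running lsof -p <pid> against one of the service PIDs is another way to see which files under that directory the process has open.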
In my case, CONFLUENT_CURRENT is not set. But according to this link, if it is not set, it defaults to /tmp.
confluent current does show the runtime dir under /tmp.
Here is an observation/issue we are facing.
root user -- confluent start (successful)
root user -- confluent status (shows all services are UP)
root user -- confluent current (shows /tmp/confluent.######)
A non-root user logs into the same server while the services are up and running.
non-root user -- confluent status (shows all services are DOWN)
non-root user -- sudo confluent status (shows all services are UP)
non-root user -- confluent current (shows same /tmp/confluent.###### as above)
What I did notice is that by default /tmp/confluent.###### has rwx------ permission for root (or whichever user starts the services). So no other user is able to read that dir or the files in it. confluent.current also has rwx------ permission, again owned by and accessible only to the owner (in this case root).
Note: I installed the confluent package with yum as root. Not sure if that has any implication.
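A quick way to confirm the permission theory from the non-root account is to test whether the runtime directory can be read and traversed. This is only a sketch; check_runtime_dir_access is a made-up name, and the path to pass in is whatever confluent current prints:

```shell
# Hypothetical helper: report whether the current user can list and enter
# the given runtime directory.
check_runtime_dir_access() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "missing or unreachable: $dir"
    return 2
  fi
  if [ -r "$dir" ] && [ -x "$dir" ]; then
    echo "accessible: $dir"
  else
    echo "not accessible to $(id -un): $dir"
    return 1
  fi
}
```

For example, check_runtime_dir_access "$(confluent current)" run as the non-root user should print the "not accessible" line if the rwx------ mode is what hides the services from confluent status.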
I am also facing the same issue as a non-root user, but it's fine for the root user.
I also faced the same issue. It can mean zookeeper is already running from init.d, so just try sudo service zookeeper stop and see whether that fixes it.
Hitting this issue again. It seems that different terminal sessions end up with different CONFLUENT_CURRENT values, all based on permutations of /var/folders/q9/2tg_lt9j6nx29rvr5r5jn_bw0000gp/T/confluent.xxxxxxx
I'm definitely not doing anything to set CONFLUENT_CURRENT myself.
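One thing that might sidestep the per-session values is to pin CONFLUENT_CURRENT to a fixed location in every shell. A sketch, assuming the CLI honours the variable the way the earlier comments describe; the function name and the default directory are made up:

```shell
# Hypothetical helper for a shell profile: point CONFLUENT_CURRENT at one
# fixed directory so every terminal session agrees on it.
pin_confluent_current() {
  dir=${1:-"$HOME/.confluent-current"}   # assumed default, pick any stable path
  mkdir -p "$dir" || return 1
  CONFLUENT_CURRENT=$dir
  export CONFLUENT_CURRENT
}
```

Calling pin_confluent_current from ~/.bashrc or ~/.zshrc before using the CLI would give all sessions the same value. Whether that fully removes the mismatch is untested here.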
Having to wheel out this rather nasty way of killing things:
ps -ef|grep confluent.|grep -v grep|awk '{print $2}'|xargs -Ifoo kill -9 foo
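A slightly gentler variant of the same idea, as a sketch: send SIGTERM first so the services get a chance to shut down cleanly, and only fall back to SIGKILL for whatever is still alive after a grace period. stop_pids and the GRACE variable are names invented here:

```shell
# Hypothetical helper: TERM the given PIDs, wait up to $GRACE seconds
# (default 10), then KILL any that have not exited.
stop_pids() {
  grace=${GRACE:-10}
  for pid in "$@"; do
    kill -TERM "$pid" 2>/dev/null || true
  done
  while [ "$grace" -gt 0 ]; do
    alive=0
    for pid in "$@"; do
      kill -0 "$pid" 2>/dev/null && alive=1
    done
    [ "$alive" -eq 0 ] && return 0
    sleep 1
    grace=$((grace - 1))
  done
  for pid in "$@"; do
    kill -KILL "$pid" 2>/dev/null || true
  done
  return 0
}
```

It can be fed from the same pipeline: stop_pids $(ps -ef | grep 'confluent\.' | grep -v grep | awk '{print $2}')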
I have the same problem. 'confluent status' returns [DOWN], and 'confluent stop' and 'confluent log' don't work...
I just found that there are two confluent runtime folders under /tmp. One of the folders is empty and the other contains the files of the currently running Confluent instance. When I run 'confluent current', it returns the name of the empty folder! I noticed that the file /tmp/confluent.current has something to do with the confluent cli. I updated the file to match the currently running kafka instance, and 'confluent log kafka' now works again. But confluent status still doesn't work...
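To spot this situation quickly, something like the following can list the runtime folders next to what confluent.current says. It is a sketch that assumes confluent.current simply holds the path of the active runtime folder, which is what the fix above implies; inspect_runtime_dirs is a made-up name:

```shell
# Hypothetical helper: show the contents of confluent.current and flag each
# confluent.* folder under the base directory as empty or non-empty.
inspect_runtime_dirs() {
  base=${1:-/tmp}
  if [ -f "$base/confluent.current" ]; then
    echo "confluent.current -> $(cat "$base/confluent.current")"
  fi
  for dir in "$base"/confluent.*/; do
    [ -d "$dir" ] || continue
    dir=${dir%/}
    if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
      echo "non-empty $dir"
    else
      echo "empty $dir"
    fi
  done
}
```

If the path printed on the first line is one of the "empty" folders, the CLI is probably looking at the wrong deployment.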
To work around the issue, always run the confluent cli from /tmp (or $CONFLUENT_CURRENT if defined), or update bin/confluent as below:
...
[[ $# -lt 1 ]] && usage

requirements

cd "$confluent_current_dir"
command="${1}"
...
I am using confluent 4.
I encountered this issue and tried the following. It works! I am using confluent oss 5.0.0.
Problem:
user@user-Lenovo-G400:~$ confluent start
This CLI is intended for development only, not for production
https://docs.confluent.io/current/cli/index.html
Using CONFLUENT_CURRENT: /home/user/confluent-5.0.0/confluent.0C1Oma4q
Starting zookeeper
Zookeeper failed to start
zookeeper is [DOWN]
Cannot start Kafka, Zookeeper is not running. Check your deployment
Solution:
user@user-Lenovo-G400:~$ sudo /home/user/confluent-5.0.0/bin/zookeeper-server-stop
user@user-Lenovo-G400:~$ confluent start
This CLI is intended for development only, not for production
https://docs.confluent.io/current/cli/index.html
Using CONFLUENT_CURRENT: /home/user/confluent-5.0.0/confluent.0C1Oma4q
Starting zookeeper
zookeeper is [UP]
Starting kafka
kafka is [UP]
Starting schema-registry
schema-registry is [UP]
Starting kafka-rest
kafka-rest is [UP]
Starting connect
connect is [UP]
Starting ksql-server
ksql-server is [UP]
user@user-Lenovo-G400:~$
Maybe someone will find it useful!
Seems this issue is still there. confluent status does not seem to work.
But it's clearly running:
This was after numerous days suspending/unsuspending my laptop, having previously started the stack up.
This issue causes two problems:
confluent log zookeeper shows:
I don't quite know how my setup got into the state it did, but the CLI needs to improve how it detects whether processes are running.
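For illustration only, here is a generic liveness check of the kind being asked for. It is not how the CLI actually works; the pid file it reads and the name pid_file_is_live are hypothetical. It tries kill -0 first and falls back to ps -p, since kill -0 also fails for a live process owned by another user:

```shell
# Hypothetical helper: succeed if the PID stored in the given file belongs
# to a process that currently exists.
pid_file_is_live() {
  pid_file=$1
  [ -r "$pid_file" ] || return 1
  pid=$(cat "$pid_file")
  case $pid in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
  esac
  kill -0 "$pid" 2>/dev/null || ps -p "$pid" >/dev/null 2>&1
}
```

A status command built on a check like this would still need read access to the pid file, so the directory permission problem described earlier in the thread applies here too.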