ApolloAuto / apollo

An open autonomous driving platform
Apache License 2.0
25.12k stars 9.7k forks

Apollo bootstrap.sh Error #5344

Closed Triangle001 closed 5 years ago

Triangle001 commented 6 years ago

After building Apollo successfully, I ran the command "bash scripts/bootstrap.sh" to launch Apollo, but the connection was refused. It shows:

============================
[ OK ] Build passed!
[INFO] Took 2415 seconds

apollo@in_dev_docker:/apollo$ bash scripts/bootstrap.sh
Started supervisord with dev conf
Start roscore...
voice_detector: started
unix:///tmp/supervisor.sock refused connection

What should I do? Thanks.

DongAi commented 6 years ago

When I was testing Apollo, I ran into a similar issue and found an approach that might help solve your problem. First case:

  1. Type the command supervisorctl and make sure you can enter the supervisorctl interpreter environment. If an error like "unix:///tmp/supervisor.sock no such file" occurs, it means the supervisord service has not been started.
  2. Use the command /usr/bin/python /usr/local/bin/supervisord to start it, then try step 1 again. You can use the command supervisorctl version to check the version; Apollo uses 3.3.3 or 3.3.4.

Second case:

  1. Check whether the file /etc/supervisord.conf exists. If not, that is the root of the issue; go to step 2.
  2. There should be an executable named echo_supervisord_conf in the /usr/bin/ directory. Use the command echo_supervisord_conf > /etc/supervisord.conf to create the missing file.
  3. Go to step 2 of the first case to start the supervisord service.
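
For what it's worth, the checks in the two cases above could be scripted roughly like this. It is only a sketch: diagnose_supervisor and its two optional arguments are names made up for illustration, and the default paths are simply the ones mentioned in this thread.

```shell
# Sketch of the checks described above. diagnose_supervisor is a hypothetical
# helper name; the default paths are the ones quoted in this thread.
diagnose_supervisor() {
    conf="${1:-/etc/supervisord.conf}"
    sock="${2:-/tmp/supervisor.sock}"

    # Second case: the configuration file is missing.
    if [ ! -f "$conf" ]; then
        echo "missing config: run echo_supervisord_conf > $conf"
        return 1
    fi

    # First case: no socket file means supervisord has not been started.
    if [ ! -S "$sock" ]; then
        echo "no socket at $sock: start supervisord, then retry supervisorctl"
        return 2
    fi

    echo "config and socket found: try supervisorctl version"
    return 0
}
```

Calling diagnose_supervisor with no arguments uses the default paths. A return code of 0 only means both files exist; it does not prove that supervisord is accepting connections.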

Other tips:

  1. Use ps aux | grep supervisord to make sure the supervisord service is running.
  2. If supervisor version 3.0 is installed on your machine, you might need to remove it and install version 3.3.3 or 3.3.4. This is not strictly necessary, though; the goal is just to get it working.
  3. To install a new version, check the supervisor website.
  4. Check the [program:dreamview] section of the file apollo/modules/tools/supervisord/release.conf to see why supervisorctl can start a service named dreamview.

Triangle001 commented 6 years ago

Thanks. The problem has been solved. I changed the config file "apollo/modules/tools/supervisord/dev.conf" as follows. The original lines were:

;[inet_http_server]         ; inet (TCP) server disabled by default
;port=127.0.0.1:9001        ; ip_address:port specifier, *:port for all iface

I removed the leading ";" from each line:

[inet_http_server]         ; inet (TCP) server disabled by default
port=127.0.0.1:9001        ; ip_address:port specifier, *:port for all iface

This way, supervisor supports communicating with the server over HTTP.
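
The same edit can be sketched as a small shell function instead of editing by hand. This is an illustration only: enable_inet_http_server is a made-up name, and it assumes the two lines start with ";" exactly as in the default config quoted in this thread. It keeps a .bak copy of the original file.

```shell
# enable_inet_http_server is a hypothetical helper. It removes the leading ";"
# from the [inet_http_server] header and its port line in the given file.
enable_inet_http_server() {
    conf="$1"
    cp "$conf" "$conf.bak" || return 1   # keep the original for comparison
    sed \
        -e 's/^;\[inet_http_server]/[inet_http_server]/' \
        -e 's/^;port=127\.0\.0\.1:9001/port=127.0.0.1:9001/' \
        "$conf.bak" > "$conf"
}
```

Usage would be enable_inet_http_server /apollo/modules/tools/supervisord/dev.conf, followed by restarting supervisord so the new settings are read.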

natashadsouza commented 6 years ago

Closing this issue as the problem seems to be solved. Feel free to reopen it if you have additional questions.

IneverStop commented 6 years ago

I have the same problem. I tried @Triangle001's solution, but it still does not work. Then I tried @DongAi's solution, which did not work either. When I type supervisorctl, it shows

unix:///tmp/supervisor.sock refused connection
supervisor>

It seems I can enter the supervisor command line, but something is still wrong.

Then I typed /usr/bin/python /usr/local/bin/supervisord, and it says:

/usr/local/lib/python2.7/dist-packages/supervisor/options.py:298: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  'Supervisord is running as root and it is searching '
Unlinking stale socket /tmp/supervisor.sock

Besides, my /tmp/supervisor.sock is empty.

Then I typed supervisorctl again, and nothing changed.

I tried supervisorctl version, and it still says:

unix:///tmp/supervisor.sock refused connection

I would really appreciate it if someone could help me.
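
The UserWarning above asks for an explicit -c argument with an absolute path. Here is a minimal sketch of choosing that path, assuming the dev.conf and release.conf files under /apollo/modules/tools/supervisord and the container hostnames in_dev_docker and in_release_docker. supervisord_conf_for_host is a hypothetical helper name, not something Apollo provides.

```shell
# supervisord_conf_for_host is a hypothetical helper. It prints the absolute
# config path to pass to "supervisord -c" for the given container hostname.
supervisord_conf_for_host() {
    if [ "$1" = "in_release_docker" ]; then
        echo "/apollo/modules/tools/supervisord/release.conf"
    else
        # Any other hostname is treated as the dev container (an assumption).
        echo "/apollo/modules/tools/supervisord/dev.conf"
    fi
}
```

Inside the container this could be used as supervisord -c "$(supervisord_conf_for_host "$HOSTNAME")", so supervisord no longer searches the default locations.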

IneverStop commented 6 years ago

I have solved this problem. My Apollo version is r3.0.0, and it takes more work to solve the problem there: /etc/supervisord.conf also needs to be changed.

natashadsouza commented 6 years ago

That's great! If possible, please share the additional steps you mentioned to help other developers stuck on the same issue. Thanks!

whuzxy commented 6 years ago

@IneverStop Hi, I ran into the same problem. Could you share your steps?

IneverStop commented 6 years ago

@whuzxy Sorry, I haven't checked GitHub for the past few days. Change both /apollo/modules/tools/supervisord/dev.conf and /etc/supervisord.conf as @Triangle001 describes, and the problem is easily solved.

whuzxy commented 6 years ago

@IneverStop Thank you, I solved the issue just as you said.

natashadsouza commented 6 years ago

@IneverStop thank you very much for sharing the fix.

CCodie commented 6 years ago

@IneverStop @whuzxy @natashadsouza Hi, I have exactly the same error. Can you help me? I installed the r3.0.0 branch (--branch r3.0.0) and changed both /apollo/modules/tools/supervisord/dev.conf and /etc/supervisord.conf as shown below.

;[unix_http_server]
;file=/tmp/supervisor.sock   ; the path to the socket file
;chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file uid:gid owner
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

[inet_http_server]         ; inet (TCP) server disabled by default
port=127.0.0.1:9001        ; ip_address:port specifier, *:port for all iface
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

...

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as in [*_http_server] if set
;password=123                ; should be same as in [*_http_server] if set
;prompt=mysupervisor         ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history  ; use readline history if available

But when I try to run bash scripts/bootstrap.sh, the following error occurs.

@in_dev_docker:/apollo$ bash scripts/bootstrap.sh 
Started supervisord with dev conf
Start roscore...
voice_detector: started
dreamview: ERROR (spawn error)

I would really appreciate it if you could help. Thanks !

DongAi commented 6 years ago

Hi CCodie, could you please post the dreamview error output from the file data/log/dreamview.ERROR?

CCodie commented 6 years ago

@DongAi Hi, thank you for the reply. Actually, there is no dreamview.ERROR log file. There is only an ERROR file for monitor.

monitor.ERROR

Log file created at: 2018/10/12 13:09:47
Running on machine: in_dev_docker
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E1012 13:09:47.067443 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:10:17.393507 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:10:47.630934 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:11:17.862692 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:11:48.021847 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO
E1012 13:12:18.266692 11173 can_checker_factory.cc:48] Failed to create CAN checker with parameter: brand: ESD_CAN
type: PCI_CARD
channel_id: CHANNEL_ID_ZERO

But this log seems to be about the CAN card and unrelated to my error. Can you suggest another way to solve the problem? Thanks!

CCodie commented 6 years ago

@DongAi, here is an update. I ran supervisorctl after running bash scripts/bootstrap.sh with the modified files /apollo/modules/tools/supervisord/dev.conf and /etc/supervisord.conf.

@in_dev_docker:/apollo$ supervisorctl
canbus                           STOPPED   Not started
conti_radar                      STOPPED   Not started
control                          STOPPED   Not started
dreamview                        FATAL     Exited too quickly (process log may have details)
gps                              STOPPED   Not started
localization                     STOPPED   Not started
mobileye                         STOPPED   Not started
monitor                          RUNNING   pid 9248, uptime 0:36:09
navigation_control               STOPPED   Not started
navigation_localization          STOPPED   Not started
navigation_perception            STOPPED   Not started
navigation_planning              STOPPED   Not started
navigation_prediction            STOPPED   Not started
navigation_routing               STOPPED   Not started
navigation_server                STOPPED   Not started
open_api                         STOPPED   Not started
perception                       STOPPED   Not started
planning                         STOPPED   Not started
prediction                       STOPPED   Not started
routing                          STOPPED   Not started
third_party_perception           STOPPED   Not started
supervisor> 

Does this help with debugging my error?
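
With this many entries, the failing programs are easy to miss. A small sketch that filters a status listing like the one above: list_failed_programs is a hypothetical helper, and it assumes the two-column "name STATE ..." layout shown in the output.

```shell
# list_failed_programs is a hypothetical helper. It reads a supervisorctl
# status listing on stdin and prints the names of programs whose state
# (second column) is FATAL or BACKOFF.
list_failed_programs() {
    awk '$2 == "FATAL" || $2 == "BACKOFF" { print $1 }'
}
```

Inside the container it could be used as supervisorctl status | list_failed_programs. For the listing above it would print only dreamview.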

DongAi commented 6 years ago

Hi CCodie, in this section,

[unix_http_server]
;file=/tmp/supervisor.sock   ; the path to the socket file

I think this line shouldn't be commented out. But since you were able to start supervisord, this may not be the cause.

If dreamview crashed, a core file should have been generated in the directory data/core/. We may need to debug it to find out what caused the crash.

Below are the commands for loading the core file in gdb:

$ gdb bazel-bin/modules/dreamview/dreamview data/core/#yourcorefilename#

After it loads successfully, use the command

(gdb) bt

to see the call stack. You can post the output here so that we can help analyze it.
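
The same backtrace can also be produced non-interactively, which avoids the pager prompt in the middle of long frames. A sketch, assuming core files are named core_<program>.<pid> under data/core (an assumption on my side; check what is actually in your directory). latest_core and print_backtrace are made-up helper names; -batch and -ex are standard gdb options.

```shell
# latest_core is a hypothetical helper. It prints the most recently modified
# core file for a program, assuming the naming core_<program>.<pid>.
latest_core() {
    core_dir="$1"
    program="$2"
    ls -t "$core_dir"/core_"$program".* 2>/dev/null | head -n 1
}

# print_backtrace loads the core in gdb batch mode, runs "bt", and exits.
print_backtrace() {
    binary="$1"
    core="$2"
    gdb -batch -ex bt "$binary" "$core"
}

# Usage inside the container, from /apollo:
#   core="$(latest_core data/core dreamview)"
#   [ -n "$core" ] && print_backtrace bazel-bin/modules/dreamview/dreamview "$core"
```

Redirecting the output of print_backtrace to a file makes it easier to paste the complete stack here.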

There is also another way to get around this supervisor error: we can modify the file scripts/bootstrap.sh to decide whether to use supervisor to manage our processes or not.

Here is part of my bootstrap.sh:

function start() {
    DEBUG_MODE="yes"
    if [ "$HOSTNAME" == "in_release_docker" ]; then
        DEBUG_MODE="no"
    fi

    # Start roscore.
    bash scripts/roscore.sh start

    if [ "$DEBUG_MODE" == "yes" ]; then
        ./scripts/monitor.sh start
        ./scripts/dreamview.sh start
        supervisord -c /apollo/modules/tools/supervisord/dev.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with dev conf"
    else

whuzxy commented 6 years ago

@CCodie I don't know why, but this really worked for me. You can give it a try.

/apollo/modules/tools/supervisord/dev.conf

[unix_http_server]
file=/tmp/supervisor.sock   ; the path to the socket file
chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file uid:gid owner
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

[inet_http_server]         ; inet (TCP) server disabled by default
port=127.0.0.1:9001        ; ip_address:port specifier, *:port for all iface
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

...

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as in [*_http_server] if set
;password=123                ; should be same as in [*_http_server] if set
;prompt=mysupervisor         ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history  ; use readline history if available

/etc/supervisord.conf

[unix_http_server]
file=/tmp/supervisor.sock   ; the path to the socket file
;chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file uid:gid owner
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

[inet_http_server]         ; inet (TCP) server disabled by default
port=127.0.0.1:9001        ; ip_address:port specifier, *:port for all iface
;username=user              ; default is no username (open server)
;password=123               ; default is no password (open server)

...

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as in [*_http_server] if set
;password=123                ; should be same as in [*_http_server] if set
;prompt=mysupervisor         ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history  ; use readline history if available

Compared with your configuration, I just added

chmod=0700

to /apollo/modules/tools/supervisord/dev.conf, and it worked for me. Good luck.
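
Since both files have to agree, it may help to print which serverurl is actually active in each of them. A rough sketch: supervisorctl_serverurl is a hypothetical helper, and it assumes the ini layout shown above, with ";" starting comments.

```shell
# supervisorctl_serverurl is a hypothetical helper. It prints the active
# (uncommented) serverurl from the [supervisorctl] section of a config file.
supervisorctl_serverurl() {
    awk '
        # Track whether the current section is [supervisorctl].
        /^\[/ { in_section = ($0 ~ /^\[supervisorctl\]/) }
        in_section && /^serverurl=/ {
            sub(/^serverurl=/, "")    # drop the key
            sub(/[ \t]*;.*$/, "")     # drop the trailing inline comment
            print
        }
    ' "$1"
}
```

Running it on /apollo/modules/tools/supervisord/dev.conf and on /etc/supervisord.conf should print the same URL if supervisorctl is meant to reach the same server with either file.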

CCodie commented 6 years ago

@DongAi Hi, I still get the error when running dreamview. Could you please help me?

$ bash scripts/bootstrap.sh 
Started supervisord with dev conf
Start roscore...
voice_detector: started
dreamview: ERROR (spawn error)

I hope the output below helps to debug this error:

$ gdb bazel-bin/modules/dreamview/dreamview data/core/core_dreamview.27435

GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.3) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bazel-bin/modules/dreamview/dreamview...done.
[New LWP 27435]
[New LWP 27436]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

warning: the debug information found in "/home/caros/secure_upgrade/depend_lib/libyaml-cpp.so.0.5.1" does not match "/home/caros/secure_upgrade/depend_lib/libyaml-cpp.so.0.5" (CRC mismatch).

Core was generated by `/apollo/bazel-bin/modules/dreamview/dreamview --flagfile=/apollo/modules/dreamv'.
Program terminated with signal SIGILL, Illegal instruction.
#0  0x00007fe3adf28bec in double boost::math::detail::erf_inv_imp<double, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> >(double const&, double const&, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> const&, mpl_::int_<64> const*) ()
   from /usr/local/lib/libpcl_sample_consensus.so.1.7
(gdb) bt
#0  0x00007fe3adf28bec in double boost::math::detail::erf_inv_imp<double, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> >(double const&, double const&, boost::math::policies::policy<boost::math::policies::promote_float<false>, boost::math::policies::promote_double<false>, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy, boost::math::policies::default_policy> const&, mpl_::int_<64> const*) ()
   from /usr/local/lib/libpcl_sample_consensus.so.1.7
#1  0x00007fe3adeedf1e in _GLOBAL__sub_I_sac.cpp ()
   from /usr/local/lib/libpcl_sample_consensus.so.1.7
#2  0x00007fe3b6afd2da in call_init (l=<optimized out>, argc=argc@entry=2, 
    argv=argv@entry=0x7fff847f2d68, env=env@entry=0x7fff847f2d80)
    at dl-init.c:78
#3  0x00007fe3b6afd3c3 in call_init (env=<optimized out>, 

I don't actually know why this error occurs on my laptop, because I successfully ran Apollo 3.0 on my desktop. Anyway, thanks in advance!

P.S. Could you please post the full contents of the start function in bootstrap.sh? I also want to try your second suggestion.

DongAi commented 6 years ago

Hi CCodie, judging from your gdb output, I think one of my colleagues ran into this issue before. It may be caused by an incompatibility between the PCL library and your CPU. You may need to recompile the PCL library.

You can find more information about similar issues in #3615 and #4135.
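
SIGILL means the CPU was asked to execute an instruction it does not support, so comparing the CPU feature flags of the working and the failing machine may be informative. A sketch only: cpu_has_flag is a hypothetical helper, and the flag names in the usage comment are example guesses, since this thread does not say which instruction set extension is involved.

```shell
# cpu_has_flag is a hypothetical helper. It succeeds if the named feature flag
# appears in the first "flags" line of a cpuinfo file (default: /proc/cpuinfo).
cpu_has_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # Split the flags line into one token per line and look for an exact match.
    grep '^flags' "$cpuinfo" | head -n 1 | tr ' \t' '\n\n' | grep -qx "$flag"
}

# Usage (flag names are example guesses, not taken from this thread):
#   for f in avx avx2 fma; do
#       if cpu_has_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
#   done
```

If the two machines differ on one of the flags, that would be consistent with a prebuilt library using instructions that only one of the CPUs supports, though it does not prove it.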

Please refer to the PCL documentation for how to build PCL. Note that it is better to build the Release version of PCL, since building the Debug version may take 3-4 hours.

After building, replace the PCL libraries in your docker container with the newly built ones. The directory that contains the PCL libraries is /usr/local/lib.
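
Before overwriting anything in /usr/local/lib, it may be worth keeping a copy of the existing libraries so they can be restored. A minimal sketch: backup_pcl_libs is a hypothetical helper, the default backup directory /tmp/pcl_backup is an arbitrary choice, and the libpcl_* pattern is inferred from the libpcl_sample_consensus.so.1.7 file name in the gdb output.

```shell
# backup_pcl_libs is a hypothetical helper. It copies every libpcl_* file from
# lib_dir into backup_dir so the originals can be restored later.
backup_pcl_libs() {
    lib_dir="${1:-/usr/local/lib}"
    backup_dir="${2:-/tmp/pcl_backup}"
    mkdir -p "$backup_dir" || return 1
    # -P copies symlinks as symlinks, -p preserves modes and timestamps.
    cp -Pp "$lib_dir"/libpcl_* "$backup_dir"/
}
```

Restoring is then a matter of copying the files back from the backup directory with the same cp options.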

DongAi commented 6 years ago

Hi CCodie, one more tip: the build must be run inside your docker container.

CCodie commented 6 years ago

@DongAi Hi, I really appreciate your reply. I'm going to try rebuilding the PCL library, and I'll leave a comment here afterwards. Thanks!

DongAi commented 6 years ago

function start() {
    DEBUG_MODE="yes"
    if [ "$HOSTNAME" == "in_release_docker" ]; then
        DEBUG_MODE="no"
    fi

    # Start roscore.
    bash scripts/roscore.sh start

    if [ "$DEBUG_MODE" == "yes" ]; then
        ./scripts/monitor.sh start
        ./scripts/dreamview.sh start
        supervisord -c /apollo/modules/tools/supervisord/dev.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with dev conf"
    else
        # Use supervisord.
        supervisord -c /apollo/modules/tools/supervisord/release.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with release conf"

        # Start monitor.
        supervisorctl start monitor > /dev/null
        # Start dreamview.
        supervisorctl start dreamview
        supervisorctl status dreamview | grep RUNNING > /dev/null
    fi

    if [ $? -eq 0 ]; then
        echo "Dreamview is running at http://localhost:8888"
    fi
}

CCodie commented 6 years ago

@DongAi I really appreciate your detailed instructions. I found out that the error was caused by an incompatibility between my CPU and the PCL library. My problem is now completely solved. Again, thank you :)

P.S. We can also close this issue.