ManageIQ / manageiq

ManageIQ Open-Source Management Platform
https://manageiq.org
Apache License 2.0

docker manageiq do not start after adding/removing network interface - memcached connectivity #17274

Closed gaetanquentin closed 5 years ago

gaetanquentin commented 6 years ago

I launched the latest manageiq docker container (20180404). It was working fine.

I had the "strange" idea to change the network configuration: I added a second network interface and then removed it (docker network connect / docker network disconnect).

Now the container does not start, and I don't know why, since the network configuration looks the same as before.

The log error is:

Error something seems wrong, we need at least two parameters to check service status
== Checking MIQ database status ==
** DB already initialized

{"@timestamp":"2018-04-05T15:47:32.690234 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for evm.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.690894 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for vim.log has been changed to [WARN]"}
{"@timestamp":"2018-04-05T15:47:32.691549 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for rhevm.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.692014 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for aws.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.692380 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for kubernetes.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.692814 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for datawarehouse.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.693234 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for container_monitoring.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.693607 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for scvmm.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.693993 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for api.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.694350 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for fog.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.694738 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for azure.log has been changed to [WARN]"}
{"@timestamp":"2018-04-05T15:47:32.695077 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for lenovo.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.695500 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for websocket.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.695935 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for vcloud.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:32.696293 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(Vmdb::Loggers.apply_config) Log level for nuage.log has been changed to [INFO]"}
{"@timestamp":"2018-04-05T15:47:33.068484 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"info","message":"MIQ(SessionStore) Using session_store: ActionDispatch::Session::MemCacheStore"}
{"@timestamp":"2018-04-05T15:47:33.451305 ","hostname":"manageiq","pid":7,"tid":"3a7140","level":"warning","message":"127.0.0.1:11211 failed (count: 0) Errno::ECONNREFUSED: Connection refused - connect(2) for \"127.0.0.1\" port 11211"}

/usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/dalli/ring.rb:45:in `server_for_key': No server available (Dalli::RingError)
	from /usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/dalli/client.rb:236:in `alive!'
	from /usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/rack/session/dalli.rb:19:in `initialize'
	from /usr/local/lib/ruby/gems/2.3.0/gems/actionpack-5.0.6/lib/action_dispatch/middleware/session/abstract_store.rb:32:in `initialize'

It looks like a memcached connectivity problem.

What should I modify to make it start again?

Regards.

dalareo commented 6 years ago

I have a similar issue: I cannot even restart the container after stopping it:


/usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/dalli/ring.rb:45:in `server_for_key': No server available (Dalli::RingError)

"message":"127.0.0.1:11211 failed (count: 0) Errno::ECONNREFUSED: Connection refused - connect(2) for \"127.0.0.1\" port 11211"}

ngoduykhanh commented 6 years ago

I faced the same issue with the manageiq docker image.

{"@timestamp":"2018-08-16T05:06:01.028087 ","hostname":"fae6893c4776","pid":5,"tid":"c0d104","level":"warning","message":"127.0.0.1:11211 failed (count: 0) Errno::ECONNREFUSED: Connection refused - connect(2) for \"127.0.0.1\" port 11211"}

adamjpavlik commented 6 years ago

The default /manageiq/docker-assets/appliance-initialize.sh only starts memcached and postgresql if the DB has not been initialized. See line 18.

I created a local appliance-initialize.sh that includes start commands for the aforementioned services when the DB is already initialized. My local Dockerfile includes a COPY of this new file that overwrites the base image default.
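
A minimal sketch of that control flow, for illustration only. The real appliance-initialize.sh is not quoted in this thread, so the function names and the echo placeholders below are hypothetical stand-ins for the actual start and setup commands:

```shell
#!/bin/sh
# Hypothetical sketch: start the backing services on every boot and
# run the one-time DB setup only when the DB is not yet initialized.

# Placeholder: the real start commands are not shown in this thread.
start_services() {
  echo "starting memcached"
  echo "starting postgresql"
}

# Placeholder for the one-time database setup.
initialize_db() {
  echo "initializing DB"
}

# $1 is "yes" when the DB is already initialized, "no" otherwise.
boot() {
  start_services   # always runs, unlike the default behaviour described above
  if [ "$1" = "no" ]; then
    initialize_db
  else
    echo "** DB already initialized"
  fi
}
```

Calling `boot yes` prints the two service lines followed by the "already initialized" message, which is the restart case that fails with the default script.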

isaaccarrington commented 6 years ago

I just pulled today and got the same issue on restart


{"@timestamp":"2018-10-17T06:24:27.703264 ","hostname":"0fa2b870b67d","pid":9,"tid":"1005108","level":"warning","message":"127.0.0.1:11211 failed (count: 0) Errno::ECONNREFUSED: Connection refused - connect(2) for \"127.0.0.1\" port 11211"}
Error something seems wrong, we need at least two parameters to check service status
== Checking MIQ database status ==
** DB already initialized
/usr/local/lib/ruby/gems/2.3.0/gems/dalli-2.7.6/lib/dalli/ring.rb:45:in `server_for_key': No server available (Dalli::RingError)

eselvam commented 6 years ago

You need to change the file in two locations in the docker container: /usr/bin and /var/www/miq/vmdb/docker-assets. First spin up manageiq with docker, then use docker cp to copy the original file to the host, modify it as per Guilrom, and copy it back to the docker instance with the same docker cp. Once it is copied back, restart the container. It should then work. I tested this approach and it works fine.
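
The copy-out and copy-back steps could be sketched roughly as follows. This is an assumption-laden sketch: the container name `manageiq` and the file name `appliance-initialize.sh` are guesses (the file name is borrowed from the earlier comment about docker-assets), and the functions only print the docker commands unless `RUN=1` is set:

```shell
#!/bin/sh
# Assumed names, not confirmed in this thread.
CONTAINER="${CONTAINER:-manageiq}"
FILE="appliance-initialize.sh"

# Print each command; execute it too only when RUN=1.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Step 1: copy the original file from the container to the host.
copy_out() {
  run docker cp "$CONTAINER:/usr/bin/$FILE" "./$FILE"
}

# Step 2 (after editing ./$FILE by hand): copy it back to both
# locations named above, then restart the container.
copy_back_and_restart() {
  run docker cp "./$FILE" "$CONTAINER:/usr/bin/$FILE"
  run docker cp "./$FILE" "$CONTAINER:/var/www/miq/vmdb/docker-assets/$FILE"
  run docker restart "$CONTAINER"
}
```

Printing first makes it easy to review the paths before anything touches the container.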

eselvam commented 6 years ago

After making the above changes, httpd does not start. We have to start it manually to access the manageiq web page.

/usr/sbin/httpd -DFOREGROUND &

I think we are missing some config here, but I am not sure how to find it; still checking. It happens only with the docker images. The VMware image (i.e. OVA or OVF) works fine.

miq-bot commented 5 years ago

This issue has been automatically marked as stale because it has not been updated for at least 6 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions!

keuko commented 5 years ago

Hello,

I have the same issue... is there any progress?

Thanks, Michal

eselvam commented 5 years ago

I could not fix it. I tried a lot of options but had no luck.

gaetanquentin commented 5 years ago

@eselvam None. This docker image is a one-shot launch... ;-) After you stop it, you are unable to start it again. And they don't care.

Tested with hammer-6 and hammer-10.

JPrause commented 5 years ago

@miq-bot remove_label stale
@miq-bot remove_label pinned

chessbyte commented 5 years ago

@carbonin can you take a look at this?

carbonin commented 5 years ago

After #19463 is merged this should work for the most part. The only issue I was still having was that httpd might not start sometimes, but I was able to exec into the container and start it to get the UI accessible.

I think the issue is that even if you wait for the server to stop cleanly, the other processes in the container are killed uncleanly, which I suspect is what's causing the issue I'm seeing with httpd. But I think the main problem here should be fixed by my PR.
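
For reference, the manual httpd start mentioned above could be done from the host roughly like this. It is only a sketch: the container name `manageiq` is an assumption, and the httpd command line is the one quoted earlier in this thread. The function prints the command and runs it only when `RUN=1` is set:

```shell
#!/bin/sh
# Assumed container name, not confirmed in this thread.
CONTAINER="${CONTAINER:-manageiq}"

# Print the command; execute it too only when RUN=1.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Start httpd inside the running container, detached (-d) so the
# foreground httpd process does not block the calling shell.
start_httpd() {
  run docker exec -d "$CONTAINER" /usr/sbin/httpd -DFOREGROUND
}
```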