Closed: dtitov closed this issue 6 years ago
A small comment that came to mind while I was checking the picture.
The definitions to connect CentralEGA and LocalEGA don't seem to be loaded.
Therefore, it of course fails to retrieve messages from CentralEGA for files to ingest.
That bit is obvious.
The comment is that I can see the user is guest, but if I recall correctly, we are past that: a user/password pair is generated, even for the local broker.
I don't believe it is the problem, because a fresh deployment does bootstrap properly, but maybe it's worth looking into that. That might give another idea to solve "Why are the definitions failing to load?".
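One way to confirm whether the CentralEGA connection definitions were actually loaded is to ask the broker's management API which shovel parameters it holds. The snippet below is only a sketch: the base URL, the credentials and the expected shovel names are all assumptions for illustration (the real names live in the definitions file, which is not shown in this thread). `GET /api/parameters/shovel` is the RabbitMQ management endpoint that lists shovel runtime parameters.

```python
import base64
import json
import urllib.request


def fetch_shovels(base_url, user, password):
    """Return the shovel runtime parameters reported by the management API."""
    request = urllib.request.Request(f"{base_url}/api/parameters/shovel")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def missing_shovels(parameters, expected_names):
    """Return the expected shovel names that are absent from the broker."""
    loaded = {p["name"] for p in parameters if p.get("component") == "shovel"}
    return sorted(set(expected_names) - loaded)


if __name__ == "__main__":
    # Hypothetical values: adjust the URL, credentials and names to the deployment.
    params = fetch_shovels("http://localhost:15672", "guest", "guest")
    print(missing_shovels(params, ["example-shovel"]))
```

An empty result means every expected shovel is present; any name printed was not loaded, which would match the symptom described above.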
I wasn't able to reproduce the problem. I created a CentOS VM in cPouta, ran git clone, make images and docker-compose up, and got:
[cloud-user@86 docker]$ cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
[cloud-user@86 docker]$ docker-compose ps
Name Command State Ports
--------------------------------------------------------------------------------------------------------------------------------------
cega-mq docker-entrypoint.sh rabbi ... Up 15671/tcp, 0.0.0.0:15670->15672/tcp, 25672/tcp, 4369/tcp, 5671/tcp,
cega-users /cega/server.py Up 0.0.0.0:9100->80/tcp
ega-db-fin1 docker-entrypoint.sh postgres Up 5432/tcp
ega-db-swe1 docker-entrypoint.sh postgres Up 5432/tcp
ega-elasticsearch-fin1 /usr/local/bin/docker-entr ... Up 9200/tcp, 9300/tcp
ega-elasticsearch-swe1 /usr/local/bin/docker-entr ... Up 9200/tcp, 9300/tcp
ega-frontend-fin1 ega-frontend Up 0.0.0.0:9001->80/tcp
ega-frontend-swe1 ega-frontend Up 0.0.0.0:9000->80/tcp
ega-inbox-fin1 entrypoint.sh Up 0.0.0.0:2223->22/tcp
ega-inbox-swe1 entrypoint.sh Up 0.0.0.0:2222->22/tcp
ega-keys-fin1 entrypoint.sh Up 9010/tcp, 9011/tcp
ega-keys-swe1 entrypoint.sh Up 9010/tcp, 9011/tcp
ega-kibana-fin1 /bin/bash /usr/local/bin/k ... Up 0.0.0.0:5602->5601/tcp
ega-kibana-swe1 /bin/bash /usr/local/bin/k ... Up 0.0.0.0:5601->5601/tcp
ega-logstash-fin1 /usr/local/bin/docker-entr ... Up 5044/tcp, 9600/tcp
ega-logstash-swe1 /usr/local/bin/docker-entr ... Up 5044/tcp, 9600/tcp
ega-mq-fin1 /usr/bin/ega-entrypoint.sh ... Up 15671/tcp, 0.0.0.0:15673->15672/tcp, 25672/tcp, 4369/tcp, 5671/tcp,
ega-mq-swe1 /usr/bin/ega-entrypoint.sh ... Up 15671/tcp, 0.0.0.0:15672->15672/tcp, 25672/tcp, 4369/tcp, 5671/tcp,
ega-vault-fin1 entrypoint.sh Up
ega-vault-swe1 entrypoint.sh Up
ega_ingest-fin1_1 entrypoint.sh Up
ega_ingest-swe1_1 entrypoint.sh Up
@dtitov, can you provide access to a VM where you have this problem?
We noticed with @silverdaz that the CEGA definitions are not loaded into the ega-mq-* containers even though these containers are up:
ega-mq-swe1 | 2018-01-05 10:39:29.919 [error] <0.710.0> CRASH REPORT Process <0.710.0> with 0 neighbours crashed with reason: bad arg
in call to lists:keyfind(<<"src-protocol">>, 1, #{<<"ack-mode">> => <<"on-confirm">>,<<"add-forward-headers">> => false,<<"delete-after">> =>
.">>,...}) in rabbit_shovel_parameters:protocols/1 line 439
ega-mq-swe1 | 2018-01-05 10:39:29.919 [error] <0.709.0> Ranch listener rabbit_web_dispatch_sup_15672, connection process <0.709.0>, s
1 had its request process <0.710.0> exit with reason badarg and stacktrace [{lists,keyfind,[<<"src-protocol">>,1,#{<<"ack-mode">> => <<"on-con
>>,<<"add-forward-headers">> => false,<<"delete-after">> => <<"never">>,<<"dest-exchange">> => <<"localega.v1">>,<<"dest-exchange-key">> => <<"
s">>,<<"dest-uri">> => <<"amqp://cega_swe1:LwAVtMZzWaXSRWdL@cega-mq:5672/swe1">>,<<"src-exchange">> => <<"lega">>,<<"src-exchange-key">> => <<"
error.user">>,<<"src-uri">> => <<"amqp://">>}],[]},{rabbit_shovel_parameters,protocols,1,[{file,"src/rabbit_shovel_parameters.erl"},{line,439}]
bbit_shovel_parameters,src_validation,2,[{file,"src/rabbit_shovel_parameters.erl"},{line,111}]},{rabbit_shovel_parameters,validate,5,[{file,"sr
bit_shovel_parameters.erl"},{line,44}]},{rabbit_runtime_parameters,set_any0,5,[{file,"src/rabbit_runtime_parameters.erl"},{line,152}]},{rabbit_
me_parameters,set_any,5,[{file,"src/rabbit_runtime_parameters.erl"},{line,143}]},{rabbit_mgmt_wm_definitions,add_parameter,2,[{file,"src/rabbit
_wm_definitions.erl"},{line,338}]},{rabbit_mgmt_wm_definitions,'-for_all/4-lc$^0/1-0-',3,[{file,"src/rabbit_mgmt_wm_definitions.erl"},{line,316
ega-mq-swe1 | Central EGA connections loaded
However, the same setup seems to work when we downgrade RabbitMQ to version 3.6. The current RabbitMQ version is 3.7, so perhaps there is a change in the configuration options (or format).
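The failure in the log above is easy to miss, because the container stays up and the entrypoint still prints "Central EGA connections loaded". A small check over the docker-compose output can flag it. This is a rough sketch: the function name is made up, and it assumes the usual `<container> | <message>` prefix that docker-compose adds to each log line.

```python
import re

# Matches RabbitMQ error-level crash reports such as the one in the log above.
CRASH = re.compile(r"\[error\].*CRASH REPORT")


def crashed_containers(log_text):
    """Return the sorted names of containers whose logs contain a crash report."""
    names = set()
    for line in log_text.splitlines():
        if "|" in line and CRASH.search(line):
            # docker-compose prefixes every line with "<container> | ".
            names.add(line.split("|", 1)[0].strip())
    return sorted(names)
```

Feeding it the output of `docker-compose logs` for the ega-mq-* services would list the brokers where loading the definitions crashed, regardless of the success message printed afterwards.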
@silverdaz hard-coded the RabbitMQ version to 3.6 and everything works like a charm. I propose that we stick with that and update when we have reason to.
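Since the fix depends on the broker staying on the 3.6 series, a deployment check could verify the running version. The helper below is a hypothetical sketch, not part of the project; it assumes the version string has been read from the `rabbitmq_version` field that the management API returns at `/api/overview`.

```python
def is_pinned_series(version, series="3.6"):
    """Return True if a version such as '3.6.14' belongs to the given major.minor series."""
    return version.split(".")[:2] == series.split(".")
```

Comparing the split components rather than using a plain string prefix avoids treating a version such as 3.60.1 as part of the 3.6 series.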
Great!
Resolved by PR #229
Floating bug: on some platforms (I got it on Ubuntu and CentOS) the Docker deployment is not stable, meaning the mq_swe1 container doesn't start all the internal queues. This, obviously, makes the ingestion fail.
Correct RabbitMQ:
Failing RabbitMQ:
Log with the error:
I can provide a CentOS instance in the cloud where this bug is consistently reproducible, in case someone can't reproduce it.