telefonicaid / fiware-cygnus

A connector in charge of persisting context data sources into other third-party databases and storage systems, creating a historical view of the context
https://fiware-cygnus.rtfd.io/
GNU Affero General Public License v3.0

"Couldn't connect to server" - Docker ports issue. #1761

Closed jason-fox closed 4 years ago

jason-fox commented 4 years ago

I have been updating the Cygnus tutorial here: https://github.com/FIWARE/tutorials.Historic-Context-Flume

There is a problem when upgrading from Cygnus 1.13 to 1.17 - it looks like some ports have been inadvertently removed from the Docker configuration.

Orion is able to create a subscription to Cygnus successfully, but when checking the subscription:

curl -X GET \
  http://localhost:1026/v2/subscriptions/ \
  -H 'fiware-service: openiot' \
  -H 'fiware-servicepath: /'

a "Couldn't connect to server" error is received:

[
    {
        "id": "5dd50b1647cc2d528171ff03",
        "description": "Notify Cygnus of all context changes",
        "status": "failed",
        "subject": {
            "entities": [
                {
                    "idPattern": ".*"
                }
            ],
            "condition": {
                "attrs": []
            }
        },
        "notification": {
            "timesSent": 2,
            "lastNotification": "2019-11-20T09:45:42.00Z",
            "attrs": [],
            "onlyChangedAttrs": false,
            "attrsFormat": "legacy",
            "http": {
                "url": "http://cygnus:5050/notify"
            },
            "lastFailure": "2019-11-20T09:45:42.00Z",
            "lastFailureReason": "Couldn't connect to server"
        },
        "throttling": 5
    }
]

Cygnus 1.17 does not listen on the correct port, so the data cannot be persisted. The correct behaviour can be seen using Cygnus 1.13.
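One quick way to confirm the symptom (a hedged sketch, assuming the tutorial's container names and that curl is available inside the Orion image) is to probe the notification endpoint from inside the Docker network, where `cygnus` resolves by hostname:

```shell
# Probe the Cygnus notification port from the Orion container.
# "fiware-orion" and "cygnus" are the names used in the tutorial's compose file.
docker exec fiware-orion curl -v http://cygnus:5050/notify
# "Connection refused" here would confirm nothing is listening on 5050.
```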

AlvaroVega commented 4 years ago

Could you provide the config env vars used with the Cygnus Docker image?

jason-fox commented 4 years ago

Sure: The following .env is working with Cygnus 1.13: https://github.com/FIWARE/tutorials.Historic-Context-Flume/blob/master/.env

# Cygnus Variables
CYGNUS_VERSION=1.13.0
CYGNUS_API_PORT=5080
CYGNUS_SERVICE_PORT=5050

The following does not (I've tried with 1.17.1 as well)

# Cygnus Variables
CYGNUS_VERSION=1.17.0
CYGNUS_API_PORT=5080
CYGNUS_SERVICE_PORT=5050

Just git clone the repo, alter the version number in the .env, and run ./services hdfs (or similar) - the issue can be seen in the context broker, since the subscription is failing. I also ran the check-subscription cURL request, which can be seen in the README and the Postman collection.

AlvaroVega commented 4 years ago

There was an issue with CYGNUS_SERVICE_PORT that was solved by this PR (merged after 1.17.0): https://github.com/telefonicaid/fiware-cygnus/pull/1755 Did you try with the latest version?

jason-fox commented 4 years ago

I was only upgrading the tutorials to align with the FIWARE_7.8 release - 1.17.0 was the candidate for the latest release - and I have had no time to check other new versions or patch builds. I've also had limited internet access for testing.

Can you also confirm that the docker-compose:

I'll retry with .env 1.17.1 and report back.

fgalan commented 4 years ago

I'll retry with .env 1.17.1 and report back.

I think the fix that @AlvaroVega mentions is not included in 1.17.1. I'd suggest testing with the latest from master (the :latest tag on Docker Hub).
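For reference, pulling the master build and then checking which build is actually running (via the /v1/version endpoint used elsewhere in this thread) would look something like:

```shell
# Fetch the current master build from Docker Hub...
docker pull fiware/cygnus-ngsi:latest
# ...and, once the stack is up, confirm the version Cygnus reports.
curl -s http://localhost:5080/v1/version
```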

jason-fox commented 4 years ago

1.17.1 not working. latest not working

{
    "success": "true",
    "version": "1.17.0_SNAPSHOT.f3f6602b9abadb5e54171471aba70d3caf4ca588"
}

Why 1.17.0 ????

Orion Log:

time=Monday 25 Nov 17:35:08 2019.855Z | lvl=INFO | corr=ecf7c858-0fa9-11ea-8da5-0242ac120105 | trans=1574703079-353-00000000081 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=connectionOperations.cpp[94]:collectionQuery | msg=Database Operation Successful (query: { $or: [ { entities.id: "Motion:003", $or: [ { entities.type: "Motion" }, { entities.type: { $exists: false } } ], entities.isPattern: "false", entities.isTypePattern: { $ne: true }, expiration: { $gt: 1574703308 }, status: { $ne: "inactive" } }, { entities.isPattern: "true", entities.isTypePattern: { $ne: true }, expiration: { $gt: 1574703308 }, status: { $ne: "inactive" }, $where: function(){for (var i=0; i < this.entities.length; i++) {if (this.enti... }, { entities.isPattern: "false", entities.isTypePattern: true, expiration: { $gt: 1574703308 }, status: { $ne: "inactive" }, $where: function(){for (var i=0; i < this.entities.length; i++) {if (this.enti... }, { entities.isPattern: "true", entities.isTypePattern: true, expiration: { $gt: 1574703308 }, status: { $ne: "inactive" }, $where: function(){for (var i=0; i < this.entities.length; i++) {if (this.enti... } ], servicePath: { $in: [ "/", "/#" ] } })
time=Monday 25 Nov 17:35:08 2019.858Z | lvl=INFO | corr=ecf7c858-0fa9-11ea-8da5-0242ac120105 | trans=1574703079-353-00000000081 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=connectionOperations.cpp[449]:collectionUpdate | msg=Database Operation Successful (update: <{ _id.id: "Motion:003", _id.type: "Motion", _id.servicePath: "/" }, { $set: { attrs.count: { value: "0", type: "Integer", md: { TimeInstant: { type: "DateTime", value: 1574703308.0 } }, mdNames: [ "TimeInstant" ], creDate: 1574703171, modDate: 1574703308 }, attrs.refStore: { value: "Store:003", type: "Relationship", md: { TimeInstant: { type: "DateTime", value: 1574703308.0 } }, mdNames: [ "TimeInstant" ], creDate: 1574703171, modDate: 1574703308 }, attrs.TimeInstant: { value: 1574703308.0, type: "DateTime", mdNames: [], creDate: 1574703171, modDate: 1574703308 }, modDate: 1574703308, lastCorrelator: "ecf7c858-0fa9-11ea-8da5-0242ac120105" }, $unset: { location: 1, expDate: 1 } }>)
time=Monday 25 Nov 17:35:08 2019.859Z | lvl=INFO | corr=ecf7c858-0fa9-11ea-8da5-0242ac120105 | trans=1574703079-353-00000000081 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=logMsg.h[1874]:lmTransactionEnd | msg=Transaction ended
time=Monday 25 Nov 17:35:10 2019.289Z | lvl=INFO | corr=edde5dea-0fa9-11ea-b275-0242ac120105 | trans=1574703079-353-00000000082 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=logMsg.h[1844]:lmTransactionStart | msg=Starting transaction from 172.18.1.6:60584/v2/entities/Motion:004/attrs
time=Monday 25 Nov 17:35:10 2019.289Z | lvl=INFO | corr=edde5dea-0fa9-11ea-b275-0242ac120105 | trans=1574703079-353-00000000082 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=rest.cpp[883]:servicePathSplit | msg=Service Path 0: '/'
time=Monday 25 Nov 17:35:10 2019.295Z | lvl=INFO | corr=edde5dea-0fa9-11ea-b275-0242ac120105 | trans=1574703079-353-00000000082 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=connectionOperations.cpp[239]:collectionCount | msg=Database Operation Successful (count: { _id.id: "Motion:004", _id.type: "Motion", _id.servicePath: "/" })
time=Monday 25 Nov 17:35:10 2019.297Z | lvl=INFO | corr=edde5dea-0fa9-11ea-b275-0242ac120105 | trans=1574703079-353-00000000082 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=connectionOperations.cpp[94]:collectionQuery | msg=Database Operation Successful (query: { _id.id: "Motion:004", _id.type: "Motion", _id.servicePath: "/" })
time=Monday 25 Nov 17:35:10 2019.314Z | lvl=INFO | corr=edde5dea-0fa9-11ea-b275-0242ac120105 | trans=1574703079-353-00000000082 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=connectionOperations.cpp[94]:collectionQuery | msg=Database Operation Successful (query: { $or: [ { entities.id: "Motion:004", $or: [ { entities.type: "Motion" }, { entities.type: { $exists: false } } ], entities.isPattern: "false", entities.isTypePattern: { $ne: true }, expiration: { $gt: 1574703310 }, status: { $ne: "inactive" } }, { entities.isPattern: "true", entities.isTypePattern: { $ne: true }, expiration: { $gt: 1574703310 }, status: { $ne: "inactive" }, $where: function(){for (var i=0; i < this.entities.length; i++) {if (this.enti... }, { entities.isPattern: "false", entities.isTypePattern: true, expiration: { $gt: 1574703310 }, status: { $ne: "inactive" }, $where: function(){for (var i=0; i < this.entities.length; i++) {if (this.enti... }, { entities.isPattern: "true", entities.isTypePattern: true, expiration: { $gt: 1574703310 }, status: { $ne: "inactive" }, $where: function(){for (var i=0; i < this.entities.length; i++) {if (this.enti... } ], servicePath: { $in: [ "/", "/#" ] } })
time=Monday 25 Nov 17:35:10 2019.318Z | lvl=INFO | corr=edde5dea-0fa9-11ea-b275-0242ac120105 | trans=1574703079-353-00000000082 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=connectionOperations.cpp[449]:collectionUpdate | msg=Database Operation Successful (update: <{ _id.id: "Motion:004", _id.type: "Motion", _id.servicePath: "/" }, { $set: { attrs.count: { value: "0", type: "Integer", md: { TimeInstant: { type: "DateTime", value: 1574703310.0 } }, mdNames: [ "TimeInstant" ], creDate: 1574703185, modDate: 1574703310 }, attrs.refStore: { value: "Store:004", type: "Relationship", md: { TimeInstant: { type: "DateTime", value: 1574703310.0 } }, mdNames: [ "TimeInstant" ], creDate: 1574703185, modDate: 1574703310 }, attrs.TimeInstant: { value: 1574703310.0, type: "DateTime", mdNames: [], creDate: 1574703185, modDate: 1574703310 }, modDate: 1574703310, lastCorrelator: "edde5dea-0fa9-11ea-b275-0242ac120105" }, $unset: { location: 1, expDate: 1 } }>)
time=Monday 25 Nov 17:35:10 2019.319Z | lvl=INFO | corr=edde5dea-0fa9-11ea-b275-0242ac120105 | trans=1574703079-353-00000000082 | from=172.18.1.6 | srv=openiot | subsrv=/ | comp=Orion | op=logMsg.h[1874]:lmTransactionEnd | msg=Transaction ended

Cygnus Log

Warning: JAVA_HOME is not set!
+ exec /usr/bin/java -Xms2048m -Xmx4096m -Dflume.root.logger=DEBUG,console -Duser.timezone=UTC -Dfile.encoding=UTF-8 -cp '/opt/apache-flume/conf:/opt/apache-flume/lib/*:/opt/apache-flume/plugins.d/cygnus/lib/*:/opt/apache-flume/plugins.d/cygnus/libext/*' -Djava.library.path= com.telefonica.iot.cygnus.nodes.CygnusApplication -f /opt/apache-flume/conf/agent.conf -n cygnus-ngsi -p 5080
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-flume/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/apache-flume/plugins.d/cygnus/lib/cygnus-ngsi-1.17.0_SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/apache-flume/plugins.d/cygnus/libext/cygnus-common-1.17.0_SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
time=2019-11-25T17:31:36.301Z | lvl=INFO | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=main | msg=com.telefonica.iot.cygnus.nodes.CygnusApplication[188] : Starting Cygnus, version 1.17.0_SNAPSHOT.f3f6602b9abadb5e54171471aba70d3caf4ca588
time=2019-11-25T17:31:36.530Z | lvl=DEBUG | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=printLoadedJars | msg=com.telefonica.iot.cygnus.utils.CommonUtils[395] : Loading httpclient from file:/opt/apache-flume/lib/httpclient-4.5.3.jar
time=2019-11-25T17:31:36.593Z | lvl=DEBUG | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=printLoadedJars | msg=com.telefonica.iot.cygnus.utils.CommonUtils[399] : Loading httpcore from file:/opt/apache-flume/lib/httpcore-4.4.6.jar
time=2019-11-25T17:31:36.616Z | lvl=DEBUG | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=printLoadedJars | msg=com.telefonica.iot.cygnus.utils.CommonUtils[403] : Loading junit from file:/opt/apache-flume/plugins.d/cygnus/lib/cygnus-ngsi-1.17.0_SNAPSHOT-jar-with-dependencies.jar
time=2019-11-25T17:31:36.674Z | lvl=DEBUG | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=printLoadedJars | msg=com.telefonica.iot.cygnus.utils.CommonUtils[408] : Loading flume-ng-node from file:/opt/apache-flume/lib/flume-ng-core-1.9.0.jar
time=2019-11-25T17:31:36.701Z | lvl=DEBUG | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=printLoadedJars | msg=com.telefonica.iot.cygnus.utils.CommonUtils[412] : Loading libthrift from file:/opt/apache-flume/lib/parquet-hive-bundle-1.4.1.jar
time=2019-11-25T17:31:36.771Z | lvl=DEBUG | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=printLoadedJars | msg=com.telefonica.iot.cygnus.utils.CommonUtils[416] : Loading gson from file:/opt/apache-flume/lib/gson-2.2.2.jar
time=2019-11-25T17:31:36.804Z | lvl=DEBUG | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=printLoadedJars | msg=com.telefonica.iot.cygnus.utils.CommonUtils[420] : Loading json-simple from file:/opt/apache-flume/plugins.d/cygnus/lib/cygnus-ngsi-1.17.0_SNAPSHOT-jar-with-dependencies.jar
curl -X GET \
  http://localhost:1026/v2/subscriptions/ \
  -H 'Accept: */*' \
  -H 'Accept-Encoding: gzip, deflate' \
  -H 'Connection: keep-alive' \
  -H 'Cookie: _csrf=MAPTGFPcoPnewsGCWklHi4Mq' \
  -H 'Host: localhost:1026' \
  -H 'User-Agent: PostmanRuntime/7.19.0' \
  -H 'fiware-service: openiot' \
  -H 'fiware-servicepath: /'
[
    {
        "id": "5ddc10558ca45e8a89647da6",
        "description": "Notify Cygnus of all context changes",
        "status": "failed",
        "subject": {
            "entities": [
                {
                    "idPattern": ".*"
                }
            ],
            "condition": {
                "attrs": []
            }
        },
        "notification": {
            "timesSent": 4,
            "lastNotification": "2019-11-25T17:34:57.00Z",
            "attrs": [],
            "onlyChangedAttrs": false,
            "attrsFormat": "legacy",
            "http": {
                "url": "http://cygnus:5050/notify"
            },
            "lastFailure": "2019-11-25T17:34:57.00Z",
            "lastFailureReason": "Couldn't connect to server"
        },
        "throttling": 5
    }
]
fgalan commented 4 years ago

(Semi-offtopic :)

latest not working

{
    "success": "true",
    "version": "1.17.0_SNAPSHOT.f3f6602b9abadb5e54171471aba70d3caf4ca588"
}

Why 1.17.0 ????

I guess you mean "why 1.17.0_SNAPSHOT in latest instead of 1.17.1_SNAPSHOT, as 1.17.1 is the last released version".

This is because the rule is "the last tag on the master branch + SNAPSHOT". Note that 1.17.1 is not on the master branch, but on the release/1.17.0 branch.

fgalan commented 4 years ago

Maybe it would be useful to check which ports are opened by the Cygnus process inside the Docker container (netstat -ntlp or something similar) and ensure the one you are trying to use in the subscription is open.

Which sinks are you using? Note that if Cygnus runs in multi-agent mode (not sure whether that applies in your case), there is a port associated with each sink, as detailed at https://github.com/telefonicaid/fiware-cygnus/blob/96025d3d235456ecf8bf86c6e909a23e4a8d438f/doc/cygnus-ngsi/installation_and_administration_guide/install_with_docker.md#using-a-specific-configuration

jason-fox commented 4 years ago

Using the HDFS sink.

netstat -ntlp is showing something very interesting:

1.13.0

[root@cygnus /]# netstat -ntpl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:5080            0.0.0.0:*               LISTEN      23/java             
tcp        0      0 0.0.0.0:5050            0.0.0.0:*               LISTEN      23/java             
tcp        0      0 127.0.0.11:46817        0.0.0.0:*               LISTEN      -   

1.17.0

[root@cygnus /]# netstat -ntpl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:5080            0.0.0.0:*               LISTEN      7/java              
tcp        0      0 127.0.0.11:38789        0.0.0.0:*               LISTEN      -     

latest

[root@cygnus /]# netstat -ntpl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:5080            0.0.0.0:*               LISTEN      7/java              
tcp        0      0 127.0.0.11:44169        0.0.0.0:*               LISTEN      -      

Cygnus is no longer listening on port 5050.

jason-fox commented 4 years ago
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" --filter name=fiware-*
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" --filter name=hdfs-*
NAMES               STATUS                   PORTS
fiware-cygnus       Up 2 minutes (healthy)   0.0.0.0:5050->5050/tcp, 0.0.0.0:5080->5080/tcp
fiware-iot-agent    Up 3 minutes (healthy)   0.0.0.0:4041->4041/tcp, 0.0.0.0:7896->7896/tcp, 4061/tcp
fiware-orion        Up 3 minutes (healthy)   0.0.0.0:1026->1026/tcp
fiware-tutorial     Up 3 minutes (healthy)   0.0.0.0:3000-3001->3000-3001/tcp

NAMES                    STATUS              PORTS
hdfs-nodemanager         Up 3 minutes        8030-8031/tcp, 8050/tcp, 9000/tcp, 10020/tcp, 19888/tcp, 50010/tcp, 50020/tcp, 50070/tcp, 50075/tcp, 50090/tcp, 0.0.0.0:32769->8042/tcp
hdfs-yarn                Up 3 minutes        0.0.0.0:8042->8042/tcp, 0.0.0.0:8050->8050/tcp, 8030-8031/tcp, 0.0.0.0:8088->8088/tcp, 0.0.0.0:10020->10020/tcp, 9000/tcp, 50010/tcp, 50020/tcp, 50070/tcp, 50075/tcp, 50090/tcp, 0.0.0.0:19888->19888/tcp
hdfs-secondarynamenode   Up 3 minutes        8030-8031/tcp, 8050/tcp, 9000/tcp, 10020/tcp, 19888/tcp, 50010/tcp, 50020/tcp, 50070/tcp, 50075/tcp, 0.0.0.0:50090->50090/tcp
hdfs-datanode            Up 3 minutes        8030-8031/tcp, 8050/tcp, 9000/tcp, 10020/tcp, 19888/tcp, 50020/tcp, 50070/tcp, 50090/tcp, 0.0.0.0:50010->50010/tcp, 0.0.0.0:32768->50075/tcp
hdfs-namenode            Up 3 minutes        8030-8031/tcp, 8050/tcp, 10020/tcp, 19888/tcp, 50010/tcp, 0.0.0.0:9000->9000/tcp, 50020/tcp, 50075/tcp, 50090/tcp, 0.0.0.0:50070->50070/tcp

The output is the same in all three cases - Docker is exposing the correct port mappings.

AlvaroVega commented 4 years ago

@jason-fox if you only provide:

- CYGNUS_VERSION=1.17.0
- CYGNUS_API_PORT=5080
- CYGNUS_SERVICE_PORT=5050

and no other CYGNUS env vars related to sinks, such as CYGNUS_MYSQL_HOST, CYGNUS_MONGO_HOSTS, and so on, then Cygnus does not enable those sinks and therefore does not listen on the related sink ports (5050, 5051, and so on).
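In other words, under this model a sink is switched on by the mere presence of its env vars. A minimal illustrative fragment (hypothetical hostnames; variable names per the install-with-docker doc linked above):

```shell
# Hypothetical .env fragment: providing sink-specific vars enables that sink,
# which then listens on its own sink port.
CYGNUS_MONGO_HOSTS=mongo-db:27017   # enables the Mongo sink
CYGNUS_MYSQL_HOST=mysql-db          # enables the MySQL sink
```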

jason-fox commented 4 years ago

@AlvaroVega - I'm using the following docker-compose.yml, which works with 1.13.0 but fails with 1.17.0:

.env

# Cygnus Variables
CYGNUS_VERSION=latest | 1.17.0 | 1.13.0
CYGNUS_API_PORT=5080
CYGNUS_SERVICE_PORT=5050

docker-compose.yml

  # Cygnus is configured to write context data to HDFS
  cygnus:
    image: fiware/cygnus-ngsi:${CYGNUS_VERSION}
    hostname: cygnus
    container_name: fiware-cygnus
    depends_on:
      - namenode
      - datanode
      - nodemanager
      - yarn
    networks:
      - default
    expose:
      - "${CYGNUS_SERVICE_PORT}"
      - "${CYGNUS_API_PORT}"
    ports:
      - "${CYGNUS_SERVICE_PORT}:${CYGNUS_SERVICE_PORT}" # localhost:5050
      - "${CYGNUS_API_PORT}:${CYGNUS_API_PORT}" # localhost:5080
    environment:
      - "CYGNUS_HDFS_HOST=namenode"
      - "CYGNUS_HDFS_PORT=50070"
      - "CYGNUS_HDFS_USER=hdfs" # Warning: For development use only - Storing sensitive information in clear text is insecure
#     - "CYGNUS_HDFS_TOKEN="    # Warning: For development use only - Storing sensitive information in clear text is insecure
      - "CYGNUS_HDFS_SKIP_CONF_GENERATION=true"
      - "CYGNUS_HDFS_ENABLE_ENCODING=false"
      - "CYGNUS_HDFS_ENABLE_GROUPING=false"
      - "CYGNUS_HDFS_ENABLE_NAME_MAPPINGS=false"
      - "CYGNUS_HDFS_SKIP_NAME_MAPPINGS_GENERATION=false"
      - "CYGNUS_HDFS_ENABLE_LOWERCASE=false"
      - "CYGNUS_HDFS_DATA_MODEL=dm-by-entity"
      - "CYGNUS_HDFS_FILE_FORMAT=json-column"
      - "CYGNUS_HDFS_BACKEND_IMPL=rest"
      - "CYGNUS_HDFS_HIVE=false"
      - "CYGNUS_HDFS_KRB5_AUTH=false"

      - "CYGNUS_LOG_LEVEL=DEBUG" # The logging level for Cygnus
      - "CYGNUS_SERVICE_PORT=${CYGNUS_SERVICE_PORT}" # Notification port that Cygnus listens on when subscribing to context data changes
      - "CYGNUS_API_PORT=${CYGNUS_API_PORT}" # Port that Cygnus listens on for operational reasons
    healthcheck:
      test: curl --fail -s http://localhost:${CYGNUS_API_PORT}/v1/version || exit 1

If there are new ENVs which now need to be supplied, then this is a breaking change between 1.13 and 1.17

fgalan commented 4 years ago

From your configuration I understand you are using only one sink, and that sink is HDFS. Is that correct?

Looking at the Dockerfile at https://github.com/telefonicaid/fiware-cygnus/blob/master/docker/cygnus-ngsi/Dockerfile#L116 I see the following env vars related to that sink:

# HDFS options
ENV CYGNUS_HDFS_HOST ""
ENV CYGNUS_HDFS_PORT ""
# ENV CYGNUS_HDFS_USER "" # Warning: For development use only - Storing sensitive information in clear text is insecure
# ENV CYGNUS_HDFS_TOKEN "" # Warning: For development use only - Storing sensitive information in clear text is insecure
ENV CYGNUS_HDFS_SKIP_CONF_GENERATION ""
ENV CYGNUS_HDFS_SERVICE_PORT "5053"
ENV CYGNUS_HDFS_ENABLE_ENCODING ""
ENV CYGNUS_HDFS_ENABLE_GROUPING ""
ENV CYGNUS_HDFS_ENABLE_NAME_MAPPINGS ""
ENV CYGNUS_HDFS_SKIP_NAME_MAPPINGS_GENERATION ""
ENV CYGNUS_HDFS_ENABLE_LOWERCASE ""
ENV CYGNUS_HDFS_DATA_MODEL ""
ENV CYGNUS_HDFS_FILE_FORMAT ""
ENV CYGNUS_HDFS_BACKEND_IMPL ""
ENV CYGNUS_HDFS_BACKEND_MAX_CONNS ""
ENV CYGNUS_HDFS_BACKEND_MAX_CONNS_PER_ROUTE ""
ENV CYGNUS_HDFS_PASSWORD ""
ENV CYGNUS_HDFS_SERVICE_AS_NAMESPACE ""
ENV CYGNUS_HDFS_BATCH_SIZE ""
ENV CYGNUS_HDFS_BATCH_TIMEOUT ""
ENV CYGNUS_HDFS_BATCH_TTL ""
ENV CYGNUS_HDFS_BATCH_RETRY_INTERVALS ""
ENV CYGNUS_HDFS_HIVE ""
ENV CYGNUS_HDFS_KRB5_AUTH ""

Maybe it's a matter of using CYGNUS_HDFS_SERVICE_PORT=5053 instead of CYGNUS_SERVICE_PORT=5050 in the .env file.
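If that's the case, the Orion side would also need its subscription pointed at the HDFS sink's own port. A sketch (payload mirrors the subscription shown earlier in this thread; 5053 taken from the Dockerfile default above):

```shell
curl -X POST \
  http://localhost:1026/v2/subscriptions \
  -H 'Content-Type: application/json' \
  -H 'fiware-service: openiot' \
  -H 'fiware-servicepath: /' \
  -d '{
  "description": "Notify Cygnus of all context changes",
  "subject": { "entities": [{ "idPattern": ".*" }] },
  "notification": {
    "http": { "url": "http://cygnus:5053/notify" },
    "attrsFormat": "legacy"
  },
  "throttling": 5
}'
```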

AlvaroVega commented 4 years ago

According to the doc https://github.com/telefonicaid/fiware-cygnus/blob/master/doc/cygnus-ngsi/installation_and_administration_guide/install_with_docker.md#using-a-specific-configuration

CYGNUS_HDFS_SKIP_CONF_GENERATION: true skips the generation of the conf files (typically these files will be obtained from a volume); false autogenerates the conf files from the rest of the environment variables.

In your case, if you use

- "CYGNUS_HDFS_SKIP_CONF_GENERATION=true"

then agent.conf is generated without the HDFS section.

AlvaroVega commented 4 years ago

On the other hand, CYGNUS_SERVICE_PORT makes no sense, since the current Cygnus docker always runs in multisink mode (multiagent or not); that is, each sink has a specific port. If you need to run Cygnus in non-multisink mode, then you should provide your full Cygnus agent.conf and use CYGNUS_SKIP_CONF_GENERATION=true

jason-fox commented 4 years ago

On the other hand, CYGNUS_SERVICE_PORT makes no sense, since the current Cygnus docker always runs in multisink mode (multiagent or not); that is, each sink has a specific port. If you need to run Cygnus in non-multisink mode, then you should provide your full Cygnus agent.conf and use CYGNUS_SKIP_CONF_GENERATION=true

Bingo. I think you have discovered the reason behind the change of behaviour.

My guess is that the default Flume configuration within Docker used to be a multiagent *.conf, so it was possible to use CYGNUS_SERVICE_PORT=5050 in all cases and then configure the second step in the pipeline, since 5050 was passed to every (configured) sink in turn.

At some point between 1.13 and 1.17 this was changed to a single agent per sink, so that it is now essential that CYGNUS_SERVICE_PORT=CONFIGURED_SINK_SERVICE_PORT
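Under that reading, the tutorial's .env would need to align the notification port with the sink's port. A sketch (hypothetical values, assuming the HDFS sink keeps its Dockerfile default of 5053):

```shell
# Cygnus Variables (illustrative corrected values)
CYGNUS_VERSION=1.17.0
CYGNUS_API_PORT=5080
CYGNUS_SERVICE_PORT=5053   # must equal the configured sink's service port
```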

This means that all of the Cygnus configurations need to change, such as: