SignalK / signalk-java

Installer and web-app for the artemis-server (a java based Signal K master server)
Apache License 2.0
6 stars 6 forks source link

Docker build shows influxDB configuration error. #18

Closed yyvus closed 5 years ago

yyvus commented 5 years ago

Describe the issue I have put in place a basic docker image for the signalk-java server. This one seems to work although I don't had yet the opportunity to test with a connect NMEA network (that will soon be done). I could only connect and simulate the good behivour with nc localhost 55555 questionning the tcp server.

The Dockerfile looks like so:

FROM ubuntu:18.04

RUN apt-get update && apt-get upgrade --yes && apt-get --yes install \
    apt-utils \
    less \
    net-tools \
    emacs \
    openjdk-11-jdk \
    maven \
    git \
    curl \
    influxdb \
    gnupg

RUN mkdir /etc/signalkJavaServer
COPY docker/signalK/signalk-java /etc/signalkJavaServer
COPY docker/signalK/signalk_entrypoint.sh /signalk_entrypoint.sh

ENTRYPOINT ["/signalk_entrypoint.sh"]
CMD ["/bin/bash"]

First COPY command does a copy of a local clone of the jdk11 branch stored in the folder docker/signalK/signalk-java. The second COPY command copies the signalk_entrypoint.sh, which is executed each time the contiainer runs. Below a reproduction of the entrypoint:

#!/bin/bash
set -e

export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which javac))))

# install influxdb
curl -sL https://repos.influxdata.com/influxdb.key | apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | tee /etc/apt/sources.list.d/influxdb.list

# start influxdb server
service influxdb start

# start signalk
cd /etc/signalkJavaServer
mvn exec:exec

Building and running to build the container run: sudo docker build --tag DIRNAME:signalk -f docker/signalK/Dockerfile . to run the container run: sudo docker run -it --rm --network host --name NAME_signalk DIRNAME:signalk

Change the capital words with what matches best for you. Beware on the location where you run the command.

Error It seems that each time that influxdb is called we have this error showing up:

...
08:12:47.817 WARN  o.a.a.a.c.server - AMQ222166: No Expiry Address configured for queue internal.kv.2d5830b4-4d89-4b8f-afd2-819009e4db88 in AddressSettings
08:12:47.900 WARN  o.a.a.a.c.server - AMQ222165: No Dead Letter Address configured for queue internal.kv.6bd82bf0-f6d6-455e-baf2-b85c2f115229 in AddressSettings
08:12:47.901 WARN  o.a.a.a.c.server - AMQ222166: No Expiry Address configured for queue internal.kv.6bd82bf0-f6d6-455e-baf2-b85c2f115229 in AddressSettings
08:12:47.922 ERROR n.c.f.s.a.h.AnchorWatchHandler - org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 35
org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 35
    at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:161) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:808) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.executeQuery(InfluxDBImpl.java:795) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.query(InfluxDBImpl.java:552) ~[influxdb-java-2.15.jar:?]
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadData(InfluxDbService.java:306) ~[artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadData(InfluxDbService.java:179) ~[artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.handler.AnchorWatchHandler.initFromDb(AnchorWatchHandler.java:97) ~[artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.handler.AnchorWatchHandler.<init>(AnchorWatchHandler.java:79) [artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.startKvHandlers(ArtemisServer.java:376) [artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.init(ArtemisServer.java:155) [artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.<init>(ArtemisServer.java:131) [artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.main(ArtemisServer.java:546) [artemis-server-jdk11-7cdd97687d-1.jar:?]
08:12:47.944 WARN  o.a.a.a.c.server - AMQ222165: No Dead Letter Address configured for queue internal.kv.b24b75d6-689f-4b84-845d-4a786f9c764e in AddressSettings
08:12:47.944 WARN  o.a.a.a.c.server - AMQ222166: No Expiry Address configured for queue internal.kv.b24b75d6-689f-4b84-845d-4a786f9c764e in AddressSettings
08:12:47.986 ERROR n.c.f.s.a.h.AlarmHandler - org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 35
org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 35
    at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:161) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:808) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.executeQuery(InfluxDBImpl.java:795) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.query(InfluxDBImpl.java:552) ~[influxdb-java-2.15.jar:?]
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadData(InfluxDbService.java:306) ~[artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadData(InfluxDbService.java:179) ~[artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.handler.AlarmHandler.loadAlarms(AlarmHandler.java:98) ~[artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.handler.AlarmHandler.<init>(AlarmHandler.java:74) [artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.startKvHandlers(ArtemisServer.java:378) [artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.init(ArtemisServer.java:155) [artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.<init>(ArtemisServer.java:131) [artemis-server-jdk11-7cdd97687d-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.main(ArtemisServer.java:546) [artemis-server-jdk11-7cdd97687d-1.jar:?]
08:12:48.009 WARN  o.a.a.a.c.server - AMQ222165: No Dead Letter Address configured for queue internal.kv.d5003fff-7cfe-41e3-9ca8-5f24a43768a0 in AddressSettings
08:12:48.010 WARN  o.a.a.a.c.server - AMQ222166: No Expiry Address configured for queue internal.kv.d5003fff-7cfe-41e3-9ca8-5f24a43768a0 in AddressSettings
...

The console log finishes with:

08:12:54.218 INFO  o.a.c.AtmosphereFramework - Atmosphere is using org.atmosphere.inject.InjectableObjectFactory for dependency injection and object creation
08:12:54.218 INFO  o.a.c.AtmosphereFramework - Atmosphere is using async support: org.atmosphere.container.Servlet30CometSupport running under container: Nettosphere/3.0 using javax.servlet/3.0
08:12:54.219 INFO  o.a.c.AtmosphereFramework - Atmosphere Framework 2.5.2 started.
08:12:54.219 INFO  o.a.c.AtmosphereFramework - 

    For Atmosphere Framework Commercial Support, visit 
    http://www.async-io.org/ or send an email to support@async-io.org

08:12:54.220 INFO  o.a.c.AtmosphereFramework - Installed AtmosphereInterceptor nz.co.fortytwo.signalk.artemis.atmosphere.intercept.AuthenticationInterceptor with priority AFTER_DEFAULT 
08:12:54.244 INFO  n.c.f.s.a.s.ArtemisServer - Starting tcp/udp server
08:12:54.417 INFO  n.c.f.s.a.s.ArtemisServer - Starting NMEA output
08:12:54.435 INFO  n.c.f.s.a.s.ArtemisServer - Starting scheduled services
08:12:54.436 INFO  n.c.f.s.a.s.ArtemisServer - Starting Zeroconf discovery agent
08:12:54.439 INFO  n.c.f.s.a.s.ArtemisServer - Starting serial port manager
08:12:54.528 INFO  n.c.f.s.a.s.ChartService - Reload charts at startup
Exception in thread "main" org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 38
    at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:161)
    at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173)
    at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:808)
    at org.influxdb.impl.InfluxDBImpl.executeQuery(InfluxDBImpl.java:795)
    at org.influxdb.impl.InfluxDBImpl.query(InfluxDBImpl.java:552)
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadResources(InfluxDbService.java:202)
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadResources(InfluxDbService.java:132)
    at nz.co.fortytwo.signalk.artemis.service.ChartService.getCharts(ChartService.java:182)
    at nz.co.fortytwo.signalk.artemis.service.ChartService.reloadCharts(ChartService.java:131)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.init(ArtemisServer.java:226)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.<init>(ArtemisServer.java:131)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.main(ArtemisServer.java:546)
yyvus commented 5 years ago

As a side comment, this way to execute the server mvn exec:exec is not optimal because each time we run the container maven downloads all the dependencies. This takes too long...

rob42 commented 5 years ago

The normal reason for using mvn exec:exec is to install or upgrade, where this behaviour is ideal. In use on Rpi it uses extra params to only use the local repo cache in ~/.m2. I think its -o, will check tomorrow

rob42 commented 5 years ago

Ive not used docker, how does it handle local caches and app config params across reboots?

krital commented 5 years ago

Typically you would expose paths you need to survive reboots in some type of volume mount...

yyvus commented 5 years ago

You are making a point: I could copy the dependencies from the ~/.m2 folder from local installation to the container installation. Maven will then be able to find the dependencies in the container... But since you have put in place a maven plugin to handle the execution, we could also run the jar created by an other plugin. This jar should include all dependencies and would then be easy to launch with java -jar jarfile.jar. I have done that in the past with

            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>package....MainService</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <finalName>server</finalName>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id> <!-- this is used for inheritance merges -->
                        <phase>package</phase> <!-- bind to the packaging phase -->
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
yyvus commented 5 years ago

Either you mount a local folder onto the docker or you do copies with the COPY command. In a deployment context you would wish to copy, since the local mounted folder is not accessible on the deploy machine.

rob42 commented 5 years ago

One of the requirements of this project is to make installs and updates easy for non-techs. Thats why the java code is in artemis-server, and ui etc in here. The artemis jar is built on the fly by jitpack.io and retrieved by maven.

Jitpack.io could build an all in one jar, but docker would still need to pull it down from jitpack unless its cached locally. Only way around that is to commit it in git, which blows the repo size out.

You can specify a cache location in maven, and I think you will need local storage for configs and the mapcache anyway, or they will dissappear between reboots. I think the best solution is to create a local mount, and store maven and other data in there.

Influxdb will need local storage too, or it will loose all history on reboot, so maybe a common root ./signalk/ with ./signalk/signalk-java and ./signalk/influxdb ?

rob42 commented 5 years ago

BTW the artemis-server does a checkInstall(), to make sure all required dirs and files exist at boot. No problem adding to that.

rob42 commented 5 years ago

The influx error is strange, seems to be a bad query string, but I dont see how. Can you turn nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService to DEBUG pls and rerun? That will put out the offending query string, lets see what it looks like.

yyvus commented 5 years ago

Yes a common root would be good. That would allow to mount the common root only, but clarity should be the first concern since mounting a volume is very simple.

yyvus commented 5 years ago

I added the DEBUG level for the InfluxDbService as suggested and that is what I receive:

11:23:08.195 DEBUG n.c.f.s.a.t.InfluxDbService - queryStr: select * from vessels where skey=~/[n][a][v][i][g][a][t][i][o][n][.][p][o][s][i][t][i][o][n]/ and uuid=~/[u][r][n][:][m][r][n][:][s][i][g][n][a][l][k][:][u][u][i][d][:][d][d][7][b][6][9][8][1][-][b][1][e][b][-][4][a][0][9][-][a][4][f][a][-][c][9][3][1][9][8][c][4][1][5][f][5]/ group by skey,primary, uuid,sourceRef order by time desc limit 1
11:23:08.247 INFO  n.c.f.s.a.s.ChartService - Reload charts at startup
11:23:08.247 DEBUG n.c.f.s.a.t.InfluxDbService - Query: select * from resources  where skey=~/charts/ group by skey, primary, uuid, sourceRef order by time desc limit 1
Exception in thread "main" org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 38
    at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:161)
    at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173)
    at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:808)
    at org.influxdb.impl.InfluxDBImpl.executeQuery(InfluxDBImpl.java:795)
    at org.influxdb.impl.InfluxDBImpl.query(InfluxDBImpl.java:552)
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadResources(InfluxDbService.java:202)
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadResources(InfluxDbService.java:132)
    at nz.co.fortytwo.signalk.artemis.service.ChartService.getCharts(ChartService.java:182)
    at nz.co.fortytwo.signalk.artemis.service.ChartService.reloadCharts(ChartService.java:131)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.init(ArtemisServer.java:226)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.<init>(ArtemisServer.java:131)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.main(ArtemisServer.java:546)
rob42 commented 5 years ago

Ive tried the query select * from resources where skey=~/charts/ group by skey, primary, uuid, sourceRef order by time desc limit 1 on my influxdb instance and it works as expected. Char 38 is /, which is as it should be. Pls try against your influxdb. log in to the instance using the client influx, then

use signalk<CR>
select * from resources  where skey=~/charts/ group by skey, primary, uuid, sourceRef order by time desc limit 1<CR>

Output should be

 > use signalk
 Using database signalk
 > select * from resources  where skey=~/charts/ group by skey, primary, uuid, sourceRef order by time    desc limit 1
 > 

If that works on your docker instance then it must be in the java db driver. It seems that the query isnt regexed like the vessel query above, I'll fix that now

rob42 commented 5 years ago

There was an error in the resources.charts query. It didnt show up as I had old entries in my db. Its fixed now. @RBerliner this wll fix your charts issue. It may help with the db error, but I doubt it as thats more about query parsing, not the query itself.

FYI the correct query is: select * from resources where uuid=~/charts/ group by skey, primary, uuid, sourceRef order by time desc limit 1

yyvus commented 5 years ago

Hi,

I just cloned the latest master branch and rerun the server. We can see below that the query has changed, but the error remains:

07:43:40.867 INFO  o.a.c.AtmosphereFramework - Installed AtmosphereInterceptor nz.co.fortytwo.signalk.artemis.atmosphere.intercept.AuthenticationInterceptor with priority AFTER_DEFAULT 
07:43:40.907 INFO  n.c.f.s.a.s.ArtemisServer - Starting tcp/udp server
07:43:41.005 INFO  n.c.f.s.a.s.ArtemisServer - Starting NMEA output
07:43:41.020 INFO  n.c.f.s.a.s.ArtemisServer - Starting scheduled services
07:43:41.020 INFO  n.c.f.s.a.s.ArtemisServer - Starting Zeroconf discovery agent
07:43:41.023 INFO  n.c.f.s.a.s.ArtemisServer - Starting serial port manager
07:43:41.056 DEBUG n.c.f.s.a.t.InfluxDbService - queryStr: select * from vessels where skey=~/[n][a][v][i][g][a][t][i][o][n][.][p][o][s][i][t][i][o][n]/ and uuid=~/[u][r][n][:][m][r][n][:][s][i][g][n][a][l][k][:][u][u][i][d][:][7][8][5][e][7][e][e][a][-][c][a][e][7][-][4][f][a][0][-][a][e][d][e][-][5][5][9][7][6][7][6][9][b][b][0][c]/ group by skey,primary, uuid,sourceRef order by time desc limit 1
07:43:41.142 INFO  n.c.f.s.a.s.ChartService - Reload charts at startup
07:43:41.142 DEBUG n.c.f.s.a.t.InfluxDbService - Query: select * from resources  where uuid=~/charts/ group by skey, primary, uuid, sourceRef order by time desc limit 1
Exception in thread "main" org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 38
    at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:161)
    at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173)
    at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:808)
    at org.influxdb.impl.InfluxDBImpl.executeQuery(InfluxDBImpl.java:795)
    at org.influxdb.impl.InfluxDBImpl.query(InfluxDBImpl.java:552)
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadResources(InfluxDbService.java:202)
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadResources(InfluxDbService.java:132)
    at nz.co.fortytwo.signalk.artemis.service.ChartService.getCharts(ChartService.java:183)
    at nz.co.fortytwo.signalk.artemis.service.ChartService.reloadCharts(ChartService.java:132)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.init(ArtemisServer.java:232)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.<init>(ArtemisServer.java:136)
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.main(ArtemisServer.java:576)

And by scrolling up I saw others that I list below:

07:43:36.077 DEBUG n.c.f.s.a.t.InfluxDbService - queryStr: select * from vessels where skey=~/meta/ group by skey,primary, uuid,sourceRef order by time desc limit 1
07:43:36.092 ERROR n.c.f.s.a.h.AlarmHandler - org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 35
org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 35
    at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:161) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:808) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.executeQuery(InfluxDBImpl.java:795) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.query(InfluxDBImpl.java:552) ~[influxdb-java-2.15.jar:?]
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadData(InfluxDbService.java:307) ~[artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadData(InfluxDbService.java:179) ~[artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.handler.AlarmHandler.loadAlarms(AlarmHandler.java:98) ~[artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.handler.AlarmHandler.<init>(AlarmHandler.java:74) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.startKvHandlers(ArtemisServer.java:408) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.init(ArtemisServer.java:160) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.<init>(ArtemisServer.java:136) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.main(ArtemisServer.java:576) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
07:43:36.029 DEBUG n.c.f.s.a.t.InfluxDbService - queryStr: select * from vessels where skey=~/navigation.anchor/ group by skey,primary, uuid,sourceRef order by time desc limit 1
07:43:36.039 ERROR n.c.f.s.a.h.AnchorWatchHandler - org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 35
org.influxdb.InfluxDBException: error parsing query: found /, expected regex at line 1, char 35
    at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:161) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:808) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.executeQuery(InfluxDBImpl.java:795) ~[influxdb-java-2.15.jar:?]
    at org.influxdb.impl.InfluxDBImpl.query(InfluxDBImpl.java:552) ~[influxdb-java-2.15.jar:?]
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadData(InfluxDbService.java:307) ~[artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.tdb.InfluxDbService.loadData(InfluxDbService.java:179) ~[artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.handler.AnchorWatchHandler.initFromDb(AnchorWatchHandler.java:97) ~[artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.handler.AnchorWatchHandler.<init>(AnchorWatchHandler.java:79) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.startKvHandlers(ArtemisServer.java:406) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.init(ArtemisServer.java:160) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.<init>(ArtemisServer.java:136) [artemis-server-jdk11-4e4d3969c9-1.jar:?]
    at nz.co.fortytwo.signalk.artemis.server.ArtemisServer.main(ArtemisServer.java:576) [artemis-server-jdk11-4e4d3969c9-1.jar:?]

Let me know if I can help.

rob42 commented 5 years ago

That error is really weird. If I copy the query and execute in the influxdb client, It seems to work fine, I get no error here. Even if I edit for a non-existing table it doesnt error. Plus the same code runs ok here.

Can you pls try logging into the docker container and running the queries from the command-line (in influx client). You will need to run the following:

$ influx
> use signalk
>select * from vessels where skey=~/[n][a][v][i][g][a][t][i][o][n][.][p][o][s][i][t][i][o][n]/ and uuid=~/[u][r][n][:][m][r][n][:][s][i][g][n][a][l][k][:][u][u][i][d][:][7][8][5][e][7][e][e][a][-][c][a][e][7][-][4][f][a][0][-][a][e][d][e][-][5][5][9][7][6][7][6][9][b][b][0][c]/ group by skey,primary, uuid,sourceRef order by time desc limit 1
>

Also may be a char set problem, check everything uses utf-8

rob42 commented 5 years ago

See https://www.i18nqa.com/debug/utf8-debug.html. If you send (or receive ) in the wrong charset the the ~ is translated badly. That would explain the error message error parsing query: found /, expected regex at line 1, char 35, which is a ~.

yyvus commented 5 years ago

The error reproduces inside the influx terminal:

influx
Connected to http://localhost:8086 version 1.1.1
InfluxDB shell version: 1.7.5
Enter an InfluxQL query
> show databases
name: databases
name
----
_internal
signalk
> use signalk
Using database signalk
> select * from vessels where skey=~/[n][a][v][i][g][a][t][i][o][n][.][p][o][s][i][t][i][o][n]/ and uuid=~/[u][r][n][:][m][r][n][:][s][i][g][n][a][l][k][:][u][u][i][d][:][7][8][5][e][7][e][e][a][-][c][a][e][7][-][4][f][a][0][-][a][e][d][e][-][5][5][9][7][6][7][6][9][b][b][0][c]/ group by skey,primary, uuid,sourceRef order by time desc limit 1
ERR: error parsing query: found /, expected regex at line 1, char 35
yyvus commented 5 years ago

Adding the "" to the query made it work:

select * from vessels where skey="~/[n][a][v][i][g][a][t][i][o][n][.][p][o][s][i][t][i][o][n]/" and uuid="~/[u][r][n][:][m][r][n][:][s][i][g][n][a][l][k][:][u][u][i][d][:][7][8][5][e][7][e][e][a][-][c][a][e][7][-][4][f][a][0][-][a][e][d][e][-][5][5][9][7][6][7][6][9][b][b][0][c]/" group by skey,primary, uuid,sourceRef order by time desc limit 1
yyvus commented 5 years ago

I further identified the source of the issue. I am installing influxdb using the commands:

curl -sL https://repos.influxdata.com/influxdb.key | apt-key add -

If I install influxdb using the following commands:

wget https://dl.influxdata.com/influxdb/releases/influxdb_1.7.5_amd64.deb
dpkg -i influxdb_1.7.5_amd64.deb

Then the error does not show up. It might simply be a version issue? But with the latter installation I have an other error but that might certainly be a configuration error...

signalk       | 10:26:18.440 ERROR o.i.i.BatchProcessor - Batch could not be sent. Data will be lost
signalk       | org.influxdb.InfluxDBIOException: java.net.SocketTimeoutException: timeout
signalk       |     at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:812) ~[influxdb-java-2.15.jar:?]
signalk       |     at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:463) ~[influxdb-java-2.15.jar:?]
signalk       |     at org.influxdb.impl.OneShotBatchWriter.write(OneShotBatchWriter.java:22) ~[influxdb-java-2.15.jar:?]
signalk       |     at org.influxdb.impl.BatchProcessor.write(BatchProcessor.java:340) [influxdb-java-2.15.jar:?]
signalk       |     at org.influxdb.impl.BatchProcessor$1.run(BatchProcessor.java:287) [influxdb-java-2.15.jar:?]
signalk       |     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514) [?:?]
signalk       |     at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
signalk       |     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
signalk       |     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) [?:?]
signalk       |     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
signalk       |     at java.lang.Thread.run(Thread.java:844) [?:?]
signalk       | Caused by: java.net.SocketTimeoutException: timeout
signalk       |     at okio.Okio$4.newTimeoutException(Okio.java:232) ~[okio-1.17.2.jar:?]
signalk       |     at okio.AsyncTimeout.exit(AsyncTimeout.java:286) ~[okio-1.17.2.jar:?]
signalk       |     at okio.AsyncTimeout$2.read(AsyncTimeout.java:241) ~[okio-1.17.2.jar:?]
signalk       |     at okio.RealBufferedSource.indexOf(RealBufferedSource.java:358) ~[okio-1.17.2.jar:?]
signalk       |     at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:230) ~[okio-1.17.2.jar:?]
signalk       |     at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:226) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:202) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:101) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at org.influxdb.impl.BasicAuthInterceptor.intercept(BasicAuthInterceptor.java:22) ~[influxdb-java-2.15.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at org.influxdb.impl.GzipRequestInterceptor.intercept(GzipRequestInterceptor.java:42) ~[influxdb-java-2.15.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:152) ~[logging-interceptor-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:264) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.RealCall.execute(RealCall.java:93) ~[okhttp-3.13.1.jar:?]
signalk       |     at retrofit2.OkHttpCall.execute(OkHttpCall.java:186) ~[retrofit-2.5.0.jar:?]
signalk       |     at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:800) ~[influxdb-java-2.15.jar:?]
signalk       |     ... 10 more
signalk       | Caused by: java.net.SocketException: Socket closed
signalk       |     at java.net.SocketInputStream.read(SocketInputStream.java:204) ~[?:?]
signalk       |     at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:?]
signalk       |     at okio.Okio$2.read(Okio.java:140) ~[okio-1.17.2.jar:?]
signalk       |     at okio.AsyncTimeout$2.read(AsyncTimeout.java:237) ~[okio-1.17.2.jar:?]
signalk       |     at okio.RealBufferedSource.indexOf(RealBufferedSource.java:358) ~[okio-1.17.2.jar:?]
signalk       |     at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:230) ~[okio-1.17.2.jar:?]
signalk       |     at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:226) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:202) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:101) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at org.influxdb.impl.BasicAuthInterceptor.intercept(BasicAuthInterceptor.java:22) ~[influxdb-java-2.15.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at org.influxdb.impl.GzipRequestInterceptor.intercept(GzipRequestInterceptor.java:42) ~[influxdb-java-2.15.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:152) ~[logging-interceptor-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:264) ~[okhttp-3.13.1.jar:?]
signalk       |     at okhttp3.RealCall.execute(RealCall.java:93) ~[okhttp-3.13.1.jar:?]
signalk       |     at retrofit2.OkHttpCall.execute(OkHttpCall.java:186) ~[retrofit-2.5.0.jar:?]
signalk       |     at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:800) ~[influxdb-java-2.15.jar:?]
signalk       |     ... 10 more
rob42 commented 5 years ago

Huh! Thats weird. My laptop is kubuntu 18.04, (x86), inflx v 1.7.2. I dont need the statement wrapped in "" to work normally. My locale is NZ, so basically US with (proper spelling, date formats, and metric system!). Maybe worth changing locales and seeing what effect it has. Or moving back to v1.7.2?

The current error looks like a connection issue, is signalk in the same docker container as influxdb? If so can you confirm (use curl) that the localhost:8086 port is accessible and working? If signalk is in a different container its probably trying localhost:8086

yyvus commented 5 years ago

So I can't fingure out what the issue is, because I tried to set the locale to en_US.utf8, which, despite the metric system :-), should be equivalent to yours. The ubuntu docker image has no system locale pre-configured. It had not influence on the weird need of quotes for the query.

yyvus commented 5 years ago

I tried then to make it work with the other influxdb installation. The one found here: https://portal.influxdata.com/downloads/. In that case I have the socket timeout issue mentioned above. This timeout has nothing to do with network (localhost:8086) access because the signalk database is present in influx and I can access it from with in the docker. I also tried to have influx running out of the container, which failed too. All the different options I took lead to the same conclusion: either the quotes are missing with the curl-like installation, either there is a socket timeout that should not be active for the dpkg-like install. And I wouldn't be surprised that if the quotes are put in the code, then the timeout issue would still show up. So I believe these are two different problems. Below are the dockerfile and the entrypoint I am using:

FROM ubuntu:18.04

RUN apt-get update && apt-get upgrade --yes && apt-get --yes install \
    apt-utils \
    less \
    net-tools \
    emacs \
    openjdk-11-jdk \
    maven \
    git \
    curl \
    influxdb \
    gnupg \
    wget

RUN mkdir /etc/signalkJavaServer
COPY docker/signalK/signalk-java /etc/signalkJavaServer
COPY docker/signalK/signalk_entrypoint.sh /signalk_entrypoint.sh

ENTRYPOINT ["/signalk_entrypoint.sh"]
CMD ["/bin/bash"]

signalk_entrypoint.sh:

#!/bin/bash
set -e

export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which javac))))

# install influxdb
cd tmp
wget https://dl.influxdata.com/influxdb/releases/influxdb_1.7.5_amd64.deb
dpkg -i influxdb_1.7.5_amd64.deb

# start influxdb server
service influxdb start

# start signalk
cd /etc/signalkJavaServer
mvn exec:exec
yyvus commented 5 years ago

The only way I had it working properly is the following.

1) run the docker image 2) In two different terminals logged in the docker:

terminal1:

cd tmp
wget https://dl.influxdata.com/influxdb/releases/influxdb_1.7.5_amd64.deb
dpkg -i influxdb_1.7.5_amd64.deb

influxd

terminal 2:

cd /etc/signalkJavaServer
mvn exec:exec
yyvus commented 5 years ago

Is there a port that should be opened that I don't know off?

rob42 commented 5 years ago

My laptop shows the following in netstat:

tcp        0      0 127.0.0.1:8088          0.0.0.0:*               LISTEN      1719/influxd
tcp6       0      0 :::8086                 :::*                    LISTEN      1719/influxd

That agrees with https://docs.influxdata.com/influxdb/v1.7/administration/ports/

Since it runs when influxdb and java are started in the foreground that suggests something wrong with running as a service? Are you logged in as root when you start them?

(Assuming you have a cached disk) Has signalk-java been started as root in the past? That may have locked some files? I run influxdb as a service, and on the laptop I run signalk-java as my local user. On the pi they are both run as a service. The signalk-java service starts as user 'pi'

Another thought is some sort of SELinux or apparmour issue? Im not up on docker so I dont know what its behaviour is like for security.

There is a file ./start_pi.sh which actually does the startup. May be worth a look.

BTW I would change this RUN mkdir /etc/signalkJavaServer to RUN mkdir /var/lib/signalkJavaServer or somewhere else outside /etc. By the time you load charts etc you will have Gb of data and code in signalkJavaServer, which is not really what /etc is for.

I will try v1.75 today, and maybe try your docker script too.

rob42 commented 5 years ago

Check the docker container has the correct time too

rob42 commented 5 years ago

Ive tried adding " into the query in the java code and it completely breaks.

select * from vessels where skey="~/[n][a][v][i][g][a][t][i][o][n][.][p][o][s][i][t][i][o][n]/" and uuid="~/[u][r][n][:][m][r][n][:][s][i][g][n][a][l][k][:][u][u][i][d][:][7][8][5][e][7][e][e][a][-][c][a][e][7][-][4][f][a][0][-][a][e][d][e][-][5][5][9][7][6][7][6][9][b][b][0][c]/" group by skey,primary, uuid,sourceRef order by time desc limit 1

But if I add a space its fine.

select * from vessels where skey=~ /[n][a][v][i][g][a][t][i][o][n][.][p][o][s][i][t][i][o][n]/ and uuid=~ /[u][r][n][:][m][r][n][:][s][i][g][n][a][l][k][:][u][u][i][d][:][7][8][5][e][7][e][e][a][-][c][a][e][7][-][4][f][a][0][-][a][e][d][e][-][5][5][9][7][6][7][6][9][b][b][0][c]/ group by skey,primary, uuid,sourceRef order by time desc limit 1

Ive committed that change as it seems more in line with the docs. Can you try it in the influx client pls?

rob42 commented 5 years ago

I'm now running influxdb 1.7.5, installed using wget as you did above. signalk-java works normally. Still works normally for me in influx client:

select * from vessels where skey=~/[n][a][v][i].....  # works
select * from vessels where skey=~ /[n][a][v][i].....  # works
select * from vessels where skey="~/[n][a][v][i]..... # works, no records - its trying to do equals "~/[n][a][v][i]....."
select * from vessels where skey=~"/[n][a][v][i]..... # fails - 'expecting regex..'
select * from vessels where skey=~ "/[n][a][v][i]..... # fails - 'expecting regex..'
yyvus commented 5 years ago

Hi, I finally found what the issue was in my docker. I have a line in my Dockerfile which installs influxdb from the ubuntu packages repository: apt-get install influxdb, line 12 of the Dockerfile. If I remove this line everything works perfectly.

I nevertheless don't explain myself why it was failing...

I can install influxdb following the two procedure, but only work: 1) WORKING INSTALLATION: curl -sL https://repos.influxdata.com/influxdb.key | apt-key add - 2) NOT WORKING INSTALLATION:

wget https://dl.influxdata.com/influxdb/releases/influxdb_1.7.5_amd64.deb
dpkg -i influxdb_1.7.5_amd64.deb
rob42 commented 5 years ago

Good news :-) I hate it when its something simple in the end, and you just cant see it for looking too hard!

See https://github.com/SignalK/signalk-java/blob/master/setup_raspbian.sh line 213, The script adds the influxdb repo then does apt-get install. So same effect as yours. If I look at my laptop apt-get sources Ive had it added in the past (it was disabled going to 18.04) so my install probably came the same way.

BTW if you intend to add an ARM docker you will need the bellsoft jdk11, see the setup_rasbian file. Also if the rasbian install was a 64bit OS the JDK11 would use the new graal compiler, much faster and better, plus it natively runs python, js, ruby, etc.

Pls feel free to make a pull request for the docker file and maybe a Docker.md, and I will add it. Its been requested several times.

tkurki commented 5 years ago
$ docker run -it ubuntu:18.04
root@5e573771c293:/# apt-get update && apt-get upgrade --yes && apt-get --yes install influxdb

will give you a fairly ancient influxdb 1.1.1.

rob42 commented 5 years ago

Re slack question: signalk-java writes to the following dirs: ./conf ./signalk-static/ - installs apps here ./signalk-static/logs ./signalk-static/mapcache - charts here ./storage/ holds large blobs etc to save ram ./data queue overflows

rob42 commented 5 years ago

Ive committed a fix to master and jdk11 to resolve the hostname better. This should fix some of the invalid cert issues as the ws and api hostname will now always equal the browser url. So the 'accept self-signed cert' thing happens on first opening index.html

rob42 commented 5 years ago

I had one case this morning where the server would not re-connect to influxdb, the first Ive seen. No clues what caused it, but restarting influx cleared it. It doesnt happen normally, and I cannot reproduce it. Your issue with this problem sounds the same.

I wonder if the influxdb port (8086) is left hung somehow or maybe blocked by some linux or docker security function?

rob42 commented 5 years ago

Managed to reproduce your db error (socket timeout). My db has 5Mil rows as Ive been running demo files at a high rate lately. I configured influxdb.conf:

max-connection-limit = 20
max-concurrent-write-limit = 20
max-enqueued-write-limit = 50

If I open freeboard-sk the vessel zooms around...but eventually if I try to do much I get a similar stacktrace to you:

org.influxdb.InfluxDBIOException: java.net.SocketTimeoutException: timeout
    at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:812)
    at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:463)
    at org.influxdb.impl.OneShotBatchWriter.write(OneShotBatchWriter.java:22)

Sometimes this is caused by the webapp subscription query, which times out. Once this has happened restarting signalk-java doesnt work, same error immediately.

If I connect to influxdb with the influx cli client, queries are very sluggish, often taking seconds or more. If I restart infludb the same queries return almost instantly.

Somethiing is causing influxdb to choke and become very slow.

rob42 commented 5 years ago

The above was on my i7 laptop, linux, 16Gb RAM, so not a cpu/RAM/io constraint.

yyvus commented 5 years ago

Thanks for your replies. I am running everything in local now (on fedora 29) in order to avoid the container polluting the debugging. I paste below a full console output in oreder that you clearly understand my issues. But first this is my installation:

08:47:17.841 INFO o.a.c.AtmosphereFramework - Installed AtmosphereInterceptor nz.co.fortytwo.signalk.artemis.atmosphere.intercept.AuthenticationInterceptor with priority AFTER_DEFAULT 08:47:17.861 INFO n.c.f.s.a.s.ArtemisServer - Starting tcp/udp server 08:47:17.935 INFO n.c.f.s.a.s.ArtemisServer - Starting NMEA output 08:47:17.947 INFO n.c.f.s.a.s.ArtemisServer - Starting scheduled services 08:47:17.948 INFO n.c.f.s.a.s.ArtemisServer - Starting Zeroconf discovery agent 08:47:17.949 INFO n.c.f.s.a.s.ArtemisServer - Starting serial port manager 08:47:18.028 INFO n.c.f.s.a.s.ChartService - Reload charts at startup 08:47:18.031 INFO n.c.f.s.a.s.ChartService - Existing charts: Quan:0 08:47:18.031 INFO n.c.f.s.a.s.ChartService - Reloading charts from: /home/ycourvoisier/Documents/Development/GitLocalRepo/com.enata.wingman.server/docker/signalK/signalk-java/./signalk-static/./mapcache WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by org.dom4j.io.SAXContentHandler (file:/home/ycourvoisier/.m2/repository/dom4j/dom4j/1.6.1/dom4j-1.6.1.jar) to method com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$LocatorProxy.getEncoding() WARNING: Please consider reporting this to the maintainers of org.dom4j.io.SAXContentHandler WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release 08:47:18.132 INFO n.c.f.s.a.s.ChartService - Loading new chart: WORLD= {"context":"resources","put":[{"path":"charts.urn:mrn:signalk:uuid:5f82ef17-ba38-43db-ae1f-68005b4eeba9","value":{"identifier":"WORLD","maxzoom":4,"name":"World Outline","bounds":[-85.05112878,-180.0,83.64310000000003,179.95295325327942],"format":"png","description":"World Outline","tilemapUrl":"/./mapcache/WORLD/{z}/{x}/{-y}.png","scale":250000,"type":"tilelayer","minzoom":0}}]} 08:47:18.592 INFO n.c.f.s.a.s.ChartService - Loading new chart: WORLD1= {"context":"resources","put":[{"path":"charts.urn:mrn:signalk:uuid:34193406-6e1b-49a2-904a-ea6a061bb703","value":{"identifier":"WORLD1","maxzoom":6,"name":"World Outline 1","bounds":[-85.05112878,-180.0,83.64310000000003,179.97217572748764],"format":"png","description":"World Outline 1","tilemapUrl":"/./mapcache/WORLD1/{z}/{x}/{-y}.png","scale":250000,"type":"tilelayer","minzoom":3}}]} 08:47:18.736 INFO n.c.f.s.a.s.ChartService - Chart resources updated 08:47:18.737 INFO n.c.f.s.a.s.ArtemisServer -

URLS: https://192.168.1.58:8443 https://127.0.0.1:8443

The artemis server is ready!

08:47:28.626 ERROR o.i.i.BatchProcessor - Batch could not be sent. Data will be lost org.influxdb.InfluxDBIOException: java.net.SocketTimeoutException: timeout at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:812) ~[influxdb-java-2.15.jar:?] at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:463) ~[influxdb-java-2.15.jar:?] at org.influxdb.impl.OneShotBatchWriter.write(OneShotBatchWriter.java:22) ~[influxdb-java-2.15.jar:?] at org.influxdb.impl.BatchProcessor.write(BatchProcessor.java:340) [influxdb-java-2.15.jar:?] at org.influxdb.impl.BatchProcessor$1.run(BatchProcessor.java:287) [influxdb-java-2.15.jar:?] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:834) [?:?] Caused by: java.net.SocketTimeoutException: timeout at okio.Okio$4.newTimeoutException(Okio.java:232) ~[okio-1.17.2.jar:?] at okio.AsyncTimeout.exit(AsyncTimeout.java:286) ~[okio-1.17.2.jar:?] at okio.AsyncTimeout$2.read(AsyncTimeout.java:241) ~[okio-1.17.2.jar:?] at okio.RealBufferedSource.indexOf(RealBufferedSource.java:358) ~[okio-1.17.2.jar:?] at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:230) ~[okio-1.17.2.jar:?] at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:226) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:202) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:101) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at org.influxdb.impl.BasicAuthInterceptor.intercept(BasicAuthInterceptor.java:22) ~[influxdb-java-2.15.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at org.influxdb.impl.GzipRequestInterceptor.intercept(GzipRequestInterceptor.java:42) ~[influxdb-java-2.15.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:152) ~[logging-interceptor-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:264) ~[okhttp-3.13.1.jar:?] at okhttp3.RealCall.execute(RealCall.java:93) ~[okhttp-3.13.1.jar:?] at retrofit2.OkHttpCall.execute(OkHttpCall.java:186) ~[retrofit-2.5.0.jar:?] at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:800) ~[influxdb-java-2.15.jar:?] ... 10 more Caused by: java.net.SocketException: Socket closed at java.net.SocketInputStream.read(SocketInputStream.java:183) ~[?:?] at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?] at okio.Okio$2.read(Okio.java:140) ~[okio-1.17.2.jar:?] at okio.AsyncTimeout$2.read(AsyncTimeout.java:237) ~[okio-1.17.2.jar:?] at okio.RealBufferedSource.indexOf(RealBufferedSource.java:358) ~[okio-1.17.2.jar:?] at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:230) ~[okio-1.17.2.jar:?] at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:226) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:202) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:101) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at org.influxdb.impl.BasicAuthInterceptor.intercept(BasicAuthInterceptor.java:22) ~[influxdb-java-2.15.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at org.influxdb.impl.GzipRequestInterceptor.intercept(GzipRequestInterceptor.java:42) ~[influxdb-java-2.15.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:152) ~[logging-interceptor-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.13.1.jar:?] at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.13.1.jar:?] at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:264) ~[okhttp-3.13.1.jar:?] at okhttp3.RealCall.execute(RealCall.java:93) ~[okhttp-3.13.1.jar:?] at retrofit2.OkHttpCall.execute(OkHttpCall.java:186) ~[retrofit-2.5.0.jar:?] at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:800) ~[influxdb-java-2.15.jar:?] ... 10 more

In an other terminal, in which `influxd` is running, I can read the following:

influxd

8888888 .d888 888 8888888b. 888888b. 888 d88P" 888 888 "Y88b 888 "88b 888 888 888 888 888 888 .88P 888 88888b. 888888 888 888 888 888 888 888 888 8888888K. 888 888 "88b 888 888 888 888 Y8bd8P' 888 888 888 "Y88b 888 888 888 888 888 888 888 X88K 888 888 888 888 888 888 888 888 888 Y88b 888 .d8""8b. 888 .d88P 888 d88P 8888888 888 888 888 888 "Y88888 888 888 8888888P" 8888888P"

2019-04-10T07:00:28.474026Z info InfluxDB starting {"log_id": "0EiWbdjW000", "version": "1.7.5", "branch": "1.7", "commit": "2f49e00f7f3801a506304c3b6d8165b2f4039f3d"} 2019-04-10T07:00:28.474046Z info Go runtime {"log_id": "0EiWbdjW000", "version": "go1.11", "maxprocs": 4} 2019-04-10T07:00:28.574899Z info Using data dir {"log_id": "0EiWbdjW000", "service": "store", "path": "/var/lib/influxdb/data"} 2019-04-10T07:00:28.574934Z info Compaction settings {"log_id": "0EiWbdjW000", "service": "store", "max_concurrent_compactions": 2, "throughput_bytes_per_second": 50331648, "throughput_bytes_per_second_burst": 50331648} 2019-04-10T07:00:28.574957Z info Open store (start) {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "op_event": "start"} 2019-04-10T07:00:28.590156Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/13/000000005-000000002.tsm", "id": 0, "duration": "0.849ms"} 2019-04-10T07:00:28.590168Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/22/000000003-000000002.tsm", "id": 0, "duration": "1.124ms"} 2019-04-10T07:00:28.590286Z info Reading file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/Foiler Spirit/shortTerm/22/_00001.wal", "size": 56} 2019-04-10T07:00:28.590317Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/17/000000001-000000001.tsm", "id": 0, "duration": "0.095ms"} 2019-04-10T07:00:28.590144Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/18/000000002-000000002.tsm", "id": 0, "duration": "1.064ms"} 2019-04-10T07:00:28.630483Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/13", "duration": "42.301ms"} 2019-04-10T07:00:28.631026Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/21/000000001-000000001.tsm", "id": 0, "duration": "0.104ms"} 2019-04-10T07:00:28.640178Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/22", "duration": "51.566ms"} 2019-04-10T07:00:28.640261Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/17", "duration": "52.149ms"} 2019-04-10T07:00:28.640432Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/18", "duration": "51.675ms"} 2019-04-10T07:00:28.645020Z info Reading file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/_internal/monitor/27/_00001.wal", "size": 503422} 2019-04-10T07:00:28.646365Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/_internal/monitor/1/000000019-000000002.tsm", "id": 0, "duration": "1.214ms"} 2019-04-10T07:00:28.646467Z info Reading file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/_internal/monitor/1/_00001.wal", "size": 5106802} 2019-04-10T07:00:28.646654Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/_internal/monitor/19/000000007-000000002.tsm", "id": 0, "duration": "1.444ms"} 2019-04-10T07:00:28.844029Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/Foiler Spirit/shortTerm/21", "duration": "213.480ms"} 2019-04-10T07:00:28.898105Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/_internal/monitor/11/000000007-000000002.tsm", "id": 0, "duration": "52.308ms"} 2019-04-10T07:00:29.013576Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/_internal/monitor/19", "duration": "373.272ms"} 2019-04-10T07:00:29.094428Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/_internal/monitor/6/000000017-000000002.tsm", "id": 0, "duration": "65.578ms"} 2019-04-10T07:00:29.094584Z info Reading file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/_internal/monitor/6/_00001.wal", "size": 1579924} 2019-04-10T07:00:29.196106Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/_internal/monitor/27", "duration": "555.874ms"} 2019-04-10T07:00:29.198867Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/_internal/monitor/9/000000017-000000002.tsm", "id": 0, "duration": "1.942ms"} 2019-04-10T07:00:29.198986Z info Reading file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/_internal/monitor/9/_00001.wal", "size": 50250} 2019-04-10T07:00:29.296559Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/_internal/monitor/11", "duration": "452.252ms"} 2019-04-10T07:00:29.346255Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/signalk/autogen/12/000000017-000000002.tsm", "id": 0, "duration": "36.818ms"} 2019-04-10T07:00:29.346504Z info Reading file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/signalk/autogen/12/_00017.wal", "size": 2866} 2019-04-10T07:00:29.533678Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/_internal/monitor/9", "duration": "337.470ms"} 2019-04-10T07:00:29.533798Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/_internal/monitor/1", "duration": "893.340ms"} 2019-04-10T07:00:29.533842Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/signalk/autogen/12", "duration": "237.225ms"} 2019-04-10T07:00:29.533855Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/_internal/monitor/6", "duration": "520.227ms"} 2019-04-10T07:00:29.534561Z info Opened file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "filestore", "path": "/var/lib/influxdb/data/signalk/autogen/10/000000017-000000002.tsm", "id": 0, "duration": "0.516ms"} 2019-04-10T07:00:29.534633Z info Reading file {"log_id": "0EiWbdjW000", "engine": "tsm1", "service": "cacheloader", "path": "/var/lib/influxdb/wal/signalk/autogen/10/_00001.wal", "size": 1617} 2019-04-10T07:00:29.571967Z info Opened shard {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "index_version": "inmem", "path": "/var/lib/influxdb/data/signalk/autogen/10", "duration": "38.227ms"} 2019-04-10T07:00:29.572293Z info Open store (end) {"log_id": "0EiWbdjW000", "service": "store", "trace_id": "0EiWbe7W001", "op_name": "tsdb_open", "op_event": "end", "op_elapsed": "997.335ms"} 2019-04-10T07:00:29.572321Z info Opened service {"log_id": "0EiWbdjW000", "service": "subscriber"} 2019-04-10T07:00:29.572330Z info Starting monitor service {"log_id": "0EiWbdjW000", "service": "monitor"} 2019-04-10T07:00:29.572338Z info Registered diagnostics client {"log_id": "0EiWbdjW000", "service": "monitor", "name": "build"} 2019-04-10T07:00:29.572349Z info Registered diagnostics client {"log_id": "0EiWbdjW000", "service": "monitor", "name": "runtime"} 2019-04-10T07:00:29.572360Z info Registered diagnostics client {"log_id": "0EiWbdjW000", "service": "monitor", "name": "network"} 2019-04-10T07:00:29.572372Z info Registered diagnostics client {"log_id": "0EiWbdjW000", "service": "monitor", "name": "system"} 2019-04-10T07:00:29.572392Z info Starting precreation service {"log_id": "0EiWbdjW000", "service": "shard-precreation", "check_interval": "10m", "advance_period": "30m"} 2019-04-10T07:00:29.572401Z info Starting snapshot service {"log_id": "0EiWbdjW000", "service": "snapshot"} 2019-04-10T07:00:29.572407Z info Starting continuous query service {"log_id": "0EiWbdjW000", "service": "continuous_querier"} 2019-04-10T07:00:29.572424Z info Starting HTTP service {"log_id": "0EiWbdjW000", "service": "httpd", "authentication": false} 2019-04-10T07:00:29.572433Z info opened HTTP access log {"log_id": "0EiWbdjW000", "service": "httpd", "path": "stderr"} 2019-04-10T07:00:29.572462Z info Storing statistics {"log_id": "0EiWbdjW000", "service": "monitor", "db_instance": "_internal", "db_rp": "monitor", "interval": "10s"} 2019-04-10T07:00:29.572717Z info Listening on HTTP {"log_id": "0EiWbdjW000", "service": "httpd", "addr": "[::]:8086", "https": false} 2019-04-10T07:00:29.572736Z info Starting retention policy enforcement service {"log_id": "0EiWbdjW000", "service": "retention", "check_interval": "30m"} 2019-04-10T07:00:29.572774Z info Listening for signals {"log_id": "0EiWbdjW000"} 2019-04-10T07:00:29.573759Z info Sending usage statistics to usage.influxdata.com {"log_id": "0EiWbdjW000"} 2019-04-10T07:00:40.612901Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:40 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 59fa19db-5b5e-11e9-8001-9cebe83e6922 699 2019-04-10T07:00:40.624787Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SELECT FROM signalk.autogen.vessels WHERE primary = 'true' GROUP BY skey, primary, uuid, sourceRef ORDER BY time DESC LIMIT 1"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:40 +0200] "GET /query?db=signalk&q=select+%2A+from+vessels+where+primary%3D%27true%27+group+by+skey%2Cprimary%2Cuuid%2CsourceRef+order+by+time+desc+limit+1 HTTP/1.1" 200 223 "-" "okhttp/3.13.1" 59fbeaa5-5b5e-11e9-8002-9cebe83e6922 1056 2019-04-10T07:00:40.643135Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SELECT FROM signalk.autogen.config GROUP BY skey ORDER BY time DESC LIMIT 1"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:40 +0200] "GET /query?db=signalk&q=select+%2A+from+config++group+by+skey+order+by+time+desc+limit+1 HTTP/1.1" 200 898 "-" "okhttp/3.13.1" 59feb654-5b5e-11e9-8003-9cebe83e6922 3559 [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:40 +0200] "POST /write?db=signalk&rp=autogen&precision=n&consistency=one HTTP/1.1" 204 0 "-" "okhttp/3.13.1" 5a03b11e-5b5e-11e9-8004-9cebe83e6922 12845 2019-04-10T07:00:41.274242Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:41 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 5a5f0584-5b5e-11e9-8005-9cebe83e6922 333 2019-04-10T07:00:41.281363Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:41 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 5a601bf7-5b5e-11e9-8006-9cebe83e6922 804 2019-04-10T07:00:41.286392Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:41 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 5a60df1d-5b5e-11e9-8007-9cebe83e6922 305 2019-04-10T07:00:41.292204Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:41 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 5a61c238-5b5e-11e9-8008-9cebe83e6922 308 2019-04-10T07:00:41.298876Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:41 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 5a62c617-5b5e-11e9-8009-9cebe83e6922 340 2019-04-10T07:00:41.304175Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:41 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 5a6396e4-5b5e-11e9-800a-9cebe83e6922 301 2019-04-10T07:00:41.312212Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:41 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 5a64d04c-5b5e-11e9-800b-9cebe83e6922 465 2019-04-10T07:00:41.469203Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:00:41 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 5a7cc2d9-5b5e-11e9-800c-9cebe83e6922 393 2019-04-10T07:01:05.782662Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:05 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 68fab50c-5b5e-11e9-800d-9cebe83e6922 524 2019-04-10T07:01:05.824610Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:05 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 6901171c-5b5e-11e9-800e-9cebe83e6922 707 2019-04-10T07:01:05.862462Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:05 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 6906e230-5b5e-11e9-800f-9cebe83e6922 527 2019-04-10T07:01:05.864871Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SELECT FROM signalk.autogen.vessels WHERE skey =~ /navigation.anchor/ GROUP BY skey, primary, uuid, sourceRef ORDER BY time DESC LIMIT 1"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:05 +0200] "GET /query?db=signalk&q=select+%2A+from+vessels+where+skey%3D~+%2Fnavigation.anchor%2F+group+by+skey%2Cprimary%2C+uuid%2CsourceRef+order+by+time+desc+limit+1 HTTP/1.1" 200 57 "-" "okhttp/3.13.1" 69073b5a-5b5e-11e9-8010-9cebe83e6922 601 2019-04-10T07:01:05.902642Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:05 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 690ce58c-5b5e-11e9-8011-9cebe83e6922 1534 2019-04-10T07:01:05.911017Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SELECT FROM signalk.autogen.vessels WHERE skey =~ /meta/ GROUP BY skey, primary, uuid, sourceRef ORDER BY time DESC LIMIT 1"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:05 +0200] "GET /query?db=signalk&q=select+%2A+from+vessels+where+skey%3D~+%2Fmeta%2F+group+by+skey%2Cprimary%2C+uuid%2CsourceRef+order+by+time+desc+limit+1 HTTP/1.1" 200 57 "-" "okhttp/3.13.1" 690e3704-5b5e-11e9-8012-9cebe83e6922 1261 2019-04-10T07:01:10.979381Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:10 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 6c13a83f-5b5e-11e9-8013-9cebe83e6922 387 2019-04-10T07:01:10.991341Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:10 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 6c157b03-5b5e-11e9-8014-9cebe83e6922 449 2019-04-10T07:01:11.028458Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SHOW DATABASES"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:11 +0200] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 133 "-" "okhttp/3.13.1" 6c1b2378-5b5e-11e9-8015-9cebe83e6922 503 2019-04-10T07:01:11.029133Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SELECT FROM signalk.autogen.vessels WHERE skey =~ /[n][a][v][i][g][a][t][i][o][n][.][p][o][s][i][t][i][o][n]/ AND uuid =~ /[u][r][n][:][m][r][n][:][s][i][g][n][a][l][k][:][u][u][i][d][:][d][b][0][7][8][d][7][d][-][6][6][e][3][-][4][c][a][b][-][8][f][b][8][-][9][a][9][9][5][1][3][3][5][e][6][5]/ GROUP BY skey, primary, uuid, sourceRef ORDER BY time DESC LIMIT 1"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:11 +0200] "GET /query?db=signalk&q=select+%2A+from+vessels+where+skey%3D~+%2F%5Bn%5D%5Ba%5D%5Bv%5D%5Bi%5D%5Bg%5D%5Ba%5D%5Bt%5D%5Bi%5D%5Bo%5D%5Bn%5D%5B.%5D%5Bp%5D%5Bo%5D%5Bs%5D%5Bi%5D%5Bt%5D%5Bi%5D%5Bo%5D%5Bn%5D%2F+and+uuid%3D~+%2F%5Bu%5D%5Br%5D%5Bn%5D%5B%3A%5D%5Bm%5D%5Br%5D%5Bn%5D%5B%3A%5D%5Bs%5D%5Bi%5D%5Bg%5D%5Bn%5D%5Ba%5D%5Bl%5D%5Bk%5D%5B%3A%5D%5Bu%5D%5Bu%5D%5Bi%5D%5Bd%5D%5B%3A%5D%5Bd%5D%5Bb%5D%5B0%5D%5B7%5D%5B8%5D%5Bd%5D%5B7%5D%5Bd%5D%5B-%5D%5B6%5D%5B6%5D%5Be%5D%5B3%5D%5B-%5D%5B4%5D%5Bc%5D%5Ba%5D%5Bb%5D%5B-%5D%5B8%5D%5Bf%5D%5Bb%5D%5B8%5D%5B-%5D%5B9%5D%5Ba%5D%5B9%5D%5B9%5D%5B5%5D%5B1%5D%5B3%5D%5B3%5D%5B5%5D%5Be%5D%5B6%5D%5B5%5D%2F+group+by+skey%2Cprimary%2C+uuid%2CsourceRef+order+by+time+desc+limit+1 HTTP/1.1" 200 57 "-" "okhttp/3.13.1" 6c1b3939-5b5e-11e9-8016-9cebe83e6922 743 2019-04-10T07:01:11.037037Z info Executing query {"log_id": "0EiWbdjW000", "service": "query", "query": "SELECT FROM signalk.autogen.resources WHERE uuid =~ /charts/ GROUP BY skey, primary, uuid, sourceRef ORDER BY time DESC LIMIT 1"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:11 +0200] "GET /query?db=signalk&q=select+%2A+from+resources++where+uuid%3D~+%2Fcharts%2F+group+by+skey%2C+primary%2C+uuid%2C+sourceRef+order+by+time+desc+limit+1 HTTP/1.1" 200 57 "-" "okhttp/3.13.1" 6c1c7440-5b5e-11e9-8017-9cebe83e6922 384 [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:11 +0200] "POST /write?db=signalk&rp=autogen&precision=n&consistency=one HTTP/1.1" 500 20 "-" "okhttp/3.13.1" 6c8f6c9d-5b5e-11e9-8018-9cebe83e6922 10000314 2019-04-10T07:01:21.790648Z error [500] - "timeout" {"log_id": "0EiWbdjW000", "service": "httpd"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:22 +0200] "POST /write?db=signalk&rp=autogen&precision=n&consistency=one HTTP/1.1" 500 20 "-" "okhttp/3.13.1" 72acf417-5b5e-11e9-8019-9cebe83e6922 10000285 2019-04-10T07:01:32.050460Z error [500] - "timeout" {"log_id": "0EiWbdjW000", "service": "httpd"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:32 +0200] "POST /write?db=signalk&rp=autogen&precision=n&consistency=one HTTP/1.1" 500 20 "-" "okhttp/3.13.1" 78c96f41-5b5e-11e9-801a-9cebe83e6922 10000448 2019-04-10T07:01:42.303608Z error [500] - "timeout" {"log_id": "0EiWbdjW000", "service": "httpd"} [httpd] 127.0.0.1 - admin [10/Apr/2019:09:01:42 +0200] "POST /write?db=signalk&rp=autogen&precision=n&consistency=one HTTP/1.1" 500 20 "-" "okhttp/3.13.1" 7ee61815-5b5e-11e9-801b-9cebe83e6922 10000333 2019-04-10T07:01:52.557643Z error [500] - "timeout" {"log_id": "0EiWbdjW000", "service": "httpd"} ^C2019-04-10T07:02:26.546803Z info Signal received, initializing clean shutdown... {"log_id": "0EiWbdjW000"} 2019-04-10T07:02:26.546859Z info Waiting for clean shutdown... {"log_id": "0EiWbdjW000"} 2019-04-10T07:02:26.546936Z info Listener closed {"log_id": "0EiWbdjW000", "service": "snapshot"} 2019-04-10T07:02:56.546989Z info Time limit reached, initializing hard shutdown {"log_id": "0EiWbdjW000"}


Note that I have more influxd "timeout" than `SocketTimeoutException` on the signalk side.
yyvus commented 5 years ago

I have the feeling that the restart issue is linked to the socketTimeOut problem. This problem seemed to have disappeared a couple of days a go, but reappeared right after I felt the solution was found... I am sorry about this whole issue. I do my best to find the correct causes to help you solve it.

Could you reproduce the error on your computer with the docker installation? or is it only on my computer?

rob42 commented 5 years ago

I can reproduce it now (without docker) but I dont understand whats happening or causing it. As above I have lots of RAM/CPU etc and the load is not high. There are some old issues on influx github site about similar things, maybe start here https://github.com/influxdata/influxdb/issues?q=8500

krital commented 5 years ago

You could try to adjust the jdk socket timeout which will only make the failures occur faster (and hopefully figure out how to avoid). These properties specify the default connect and read timeout (resp.) for the protocol handler used by java.net.URLConnection:

sun.net.client.defaultConnectTimeout (default: -1) : specifies the timeout (in milliseconds) to establish the connection to the host. For example for http connections it is the timeout when establishing the connection to the http server.

sun.net.client.defaultReadTimeout (default: -1) : specifies the timeout (in milliseconds) when reading from input stream when a connection is established to a resource.

The default values of -1 means this is disabled unless an application has set these values at runtime. You could try setting these to say 5000 (5 secs) and. you should at least get your failures happening faster.

krital commented 5 years ago

By reading the thread (have not tried this locally) I think what is happening is that a batch was being written to influx while restarting Artemis (which blocks the shutdown so as to not loose this data). Eventually the socket times out, data is lost and restart completes, unless in the mean time you restart influx which achieves the same faster.

rob42 commented 5 years ago

under the covers the java driver uses okhttp client. There are timeout configs on that too.

The errors are not being caused by restarting artemis. The connection errors start during normal operation, although it may explain why influxdb needs a restart. If left for a decent interval it might cleanup and recover.

Need to start by getting down to a minimal reproducible case I think.

yyvus commented 5 years ago

On my side I have an other soft communicating with influxdb, and it always have the communication available. Hence I have the feeling that either:

rob42 commented 5 years ago

Looking at those lines (adjusting timestamp for TZ) they are 10 secs apart. The default timeout for the influxdb-java driver appears to be 10sec. But a write should not take 10 sec so underlying that is something that causes the write q to block? There are similar issues on influx github but they should be fixed by now.

rob42 commented 5 years ago

BTW that write is typical of signalk-java writing into the db, either incoming updates or maybe configs etc

rob42 commented 5 years ago

See https://github.com/influxdata/influxdb/issues/13010, seems 1.7.5 is broken, I will try 1.7.4

rob42 commented 5 years ago

Its stable now (at least its survived the test that killed it previously). Apply this (adjust bionic to your linux version):

echo "deb https://repos.influxdata.com/ubuntu bionic stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt-get update
sudo apt-get install influxdb=1.7.4-1
sudo service influxdb restart

I think the reason I had not seen this (way back at the start of this issue) is that I was using 1.7.2. I was only able to see it when I upgraded to 1.7.5, to mirror your install.

rob42 commented 5 years ago

Also need to check this https://github.com/influxdata/influxdb/issues/13252 Would explain some issues Ive seen when testing under very heavy (maximum) load. Basically all runs well on input until you open freeboard-sk. Thats good for a bit then disconnects suddenly and the input queues grow rapidly.

yyvus commented 5 years ago

Works like a charm with influxdb 1.7.4-1.

Summary: with this issue we solved two issues:

Thanks a lot for your efforts and availability. I will provide the docker installation I have setup when cleaned up.

FWI: I am currently setting up a NMEA 2000 and 0183 network that I will connect on the NVIDIA Xavier, on which signalk will be running. I will keep you posted if your are interested.