onyx-platform / onyx-dashboard

Dashboard for the Onyx distributed processing system
http://www.onyxplatform.org/
Eclipse Public License 1.0

Onyx Dashboard Not Showing Peer/Job Information #64

Closed: stevenmccord closed this issue 7 years ago

stevenmccord commented 8 years ago

When the Docker image starts, the UI comes up and we see the tenancy-id, but no job information is displayed. Once we select the tenancy-id, it errors out as follows:

Starting Sente
Starting HTTP Server
Http-kit server is running at http://localhost:80/
Connected:  10.16.27.1 84c4a6af-a235-4f6d-9b41-a71106ac14e5
Connected:  10.16.27.1 6914f4a8-6e4e-4bed-b1fa-f4d118a7dc33
16-Sep-09 17:50:27 corey-haim-master-latest-1473443379-lpsjb INFO [onyx-dashboard.tenancy] - Starting Track Tenancy manager for tenancy  1 {:zookeeper/address "10.16.27.224:2181", :onyx.peer/job-scheduler :not-required/for-peer-sub, :onyx.messaging/impl :aeron, :onyx.messaging/bind-addr "localhost", :onyx/tenancy-id "1"} 6914f4a8-6e4e-4bed-b1fa-f4d118a7dc33
16-Sep-09 17:50:27 corey-haim-master-latest-1473443379-lpsjb INFO [onyx.log.zookeeper] - Starting ZooKeeper client connection. If Onyx hangs here it may indicate a difficulty connecting to ZooKeeper.
16-Sep-09 17:50:27 corey-haim-master-latest-1473443379-lpsjb FATAL [onyx.log.zookeeper] -
                              java.lang.Thread.run              Thread.java:  745
java.util.concurrent.ThreadPoolExecutor$Worker.run  ThreadPoolExecutor.java:  617
 java.util.concurrent.ThreadPoolExecutor.runWorker  ThreadPoolExecutor.java: 1142
                                               ...
                 clojure.core.async/thread-call/fn                async.clj:  434
                          onyx.log.zookeeper/fn/fn            zookeeper.clj:  253
                                               ...
                             onyx.log.zookeeper/fn            zookeeper.clj:  568
                             onyx.log.zookeeper/fn            zookeeper.clj:  570
      onyx.monitoring.measurements/measure-latency         measurements.clj:   11
                          onyx.log.zookeeper/fn/fn            zookeeper.clj:  571
    onyx.log.zookeeper/clean-up-broken-connections            zookeeper.clj:   80
                       onyx.log.zookeeper/fn/fn/fn            zookeeper.clj:  574
       onyx.compression.nippy/zookeeper-decompress                nippy.clj:   35
                               taoensso.nippy/thaw                nippy.clj:  744
                     taoensso.nippy/thaw/thaw-data                nippy.clj:  718
                      taoensso.nippy/thaw-from-in!                nippy.clj:  582
                        taoensso.nippy/read-sm-kvs                nippy.clj:  487
                   taoensso.encore/repeatedly-into               encore.clj: 1308
                     taoensso.nippy/read-sm-kvs/fn                nippy.clj:  488
                      taoensso.nippy/thaw-from-in!                nippy.clj:  582
                        taoensso.nippy/read-sm-kvs                nippy.clj:  487
                   taoensso.encore/repeatedly-into               encore.clj: 1304
                     taoensso.nippy/read-sm-kvs/fn                nippy.clj:  488
                      taoensso.nippy/thaw-from-in!                nippy.clj:  629
                       taoensso.nippy/read-custom!                nippy.clj:  509
                              clojure.core/ex-info                 core.clj: 4617
clojure.lang.ExceptionInfo: No reader provided for custom type with internal id: 19
    internal-type-id: 19
clojure.lang.ExceptionInfo: Thaw failed against type-id: 19
    type-id: 19
clojure.lang.ExceptionInfo: Thaw failed against type-id: 112
    type-id: 112
clojure.lang.ExceptionInfo: Thaw failed against type-id: 112
    type-id: 112
clojure.lang.ExceptionInfo: Thaw failed: Decryption/decompression failure, or data unfrozen/damaged.
    opts: {:v1-compatibility? false, :compressor :auto, :encryptor :auto}

We are currently using these Onyx dependencies:

org.onyxplatform/lib-onyx "0.9.7.1"
org.onyxplatform/onyx "0.9.9"
org.onyxplatform/onyx-kafka "0.9.9.0"
org.onyxplatform/onyx-metrics "0.9.9.0"

Also, as we discussed in the Gitter channel, we have confirmed that there are log entries in the ZooKeeper instance:

bash-4.3# . zkCli.sh
Connecting to localhost:2181
2016-09-09 15:03:50,813 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT
2016-09-09 15:03:50,823 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=amanda-bynes-master-latest-1473429486-pf46x
2016-09-09 15:03:50,823 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_72-internal
2016-09-09 15:03:50,830 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2016-09-09 15:03:50,830 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-1.8-openjdk/jre
2016-09-09 15:03:50,830 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/opt/zookeeper/bin/../lib/log4j-1.2.16.jar:/opt/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/zookeeper/bin/../zookeeper-3.4.8.jar:/opt/zookeeper/bin/../src/java/lib/*.jar:/opt/zookeeper/bin/../conf:
2016-09-09 15:03:50,831 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/lib/jvm/java-1.8-openjdk/jre/lib/amd64/server:/usr/lib/jvm/java-1.8-openjdk/jre/lib/amd64:/usr/lib/jvm/java-1.8-openjdk/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-09-09 15:03:50,831 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2016-09-09 15:03:50,831 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2016-09-09 15:03:50,832 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2016-09-09 15:03:50,832 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2016-09-09 15:03:50,833 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.19.0-25-generic
2016-09-09 15:03:50,833 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2016-09-09 15:03:50,833 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2016-09-09 15:03:50,833 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/opt/zookeeper/bin
2016-09-09 15:03:50,838 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@3eb07fd3
Welcome to ZooKeeper!
2016-09-09 15:03:50,896 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2016-09-09 15:03:51,073 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
2016-09-09 15:03:51,096 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x1570f40b758002e, negotiated timeout = 30000

[zk: localhost:2181(CONNECTED) 0] ls /onyx/1/log/
Command failed: java.lang.IllegalArgumentException: Path must not end with / character
[zk: localhost:2181(CONNECTED) 1] ls /onyx/1/log
[entry-0000000020, entry-0000000021, entry-0000000022, entry-0000000023, entry-0000000024, entry-0000000025, entry-0000000026, entry-0000000060, entry-0000000061, entry-0000000016, entry-0000000017, entry-0000000018, entry-0000000019, entry-0000000052, entry-0000000053, entry-0000000010, entry-0000000054, entry-0000000011, entry-0000000055, entry-0000000012, entry-0000000056, entry-0000000013, entry-0000000057, entry-0000000014, entry-0000000058, entry-0000000015, entry-0000000059, entry-0000000050, entry-0000000051, entry-0000000005, entry-0000000049, entry-0000000006, entry-0000000007, entry-0000000008, entry-0000000009, entry-0000000041, entry-0000000042, entry-0000000043, entry-0000000000, entry-0000000044, entry-0000000001, entry-0000000045, entry-0000000002, entry-0000000046, entry-0000000003, entry-0000000047, entry-0000000004, entry-0000000048, entry-0000000040, entry-0000000038, entry-0000000039, entry-0000000030, entry-0000000031, entry-0000000032, entry-0000000033, entry-0000000034, entry-0000000035, entry-0000000036, entry-0000000037, entry-0000000027, entry-0000000028, entry-0000000029]

We started out using our own uberjar built from source, but we also tried this with the 0.9.9.0 Docker image and got the same error.

Another note: when Onyx does start in the peer, it reports a job-id that doesn't match the above, so it looks like there are different job ids?

We have been chatting about this in Gitter, and are opening a ticket here to track it.

lbradstreet commented 7 years ago

Turns out that this was due to explicitly depending on a different version of nippy in the peer.
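
A minimal sketch of what the fix can look like in the peer's `project.clj`, assuming the peer is built with Leiningen. The project name and the Clojure version are hypothetical; the Onyx coordinates are the ones listed in the report above. The idea is to leave out any explicit `com.taoensso/nippy` entry, so that the nippy version Onyx pulls in transitively is the one the peer uses when writing the ZooKeeper log:

```clojure
;; project.clj for the peer (sketch; project name and Clojure version are assumptions)
(defproject my-onyx-peer "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [org.onyxplatform/onyx "0.9.9"]
                 [org.onyxplatform/lib-onyx "0.9.7.1"]
                 [org.onyxplatform/onyx-kafka "0.9.9.0"]
                 [org.onyxplatform/onyx-metrics "0.9.9.0"]
                 ;; Deliberately no explicit [com.taoensso/nippy "..."] entry:
                 ;; an explicit pin here would override the version that
                 ;; Onyx itself depends on.
                 ])
```

Running `lein deps :tree` in the peer project prints the resolved dependency tree and reports version conflicts, which is one way to check which nippy version actually ends up on the classpath.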