DecentralizedAmateurPagingNetwork / Core

The DAPNET Core is the core application of DAPNET, responsible for handling transmitter clients, clustering, and providing the REST API.
https://www.afu.rwth-aachen.de/projekte/funkruf-pager-pocsag/funkrufmaster-2-0-dapnet

Create a new cluster #95

Closed untergrundbiber closed 7 years ago

untergrundbiber commented 7 years ago

Hello, I would like to test DAP in a private setting first, so I want to create a new cluster, but that is exactly where I get stuck. No matter what I enter in ClusterConfig.xml, the Core reports errors saying that the cluster cannot be created, and after a short time the Core kills itself. I hope you already support standalone installations at all, because I think the system is really great and would like to try it out myself. Regards

ClusterConfig.xml

<!--
  ~ DAPNET CORE PROJECT
  ~ Copyright (C) 2016
  ~
  ~ Daniel Sialkowski
  ~
  ~ daniel.sialkowski@rwth-aachen.de
  ~
  ~ Institute of High Frequency Technology
  ~ RWTH AACHEN UNIVERSITY
  ~ Melatener Str. 25
  ~ 52074 Aachen
  -->

<!--Do not change anything but the marked parameters!!!-->
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="urn:org:jgroups"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <TCP bind_port="7800"/>
<!-- FOR CORE BEHIND NAT USE
   <TCP bind_port="7800"
         external_addr="44.225.x.x"
         loopback="true"/>
-->
    <PDC cache_dir="knownHosts" />
    <!--Add here known initial hosts: HostA[7800],HostB[7800]...-->
    <TCPPING initial_hosts="127.0.0.1[7800]"
             return_entire_cache="true"
             use_disk_cache="true"
             port_range="0"/>
    <!--Enter the pre shared auth code here-->
    <AUTH auth_class="org.jgroups.auth.SimpleToken"
          auth_value="authkey1"/>
    <MERGE3 check_interval="15000"
            max_interval="10000"
            min_interval="5000"/>
    <FD_SOCK/>
    <FD/>
    <VERIFY_SUSPECT/>
    <pbcast.NAKACK2 use_mcast_xmit="false"/>
    <UNICAST3/>
    <pbcast.STABLE/>
    <MFC/>
    <FRAG2/>
    <pbcast.STATE_TRANSFER/>
    <pbcast.FLUSH timeout="2000"/>
    <!--Enter here NodeName@ClusterName -->
    <pbcast.GMS name="node1@testcluster"
                print_local_addr="true"/>
</config>

Log:

$ java -Dlog4j.configurationFile=../local/config/LogSettings_REST.xml -jar ../target/dapnet-core-1.1.3.4.jar
21:18:45.371 [main] INFO  org.dapnet.core.DAPNETCore - Starting DAPNETCore Version 1.1.3.4 ...
21:18:45.372 [main] INFO  org.dapnet.core.DAPNETCore - Starting TransmissionManager
21:18:45.407 [main] INFO  org.dapnet.core.DAPNETCore - Starting Cluster
21:18:45.424 [main] INFO  org.hibernate.validator.internal.util.Version - HV000001: Hibernate Validator 5.3.5.Final

-------------------------------------------------------------------
GMS: address=node1, cluster=testcluster1.1.3.4, physical address=10.0.1.74:7800
-------------------------------------------------------------------
21:18:47.808 [main] WARN  org.jgroups.protocols.pbcast.FLUSH - node1: waiting for UNBLOCK timed out after 2000 ms
21:18:50.817 [ViewHandler] INFO  org.dapnet.core.cluster.MembershipListener - New View: [node1]
21:18:50.819 [ViewHandler] INFO  org.dapnet.core.model.State - Successfully wrote state to file
21:18:50.822 [main] WARN  org.dapnet.core.cluster.ChannelListener - Creating new Cluster: Check configuration and restart in case you want to join an existing one
21:18:50.822 [main] INFO  org.dapnet.core.cluster.ChannelListener - Creating first node
21:18:52.827 [main] WARN  org.jgroups.protocols.pbcast.FLUSH - node1: unblocking after 2000ms
21:18:52.897 [Incoming-1,testcluster1.1.3.4,node1] WARN  org.dapnet.core.cluster.RpcListener - PutNode Node{status=ONLINE, name='node1'}: VALIDATION_ERROR
21:18:52.900 [main] ERROR org.dapnet.core.cluster.ClusterManager - Response: {node1=sender=node1value=VALIDATION_ERROR, received=true, suspected=false}
21:18:52.900 [main] FATAL org.dapnet.core.cluster.ClusterManager - Insecure Cluster State
21:18:52.900 [main] FATAL org.dapnet.core.cluster.ChannelListener - First node could not been created
21:18:52.900 [main] INFO  org.dapnet.core.DAPNETCore - Stopping DAPNETCore ...
21:18:52.901 [main] INFO  org.dapnet.core.DAPNETCore - DAPNETCore stopped
21:18:52.901 [main] INFO  org.dapnet.core.cluster.ChannelListener - Creating first user
21:18:53.016 [Incoming-1,testcluster1.1.3.4,node1] INFO  org.dapnet.core.cluster.RpcListener - PutUser User{name='admin'}: OK
21:18:53.016 [main] INFO  org.dapnet.core.cluster.ChannelListener - First user successfully updated
21:18:53.033 [main] INFO  org.dapnet.core.DAPNETCore - Starting SchedulerManager
21:18:53.117 [main] INFO  org.quartz.impl.StdSchedulerFactory - Using default implementation for ThreadExecutor
21:18:53.118 [main] INFO  org.quartz.simpl.SimpleThreadPool - Job execution threads will use class loader of thread: main
21:18:53.127 [main] INFO  org.quartz.core.SchedulerSignalerImpl - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
21:18:53.127 [main] INFO  org.quartz.core.QuartzScheduler - Quartz Scheduler v.2.2.3 created.
21:18:53.127 [main] INFO  org.quartz.simpl.RAMJobStore - RAMJobStore initialized.
21:18:53.128 [main] INFO  org.quartz.core.QuartzScheduler - Scheduler meta-data: Quartz Scheduler (v2.2.3) 'DefaultQuartzScheduler' with instanceId 'NON_CLUSTERED'
  Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
  NOT STARTED.
  Currently in standby mode.
  Number of jobs executed: 0
  Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
  Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.

21:18:53.128 [main] INFO  org.quartz.impl.StdSchedulerFactory - Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
21:18:53.128 [main] INFO  org.quartz.impl.StdSchedulerFactory - Quartz scheduler version: 2.2.3
21:18:53.128 [main] INFO  org.quartz.core.QuartzScheduler - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
21:18:53.144 [main] INFO  org.dapnet.core.scheduler.SchedulerManager - SchedulerManager successfully started
21:18:53.145 [main] INFO  org.dapnet.core.DAPNETCore - Starting RestManager
21:18:54.006 [main] INFO  org.dapnet.core.rest.RestManager - RestApi successfully started.
21:18:54.006 [main] INFO  org.dapnet.core.DAPNETCore - Starting Transmitter Server
21:18:54.080 [main] INFO  org.dapnet.core.transmission.TransmitterServer - Server started on port: 43434
21:18:54.080 [main] INFO  org.dapnet.core.DAPNETCore - DAPNETCore started
21:19:00.007 [DefaultQuartzScheduler_Worker-1] INFO  org.dapnet.core.transmission.TransmissionManager - Time sent to transmitters.
Killed

Taronyu commented 7 years ago

The problem was apparently that we recently made the node owner a required field of the class (because of the email address, I assume). When the initial node was created, the admin user was not entered as its owner, and that field must not be empty. That is why creating the first node failed.
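
To illustrate the failure mode described here, the following is a minimal sketch in plain Java. It is not the DAPNET Core code: the class `FirstNodeSketch`, the nested `Node` type, the field `ownerNames`, and the `validate()` helper are all hypothetical names chosen for this example, and the real Core appears to rely on Hibernate Validator (see the log) rather than hand-written checks. The sketch only shows why a node created without an owner is rejected, and why adding the admin user as owner resolves it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and structure are assumptions, not DAPNET Core code.
final class FirstNodeSketch {

    // Stand-in for a node model with a required, non-empty set of owners.
    static final class Node {
        final String name;
        final Set<String> ownerNames = new HashSet<>();

        Node(String name) {
            this.name = name;
        }

        // Returns all violated constraints; an empty list means the node is valid.
        List<String> validate() {
            List<String> violations = new ArrayList<>();
            if (name == null || name.isEmpty()) {
                violations.add("name must not be empty");
            }
            if (ownerNames.isEmpty()) {
                violations.add("ownerNames must not be empty");
            }
            return violations;
        }
    }

    public static void main(String[] args) {
        // First node created without an owner: validation fails.
        Node node = new Node("node1");
        System.out.println("without owner: " + node.validate());

        // Registering the admin user as owner makes the node valid.
        node.ownerNames.add("admin");
        System.out.println("with owner:    " + node.validate());
    }
}
```

In this sketch the first call reports one violation for the missing owner and the second reports none, which mirrors the `VALIDATION_ERROR` on `PutNode` in version 1.1.3.4 and the successful first-node creation in 1.1.3.5.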

untergrundbiber commented 7 years ago

Thanks for the quick fix. One problem remains, though: the Core still kills itself after a few minutes.

$ java -Dlog4j.configurationFile=../local/config/LogSettings_REST.xml -jar ../target/dapnet-core-1.1.3.5.jar
18:49:31.737 [main] INFO  org.dapnet.core.DAPNETCore - Starting DAPNETCore Version 1.1.3.5 ...
18:49:31.739 [main] INFO  org.dapnet.core.DAPNETCore - Starting TransmissionManager
18:49:31.775 [main] INFO  org.dapnet.core.DAPNETCore - Starting Cluster
18:49:31.793 [main] INFO  org.hibernate.validator.internal.util.Version - HV000001: Hibernate Validator 5.3.5.Final

-------------------------------------------------------------------
GMS: address=node1, cluster=testcluster1.1.3.5, physical address=10.0.1.74:7800
-------------------------------------------------------------------
18:49:34.217 [main] WARN  org.jgroups.protocols.pbcast.FLUSH - node1: waiting for UNBLOCK timed out after 2000 ms
18:49:37.227 [main] WARN  org.dapnet.core.cluster.ChannelListener - Creating new Cluster: Check configuration and restart in case you want to join an existing one
18:49:37.230 [main] INFO  org.dapnet.core.model.State - Successfully wrote state to file
18:49:37.230 [main] INFO  org.dapnet.core.cluster.ChannelListener - First node successfully updated
18:49:37.238 [ViewHandler] INFO  org.dapnet.core.cluster.MembershipListener - New View: [node1]
18:49:37.270 [main] INFO  org.dapnet.core.DAPNETCore - Starting SchedulerManager
18:49:37.280 [ViewHandler] INFO  org.dapnet.core.cluster.ClusterManager - Cluster has Quorum
18:49:37.281 [ViewHandler] INFO  org.dapnet.core.model.State - Successfully wrote state to file
18:49:37.306 [main] INFO  org.quartz.impl.StdSchedulerFactory - Using default implementation for ThreadExecutor
18:49:37.307 [main] INFO  org.quartz.simpl.SimpleThreadPool - Job execution threads will use class loader of thread: main
18:49:37.324 [main] INFO  org.quartz.core.SchedulerSignalerImpl - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
18:49:37.324 [main] INFO  org.quartz.core.QuartzScheduler - Quartz Scheduler v.2.2.3 created.
18:49:37.325 [main] INFO  org.quartz.simpl.RAMJobStore - RAMJobStore initialized.
18:49:37.325 [main] INFO  org.quartz.core.QuartzScheduler - Scheduler meta-data: Quartz Scheduler (v2.2.3) 'DefaultQuartzScheduler' with instanceId 'NON_CLUSTERED'
  Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
  NOT STARTED.
  Currently in standby mode.
  Number of jobs executed: 0
  Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
  Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.

18:49:37.326 [main] INFO  org.quartz.impl.StdSchedulerFactory - Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
18:49:37.326 [main] INFO  org.quartz.impl.StdSchedulerFactory - Quartz scheduler version: 2.2.3
18:49:37.326 [main] INFO  org.quartz.core.QuartzScheduler - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
18:49:37.342 [main] INFO  org.dapnet.core.scheduler.SchedulerManager - SchedulerManager successfully started
18:49:37.342 [main] INFO  org.dapnet.core.DAPNETCore - Starting RestManager
18:49:38.189 [main] INFO  org.dapnet.core.rest.RestManager - RestApi successfully started.
18:49:38.195 [main] INFO  org.dapnet.core.DAPNETCore - Starting Transmitter Server
18:49:38.274 [main] INFO  org.dapnet.core.transmission.TransmitterServer - Server started on port: 43434
18:49:38.274 [main] INFO  org.dapnet.core.DAPNETCore - DAPNETCore started
Killed

Taronyu commented 7 years ago

The "Killed" does not come from the Core; it looks like it comes from the OS. Could it be that memory (RAM/swap) is full and the process is being killed because of that? You could try raising the Java memory limits (`-Xms256M -Xmx512M`).
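
One way to check which heap limits the JVM actually picked up is a small standalone program like the sketch below. It is not part of DAPNET; the class name `HeapLimits` and the helper `toMiB` are made up for this example. It only uses the standard `Runtime` methods `maxMemory()`, `totalMemory()`, and `freeMemory()`.

```java
// Hypothetical helper, not part of DAPNET Core: prints the JVM's view of its heap.
final class HeapLimits {

    // Converts a byte count to whole mebibytes (rounded down).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap may grow to (influenced by -Xmx).
        System.out.println("max heap:       " + toMiB(rt.maxMemory()) + " MiB");
        // Heap currently reserved by the JVM (initially influenced by -Xms).
        System.out.println("committed heap: " + toMiB(rt.totalMemory()) + " MiB");
        // Unused part of the committed heap.
        System.out.println("free heap:      " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

On Java 11 or newer this can be run directly with the suggested flags, for example `java -Xms256M -Xmx512M HeapLimits.java`. The reported maximum can be slightly below the `-Xmx` value depending on the garbage collector. Note that these numbers cover only the Java heap; the OS or container limit must still leave room for the JVM's non-heap memory, otherwise the process can be killed from outside regardless of the heap settings.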

untergrundbiber commented 7 years ago

Indeed, that was the problem. I gave the container more RAM and raised the Java memory pool as you suggested, and it runs now. I am a bit embarrassed that I did not think of taking a quick look at htop myself :smile:

Taronyu commented 7 years ago

No problem. Glad it works now.