To disable the legacy master-worker mechanism in a multi-server setup of Apache HugeGraph, set the server.role configuration option to "computer". This can be done in your configuration file as follows:
server.role=computer
This setting ensures that the RoleElectionStateMachine is not initialized, effectively disabling the legacy master-worker mechanism. Here is the relevant configuration option:
public static final ConfigOption<String> SERVER_ROLE =
        new ConfigOption<>(
                "server.role",
                "The role of nodes in the cluster, available types are " +
                "[master, worker, computer]",
                allowValues("master", "worker", "computer"),
                "master"
        );
By setting server.role to "computer", supportRoleElection will be false and the role state machine will not be initialized [1][2].
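For example, a minimal rest-server.properties fragment for a node that opts out of role election might look like this (the id and URL are illustrative values, not defaults):

```properties
# Illustrative rest-server.properties fragment
server.id=server-3
server.role=computer
restserver.url=http://127.0.0.1:8082
```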
To continue talking to Dosu, mention @dosu.
@dosu Thanks for the direct solution, could you also provide more details about the legacy master-worker role code design & usage (as detailed and systematic as possible)?
The legacy master-worker role design in Apache HugeGraph is implemented with an enumeration called NodeRole. This enum defines three roles: MASTER, WORKER, and COMPUTER. Each role is associated with a unique byte code and a name. The NodeRole enum also includes methods to check the role type and retrieve the role's code and name.
Here is the relevant code:
public enum NodeRole implements SerialEnum {

    MASTER(1, "master"),
    WORKER(2, "worker"),
    COMPUTER(3, "computer");

    private final byte code;
    private final String name;

    NodeRole(int code, String name) {
        assert code < 256;
        this.code = (byte) code;
        this.name = name;
    }

    static {
        SerialEnum.register(NodeRole.class);
    }

    @Override
    public byte code() {
        return this.code;
    }

    public String string() {
        return this.name;
    }

    public boolean master() {
        return this == MASTER;
    }

    public boolean worker() {
        return this == WORKER;
    }

    public boolean computer() {
        return this == COMPUTER;
    }
}
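The enum can be exercised on its own; the sketch below is a trimmed, self-contained copy that omits the HugeGraph-specific SerialEnum registration but keeps the same codes and accessors:

```java
// Trimmed, self-contained copy of NodeRole: same role codes and
// accessors, with the HugeGraph SerialEnum registration omitted.
public class NodeRoleDemo {

    enum NodeRole {

        MASTER(1, "master"),
        WORKER(2, "worker"),
        COMPUTER(3, "computer");

        private final byte code;
        private final String name;

        NodeRole(int code, String name) {
            assert code < 256;
            this.code = (byte) code;
            this.name = name;
        }

        public byte code() {
            return this.code;
        }

        public String string() {
            return this.name;
        }

        public boolean computer() {
            return this == COMPUTER;
        }
    }

    public static void main(String[] args) {
        // Parse a server.role config value into a NodeRole
        NodeRole role = NodeRole.valueOf("computer".toUpperCase());
        System.out.println(role.string() + " -> code " + role.code());
    }
}
```

Running it prints `computer -> code 3`, matching the byte code the enum assigns to the computer role.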
Roles defined:
- MASTER: the master node, with code 1 and name "master".
- WORKER: a worker node, with code 2 and name "worker".
- COMPUTER: a computer node, with code 3 and name "computer".
Methods:
- code(): returns the byte code of the role.
- string(): returns the name of the role.
- master(): checks whether the role is MASTER.
- worker(): checks whether the role is WORKER.
- computer(): checks whether the role is COMPUTER.
Additionally, the RoleListener interface in the org.apache.hugegraph.masterelection package defines methods for handling role changes and errors within the system. This interface includes methods such as onAsRoleMaster, onAsRoleWorker, onAsRoleCandidate, unknown, onAsRoleAbdication, and error, each taking a StateMachineContext parameter to manage state transitions and handle errors.
Here is the relevant code for the RoleListener interface:
package org.apache.hugegraph.masterelection;

public interface RoleListener {

    void onAsRoleMaster(StateMachineContext context);

    void onAsRoleWorker(StateMachineContext context);

    void onAsRoleCandidate(StateMachineContext context);

    void unknown(StateMachineContext context);

    void onAsRoleAbdication(StateMachineContext context);

    void error(StateMachineContext context, Throwable e);
}
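A listener can be sketched against this interface shape; in the sketch below, Context is a stand-in for StateMachineContext (only the node() and epoch() accessors are modeled, and only two of the callbacks), so it compiles independently of HugeGraph:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a RoleListener-style callback; Context is a stand-in for
// StateMachineContext, modeling only the node() and epoch() accessors.
public class RoleListenerDemo {

    interface Context {
        String node();
        Integer epoch();
    }

    interface Listener {
        void onAsRoleMaster(Context context);
        void onAsRoleWorker(Context context);
    }

    static final List<String> LOG = new ArrayList<>();

    public static void main(String[] args) {
        Listener listener = new Listener() {
            @Override
            public void onAsRoleMaster(Context context) {
                // Record each transition as "epoch:node:role"
                LOG.add(context.epoch() + ":" + context.node() + ":master");
            }

            @Override
            public void onAsRoleWorker(Context context) {
                LOG.add(context.epoch() + ":" + context.node() + ":worker");
            }
        };

        Context context = new Context() {
            @Override
            public String node() { return "node-1"; }

            @Override
            public Integer epoch() { return 1; }
        };

        // Simulate two role transitions driven by the state machine
        listener.onAsRoleMaster(context);
        listener.onAsRoleWorker(context);
        System.out.println(LOG);
    }
}
```

This mirrors how the test below records each callback as a log entry keyed by epoch, node, and role.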
The RoleElectionStateMachineTest class provides a test implementation of the role election state machine, demonstrating how roles transition and how the system logs these transitions. The LogEntry class within this test captures the epoch, node, and role of each state change, and the TestConfig class provides configuration details for the state machine.
Here is an excerpt from the RoleElectionStateMachineTest class:
public class RoleElectionStateMachineTest {

    private static class LogEntry {

        private final Integer epoch;
        private final String node;
        private final Role role;

        enum Role {
            master,
            worker,
            candidate,
            abdication,
            unknown
        }

        public LogEntry(Integer epoch, String node, Role role) {
            this.epoch = epoch;
            this.node = node;
            this.role = role;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof LogEntry)) {
                return false;
            }
            LogEntry logEntry = (LogEntry) obj;
            return Objects.equals(this.epoch, logEntry.epoch) &&
                   Objects.equals(this.node, logEntry.node) &&
                   this.role == logEntry.role;
        }

        @Override
        public int hashCode() {
            return Objects.hash(this.epoch, this.node, this.role);
        }

        @Override
        public String toString() {
            return "LogEntry{" +
                   "epoch=" + this.epoch +
                   ", node='" + this.node + '\'' +
                   ", role=" + this.role +
                   '}';
        }
    }

    private static class TestConfig implements Config {

        private final String node;

        public TestConfig(String node) {
            this.node = node;
        }

        @Override
        public String node() {
            return this.node;
        }

        @Override
        public String url() {
            return "http://127.0.0.1:8080";
        }

        @Override
        public int exceedsFailCount() {
            return 2;
        }

        @Override
        public long randomTimeoutMillisecond() {
            return 400;
        }

        @Override
        public long heartBeatIntervalSecond() {
            return 1;
        }

        @Override
        public int masterDeadTimes() {
            return 5;
        }

        @Override
        public long baseTimeoutMillisecond() {
            return 100;
        }
    }

    @Test
    public void testStateMachine() throws InterruptedException {
        final int MAX_COUNT = 200;
        CountDownLatch stop = new CountDownLatch(4);
        List<LogEntry> logRecords = Collections.synchronizedList(new ArrayList<>(MAX_COUNT));
        List<String> masterNodes = Collections.synchronizedList(new ArrayList<>(MAX_COUNT));
        RoleListener callback = new RoleListener() {

            @Override
            public void onAsRoleMaster(StateMachineContext context) {
                Integer epochId = context.epoch();
                String node = context.node();
                logRecords.add(new LogEntry(epochId, node, LogEntry.Role.master));
                if (logRecords.size() > MAX_COUNT) {
                    context.stateMachine().shutdown();
                }
                Utils.println("master node: " + node);
                masterNodes.add(node);
            }

            @Override
            public void onAsRoleWorker(StateMachineContext context) {
                Integer epochId = context.epoch();
                String node = context.node();
                logRecords.add(new LogEntry(epochId, node, LogEntry.Role.worker));
                if (logRecords.size() > MAX_COUNT) {
                    context.stateMachine().shutdown();
                }
            }

            @Override
            public void onAsRoleCandidate(StateMachineContext context) {
                Integer epochId = context.epoch();
                String node = context.node();
                logRecords.add(new LogEntry(epochId, node, LogEntry.Role.candidate));
                if (logRecords.size() > MAX_COUNT) {
                    context.stateMachine().shutdown();
                }
            }
        };
        // ... remainder of the test elided
    }
}
This design allows for easy role identification, role-specific logic implementation, and handling of role transitions and errors within the HugeGraph system [1][2][3].
@dosu What do the three server.role configurations mean?
The three server.role configurations define the role of nodes in the HugeGraph cluster. The available types are master (the default), worker, and computer. Nodes set to master or worker take part in the legacy master-worker role election, while setting computer opts a node out of role election entirely.
Here is the relevant configuration option:
public static final ConfigOption<String> SERVER_ROLE =
        new ConfigOption<>(
                "server.role",
                "The role of nodes in the cluster, available types are " +
                "[master, worker, computer]",
                allowValues("master", "worker", "computer"),
                "master"
        );
By setting the server.role configuration, you can define the specific role of each node in your HugeGraph cluster [1][2].
@dosu How would you start a minimal cluster with each of the three server.role values?
To start a minimal cluster with each of the three server.role values (master, worker, computer), configure the server.role option in the rest-server.properties file of each server instance. Here is an example configuration for each role:
Master Node Configuration:
server.id=server-master
server.role=master
restserver.url=http://127.0.0.1:8080
Worker Node Configuration:
server.id=server-worker
server.role=worker
restserver.url=http://127.0.0.1:8081
Computer Node Configuration:
server.id=server-computer
server.role=computer
restserver.url=http://127.0.0.1:8082
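The three files above can also be generated mechanically; the sketch below is only illustrative (the conf-* directory names and the 8080/8081/8082 port scheme are assumptions for this example, not HugeGraph conventions):

```shell
# Generate one rest-server.properties per role (illustrative paths/ports)
i=0
for role in master worker computer; do
  port=$((8080 + i)); i=$((i + 1))
  mkdir -p "conf-$role"
  cat > "conf-$role/rest-server.properties" <<EOF
server.id=server-$role
server.role=$role
restserver.url=http://127.0.0.1:$port
EOF
done
grep server.role conf-computer/rest-server.properties  # prints: server.role=computer
```

Each instance is then started with its own config directory; the point of the loop is simply that server.id, server.role, and the port stay unique per node.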
Each configuration file should be saved separately and used to start the corresponding server instance. The server.id and restserver.url values must be unique for each server to avoid conflicts.
Refer to the rest-server.properties configuration file for additional options that may be relevant to your setup, such as ssl.keystore_file, restserver.max_worker_threads, and gremlinserver.url [1][2][3].
@dosu How to start a single cluster with backend=rocksdb?
To start a single HugeGraph cluster with backend=rocksdb, follow these steps:
Update hugegraph.properties:
backend=rocksdb
serializer=binary
rocksdb.data_path=.
rocksdb.wal_path=.
Initialize the database (required on first startup or if a new configuration was manually added under 'conf/graphs/'):
cd *hugegraph-${version}
bin/init-store.sh
Update rest-server.properties:
restserver.url=http://127.0.0.1:8080
gremlinserver.url=http://127.0.0.1:8182
graphs=[hugegraph:conf/hugegraph.properties]
Update gremlin-server.yaml:
host: 127.0.0.1
port: 8182
graphs: {
  hugegraph: conf/hugegraph.properties
}
Start the server:
bin/start-hugegraph.sh
Starting HugeGraphServer...
Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK
Verify the server and graphs:
curl http://127.0.0.1:8080/graphs/
This will list the available graphs, including the one configured with RocksDB.
This will start the HugeGraph server with RocksDB as the backend [1][2][3].
@dosu What is the recommended jdk version for compiling hugegraph?
The recommended JDK version for compiling HugeGraph is JDK 1.8.0_111. However, for running HugeGraph-Server, it is recommended to use Java 11 (JDK 11) to ensure better security, especially in production environments or those exposed to the public network [1][2][3].
@dosu How to run standalone mode in docker after hugegraph is built locally?
To run HugeGraph in standalone mode using Docker after building it locally, follow these steps:
Build the Docker image:
Ensure you are in the directory containing the Dockerfile and run the following command to build the Docker image:
docker build -t hugegraph-pd .
Run the Docker container: Once the image is built, you can run the container in standalone mode using the following command:
docker run -itd --name=graph -p 8080:8080 hugegraph-pd
This command will start the HugeGraph server in a Docker container, exposing port 8080 for access. The docker-entrypoint.sh script will be executed as specified in the Dockerfile.
Alternatively, you can use Docker Compose to manage the deployment:
Create a docker-compose.yaml file with the following content:
version: '3'
services:
  graph:
    image: hugegraph-pd
    ports:
      - 8080:8080
Run the following command to start the HugeGraph server:
docker-compose up -d
This will start the HugeGraph server and manage other HugeGraph-related instances if needed [1][2].
@dosu Why use hugegraph-pd instead of hugegraph-server?
@dosu But I only need hugegraph server to run standalone hugegraph cluster locally?
To run a standalone HugeGraph server locally, you can use the hugegraph Docker image. Here is the command to start a HugeGraph server with RocksDB as the backend:
docker run -itd --name=graph -p 8080:8080 hugegraph/hugegraph
This command starts the HugeGraph server in the background; you can access it at http://127.0.0.1:8080 [1].
If you prefer to use a startup script, follow these steps:
Update the hugegraph.properties file with the desired backend configuration. For example, for RocksDB:
backend=rocksdb
serializer=binary
rocksdb.data_path=.
rocksdb.wal_path=.
Initialize the database (required on first startup):
cd *hugegraph-${version}
bin/init-store.sh
Start the server:
bin/start-hugegraph.sh
This will start the HugeGraph server and connect it to the configured backend storage [1].
@dosu How do I make a hugegraph/hugegraph docker image from my own hugegraph source code?
To create a hugegraph/hugegraph Docker image from your own HugeGraph source code, follow these steps:
Ensure you have Docker and Maven installed on your system.
Navigate to the root directory of your HugeGraph source code.
Create a Dockerfile in the root directory with the following content:
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Dockerfile for HugeGraph Server
# 1st stage: build source code
FROM maven:3.9.0-eclipse-temurin-11 AS build
COPY . /pkg
WORKDIR /pkg
ARG MAVEN_ARGS
RUN mvn package $MAVEN_ARGS -e -B -ntp -Dmaven.test.skip=true -Dmaven.javadoc.skip=true && pwd && ls -l && rm \
./hugegraph-server/*.tar.gz && rm ./hugegraph-pd/*.tar.gz && rm ./hugegraph-store/*.tar.gz
# 2nd stage: runtime env
# Note: ZGC (The Z Garbage Collector) is only supported on ARM-Mac with java > 13
FROM openjdk:11-slim
COPY --from=build /pkg/hugegraph-server/apache-hugegraph-server-incubating-*/ /hugegraph-server/
LABEL maintainer="HugeGraph Docker Maintainers <dev@hugegraph.apache.org>"
# TODO: use g1gc or zgc as default
ENV JAVA_OPTS="-XX:+UnlockExperimentalVMOptions -XX:+UseContainerSupport -XX:MaxRAMPercentage=50 -XshowSettings:vm" \
HUGEGRAPH_HOME="hugegraph-server"
#COPY . /hugegraph/hugegraph-server
WORKDIR /hugegraph-server/
# 1. Install environment and init HugeGraph Sever
RUN set -x \
&& apt-get -q update \
&& apt-get -q install -y --no-install-recommends --no-install-suggests \
dumb-init \
procps \
curl \
lsof \
vim \
cron \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* \
&& service cron start \
&& pwd && cd /hugegraph-server/ \
&& sed -i "s/^restserver.url.*$/restserver.url=http:\/\/0.0.0.0:8080/g" ./conf/rest-server.properties
# 2. Init docker script
COPY hugegraph-server/hugegraph-dist/docker/scripts/remote-connect.groovy ./scripts
COPY hugegraph-server/hugegraph-dist/docker/scripts/detect-storage.groovy ./scripts
COPY hugegraph-server/hugegraph-dist/docker/docker-entrypoint.sh .
RUN chmod 755 ./docker-entrypoint.sh
EXPOSE 8080
VOLUME /hugegraph-server
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["./docker-entrypoint.sh"]
Build the Docker image using the following command:
docker build -t hugegraph/hugegraph .
This command will create a Docker image named hugegraph/hugegraph from your HugeGraph source code [1][2].
@dosu How do I make a hugegraph/hugegraph docker image from my own hugegraph source code? Only for backend=rocksdb, and start a standalone cluster.
To make a hugegraph/hugegraph Docker image from your own HugeGraph source code for backend=rocksdb and start a standalone cluster, follow these steps:
Build the Docker Image:
First, build the Docker image from your HugeGraph source code. Assuming you have Docker installed and your source code is ready, create a Dockerfile in the root of your HugeGraph source directory with the following content:
# NOTE: a JRE-only image cannot compile the source; use a JDK + Maven image
FROM maven:3.9-eclipse-temurin-11
# Set the working directory
WORKDIR /hugegraph
# Copy the HugeGraph source code into the container
COPY . /hugegraph
# Build HugeGraph (skip tests to speed up the image build)
RUN mvn package -DskipTests
# Expose the REST server port
EXPOSE 8080
# Set the entrypoint (path depends on where the build unpacks the distribution)
ENTRYPOINT ["./bin/start-hugegraph.sh"]
Then, build the Docker image using the following command:
docker build -t hugegraph/hugegraph .
Start the Standalone Cluster:
You can start the HugeGraph server using Docker with RocksDB as the backend. Use the following docker run command to start the server:
docker run -itd --name=graph -p 8080:8080 hugegraph/hugegraph
Optional:
- docker exec -it graph bash to enter the container and perform operations.
- docker run -itd --name=graph -p 8080:8080 -e PRELOAD="true" hugegraph/hugegraph to start with a built-in example graph; you can use the RESTful API to verify the result.
- -e PASSWORD=123456 to enable auth mode and set the password for admin.
Alternatively, you can use Docker Compose to manage the deployment. Create a docker-compose.yaml file with the following content:
version: '3'
services:
  graph:
    image: hugegraph/hugegraph
    ports:
      - 8080:8080
Then, start the server using Docker Compose:
docker-compose up -d
This will start a standalone HugeGraph server with RocksDB as the backend [1][2].
@dosu I ran into the following problem when restarting container after compiling with the latest master branch code:
java.lang.IllegalArgumentException: The server with name 'server-1' already in cluster
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:167) ~[guava-31.0.1-android.jar:?]
at org.apache.hugegraph.util.E.checkArgument(E.java:52) ~[hugegraph-common-1.5.0.jar:1.5.0]
at org.apache.hugegraph.task.ServerInfoManager.initServerInfo(ServerInfoManager.java:110) ~[hugegraph-core-1.5.0.jar:1.5.0]
at org.apache.hugegraph.StandardHugeGraph.serverStarted(StandardHugeGraph.java:288) ~[hugegraph-core-1.5.0.jar:1.5.0]
at org.apache.hugegraph.core.GraphManager.serverStarted(GraphManager.java:484) ~[hugegraph-api-1.5.0.jar:0.71.0.0]
at org.apache.hugegraph.core.GraphManager.init(GraphManager.java:125) ~[hugegraph-api-1.5.0.jar:0.71.0.0]
at org.apache.hugegraph.server.ApplicationConfig$GraphManagerFactory$1.onEvent(ApplicationConfig.java:143) ~[hugegraph-api-1.5.0.jar:0.71.0.0]
at org.glassfish.jersey.server.internal.monitoring.CompositeApplicationEventListener.onEvent(CompositeApplicationEventListener.java:49) ~[jersey-server-3.0.3.jar:?]
at org.glassfish.jersey.server.internal.monitoring.MonitoringContainerListener.onStartup(MonitoringContainerListener.java:56) ~[jersey-server-3.0.3.jar:?]
at org.glassfish.jersey.server.ApplicationHandler.onStartup(ApplicationHandler.java:711) ~[jersey-server-3.0.3.jar:?]
at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.start(GrizzlyHttpContainer.java:330) ~[jersey-container-grizzly2-http-3.0.3.jar:?]
at org.glassfish.grizzly.http.server.HttpHandlerChain.start(HttpHandlerChain.java:376) ~[grizzly-http-server-3.0.1.jar:3.0.1]
at org.glassfish.grizzly.http.server.HttpServer.setupHttpHandler(HttpServer.java:268) ~[grizzly-http-server-3.0.1.jar:3.0.1]
at org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:245) ~[grizzly-http-server-3.0.1.jar:3.0.1]
at org.apache.hugegraph.server.RestServer.start(RestServer.java:70) ~[hugegraph-api-1.5.0.jar:0.71.0.0]
at org.apache.hugegraph.server.RestServer.start(RestServer.java:177) ~[hugegraph-api-1.5.0.jar:0.71.0.0]
at org.apache.hugegraph.dist.HugeRestServer.start(HugeRestServer.java:31) ~[hugegraph-dist-1.5.0.jar:1.5.0]
at org.apache.hugegraph.dist.HugeGraphServer.<init>(HugeGraphServer.java:60) ~[hugegraph-dist-1.5.0.jar:1.5.0]
at org.apache.hugegraph.dist.HugeGraphServer.main(HugeGraphServer.java:120) ~[hugegraph-dist-1.5.0.jar:1.5.0]
To resolve the java.lang.IllegalArgumentException: The server with name 'server-1' already in cluster error when restarting a HugeGraph container, ensure that the server information is removed before restarting. The ServerInfoManager class has a method removeSelfServerInfo that removes the server's own information.
Here is a code snippet that demonstrates how to remove the server information before restarting the container:
public synchronized boolean close() {
    this.closed = true;
    if (!this.dbExecutor.isShutdown()) {
        this.removeSelfServerInfo();
        this.call(() -> {
            try {
                this.tx().close();
            } catch (ConnectionException ignored) {
                // ConnectionException means no connection established
            }
            this.graph.closeTx();
            return null;
        });
    }
    return true;
}
Additionally, ensure that the server.id in your rest-server.properties configuration file is unique for each server in the cluster. The default value is server-1, and if multiple servers use the same ID it will cause conflicts. Change the server.id to a unique value for each server:
server.id=unique-server-id
Ensure that this method is called before the container is restarted, so the server information is removed and the IllegalArgumentException is avoided. Also verify that the server.id is unique across your cluster configuration.
Refer to the HugeGraph configuration documentation for more details on the rest-server.properties file and other configuration options:
# rest-server.properties
server.id=server-1
server.role=master
restserver.url=http://127.0.0.1:8080
# Other configurations...
By following these steps, you can avoid conflicts and ensure a smooth restart of your HugeGraph container [1][2][3].
@dosu If the server is killed suddenly. How to execute removeSelfServerInfo? This ServerInfo will always be there?
To ensure removeSelfServerInfo is executed when the server is shut down, handle the shutdown gracefully so that the close method is called. The close method in ServerInfoManager already calls removeSelfServerInfo if the dbExecutor is not shut down.
Here is the relevant part of the close method:
public synchronized boolean close() {
    this.closed = true;
    if (!this.dbExecutor.isShutdown()) {
        this.removeSelfServerInfo();
        this.call(() -> {
            try {
                this.tx().close();
            } catch (ConnectionException ignored) {
                // ConnectionException means no connection established
            }
            this.graph.closeTx();
            return null;
        });
    }
    return true;
}
To catch normal terminations (for example, SIGTERM), you can register a JVM shutdown hook that calls the close method; note that shutdown hooks do not run on kill -9 or a sudden power loss:
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    try {
        serverInfoManager.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}));
This shutdown hook executes when the JVM shuts down normally, ensuring that removeSelfServerInfo is called to remove the ServerInfo [1].
@dosu docker run a 'backend=rocksdb ,standalone' cluster
The error "The server with name 'server-1' already in cluster" appearing on the first restart but not the second is likely due to the server.id configuration in the rest-server.properties file. The server.id is set to server-1, and on the first restart the previous instance may not have fully released its registration, causing a conflict. By the second restart, the previous instance is fully terminated, allowing the new instance to start without issues.
To avoid this, ensure that the server is fully stopped before restarting it. You can also change the server.id to a unique value if running multiple instances.
Here is the relevant configuration from rest-server.properties:
server.id=server-1
server.role=master
Additionally, ensure that the restserver.url and gremlinserver.url configurations are consistent with the host and port settings in gremlin-server.yaml to avoid potential conflicts during server startup.
For example, in rest-server.properties:
restserver.url=http://127.0.0.1:8080
gremlinserver.url=http://127.0.0.1:8182
And in gremlin-server.yaml:
host: 127.0.0.1
port: 8182
Moreover, the issue might also be related to how RocksDB handles its sessions and locks. According to the RocksDBStore class in HugeGraph, database sessions are managed and opened using a thread pool. If the previous instance of RocksDB has not fully released its locks or sessions, it can cause the "already in cluster" error. The second restart works because the previous instance has had time to release all resources.
To mitigate this, ensure that the RocksDB instance is properly closed before restarting the server. You can also check the logs for any issues related to session handling or lock availability in RocksDB [1][2][3].
@dosu Can you give an example to simulate this situation
To simulate the situation where restarting a HugeGraph container results in the error "The server with name 'server-1' already in cluster", you can follow these steps:
Initial Setup: Start a HugeGraph container with the default configuration, where server.id is set to server-1.
Restart the Container: Restart the HugeGraph container without changing the server.id.
Here is an example using Docker to simulate this:
# Step 1: Start the HugeGraph container
docker run -d --name hugegraph -p 8080:8080 -p 8182:8182 hugegraph/hugegraph
# Step 2: Restart the HugeGraph container
docker restart hugegraph
The configuration file rest-server.properties contains the following line, which sets the server ID:
server.id=server-1
When the container restarts, it will attempt to rejoin the cluster with the same server.id, leading to the error "The server with name 'server-1' already in cluster".
According to the HugeGraph documentation, the server.id is used for license verification and must be unique within the cluster. This is why restarting the container with the same server.id causes the conflict.
Additionally, you can use docker-compose to manage HugeGraph and related instances. Here is an example docker-compose.yml:
version: '3'
services:
  server:
    image: hugegraph/hugegraph
    container_name: server
    ports:
      - 8080:8080
To start the services, use:
docker-compose up -d
This setup ensures that the server configuration is managed consistently, and any restart will follow the same configuration, potentially leading to the same error if server.id is not unique [1][2][3].
@dosu If the physical machine suddenly goes down, this removeSelfServerInfo will not be called and the serverinfo will stay there forever, right?
Yes, if the physical machine suddenly goes down, the removeSelfServerInfo method will not be called and the server info will remain indefinitely. removeSelfServerInfo is invoked from the close method, which runs only during a graceful shutdown; if the machine goes down unexpectedly, it is never executed.
public synchronized boolean close() {
    this.closed = true;
    if (!this.dbExecutor.isShutdown()) {
        this.removeSelfServerInfo();
        this.call(() -> {
            try {
                this.tx().close();
            } catch (ConnectionException ignored) {
                // ConnectionException means no connection established
            }
            this.graph.closeTx();
            return null;
        });
    }
    return true;
}
The removeSelfServerInfo method is responsible for removing the server info, and it is only called within the close method. Therefore, if the machine goes down suddenly, the server info will not be removed [1].
@dosu If this is the case, how do I clear serverInfo manually?
To clear serverInfo manually when the server goes down unexpectedly, you can use the removeServerInfo method provided in the ServerInfoManager class. This method removes the server information for a given server ID.
Here is how you can do it:
// Assuming you have an instance of ServerInfoManager
ServerInfoManager serverInfoManager = ...;
// The ID of the server you want to remove
Id serverId = ...;
// Remove the server info
serverInfoManager.removeServerInfo(serverId);
The removeServerInfo method handles the removal of the server information from the backend store.
Supporting evidence from the provided code:
private HugeServerInfo removeServerInfo(Id serverId) {
    if (serverId == null) {
        return null;
    }
    LOG.info("Remove server info: {}", serverId);
    return this.call(() -> {
        Iterator<Vertex> vertices = this.tx().queryServerInfos(serverId);
        Vertex vertex = QueryResults.one(vertices);
        if (vertex == null) {
            return null;
        }
        this.tx().removeVertex((HugeVertex) vertex);
        return HugeServerInfo.fromVertex(vertex);
    });
}
This method ensures that the server information is properly removed from the backend store [1].
@dosu If I set server.role=computer, does it not check if serverinfo exists?
If you set server.role=computer, the system will still check whether serverinfo exists. The heartbeat method in the ServerInfoManager class explicitly checks for the existence of serverinfo regardless of the server role:
public synchronized void heartbeat() {
    assert this.graphIsReady();
    HugeServerInfo serverInfo = this.selfServerInfo();
    if (serverInfo != null) {
        // Update heartbeat time for this server
        serverInfo.updateTime(DateUtil.now());
        this.save(serverInfo);
        return;
    }

    /* ServerInfo is missing */
    if (this.selfNodeId() == null) {
        // Ignore if ServerInfo is not initialized
        LOG.info("ServerInfo is missing: {}, may not be initialized yet");
        return;
    }
    if (this.selfIsMaster()) {
        // On master node, just wait for ServerInfo re-init
        LOG.warn("ServerInfo is missing: {}, may be cleared before",
                 this.selfNodeId());
        return;
    }

    /*
     * Missing server info on non-master node, may be caused by graph
     * truncated on master node then synced by raft.
     * TODO: we just patch it here currently, to be improved.
     */
    serverInfo = this.saveServerInfo(this.selfNodeId(), this.selfNodeRole());
    assert serverInfo != null;
}
This method checks whether serverInfo is null and handles each case accordingly, showing that the existence of serverinfo is always checked [1].
@dosu Does it still trigger The server with name 'server-1' already in cluster
Yes, setting server.role=computer does not prevent the error "The server with name 'server-1' already in cluster" from being triggered. The system still checks for the existence of serverinfo regardless of the server role, so if the server info for server-1 already exists in the cluster, you will encounter this error.
To avoid this issue, ensure that each server in the cluster has a unique server.id in the rest-server.properties configuration file:
server.id=unique-server-id
server.role=computer
This will prevent conflicts and ensure that each server can join the cluster without triggering the error [1].
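One way to keep server.id unique per host is to derive it from the machine's hostname; this naming scheme is a suggestion for illustration, not a HugeGraph convention:

```shell
# Derive a host-unique server.id (append an instance suffix if you run
# several servers on the same machine, e.g. server-$(hostname)-1)
SERVER_ID="server-$(hostname)"
echo "server.id=$SERVER_ID"
```

The generated line can then be substituted into each node's rest-server.properties before startup.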
@dosu What is the difference between hugegraph using master-worker pattern and backend=hstore pattern?
The master-worker pattern and the backend=hstore pattern in HugeGraph serve different purposes and are implemented differently. Here are the key differences:
Master-worker pattern:
- Purpose: handles role coordination and task management across the nodes of a cluster.
- Role management: roles are defined by the NodeRole enum (MASTER, WORKER, COMPUTER) and managed through the RoleListener interface and the RoleElectionStateMachine [3].
backend=hstore pattern:
- Purpose: used for data storage and retrieval in HugeGraph; it defines how data is stored, queried, and managed in the Hstore backend.
Each pattern addresses a different aspect of HugeGraph's functionality, with the master-worker pattern handling task management and the backend=hstore pattern handling data storage.
@dosu If disable the master-worker pattern, are there other implementations that will work instead?
Yes, if you disable the master-worker pattern in HugeGraph, there are alternative task scheduling and execution mechanisms available:
1. StandardTaskScheduler with a single-node setup: when the cluster contains only a single node and the task is not a computer task, the task is queued and executed locally:
if (this.serverManager().onlySingleNode() && !task.computer()) {
task.status(TaskStatus.QUEUED);
task.server(this.serverManager().selfNodeId());
this.save(task);
return this.submitTask(task);
}
2. Local task scheduler: with the "local" scheduler type, a StandardTaskScheduler is created directly for the graph:
TaskScheduler scheduler = new StandardTaskScheduler(
graph,
this.taskExecutor,
this.taskDbExecutor,
this.serverInfoDbExecutor
);
this.schedulers.put(graph, scheduler);
3. DistributedTaskScheduler: a scheduler that periodically runs cronSchedule() under a graph lock. Example:
public DistributedTaskScheduler(HugeGraphParams graph,
ScheduledThreadPoolExecutor schedulerExecutor,
ExecutorService taskDbExecutor,
ExecutorService schemaTaskExecutor,
ExecutorService olapTaskExecutor,
ExecutorService gremlinTaskExecutor,
ExecutorService ephemeralTaskExecutor,
ExecutorService serverInfoDbExecutor) {
super(graph, serverInfoDbExecutor);
this.taskDbExecutor = taskDbExecutor;
this.schemaTaskExecutor = schemaTaskExecutor;
this.olapTaskExecutor = olapTaskExecutor;
this.gremlinTaskExecutor = gremlinTaskExecutor;
this.ephemeralTaskExecutor = ephemeralTaskExecutor;
this.schedulerExecutor = schedulerExecutor;
this.closed.set(false);
this.schedulePeriod = this.graph.configuration()
.get(CoreOptions.TASK_SCHEDULE_PERIOD);
this.cronFuture = this.schedulerExecutor.scheduleWithFixedDelay(
() -> {
LockUtil.lock("", LockUtil.GRAPH_LOCK);
try {
this.cronSchedule();
} catch (Throwable t) {
LOG.info("cronScheduler exception graph: {}", this.graphName(), t);
} finally {
LockUtil.unlock("", LockUtil.GRAPH_LOCK);
}
},
10L, schedulePeriod,
TimeUnit.SECONDS);
}
These mechanisms ensure that tasks are handled efficiently even when the master-worker pattern is not in use [1][2][3].
@dosu If I set server.role=computer and use backend=rocksdb, then start a standalone cluster and request
curl -X DELETE "http://localhost:8080/graphs/policy/schema/vertexlabels/person"
it returns "can't schedule task on non-master server". How do I resolve it?
To resolve the error "can't schedule task on non-master server" in a standalone HugeGraph cluster, you need to set the server as the master. This error occurs because tasks can only be scheduled on the master server.
Here's how to fix it: set server.role=master in your rest-server.properties configuration file:
server.role=master
This will allow task scheduling operations to proceed without encountering the error [1].
@dosu
I want to deprecate server.role=master, so I can't set server.role=master. I have to set server.role=computer, then set backend=rocksdb and start a standalone cluster.
The request
curl -X DELETE "http://localhost:8080/graphs/policy/schema/vertexlabels/person"
returns "can't schedule task on non-master server".
How can I fix it by improving the code?
To allow task scheduling on a server with server.role=computer in a standalone HugeGraph cluster using backend=rocksdb, you need to modify the code to permit scheduling on a computer node. Here's how:

1. Modify the task scheduling check: update the logic in the StandardTaskScheduler class to allow scheduling on a computer node. Locate the schedule method and modify the master-node check to include the computer role:
// Check this is on master or computer for normal task schedule
if (!this.serverManager().selfIsMaster() && !this.serverManager().selfIsComputer()) {
throw new HugeException("Can't schedule task on non-master and non-computer server");
}
2. Ensure the role-checking method exists: make sure the ServerInfoManager class has a selfIsComputer() method to verify whether the current server role is computer. Example implementation:
public boolean selfIsComputer() {
return this.selfNodeRole() != null && this.selfNodeRole().computer();
}
By making these changes, the system will allow task scheduling on a server with the computer role, resolving the error you encountered [1][2].
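Pulled out of context, the widened guard can be sketched as a standalone check (ServerRole here is a stand-in for the real NodeRole, and checkSchedulable is a hypothetical helper, not a HugeGraph method):

```java
public class ScheduleGuard {
    enum ServerRole { MASTER, WORKER, COMPUTER }

    // Allow both master and computer nodes to schedule tasks;
    // any other role is rejected, matching the widened check above.
    static void checkSchedulable(ServerRole self) {
        if (self != ServerRole.MASTER && self != ServerRole.COMPUTER) {
            throw new IllegalStateException(
                    "Can't schedule task on non-master and non-computer server");
        }
    }

    public static void main(String[] args) {
        checkSchedulable(ServerRole.COMPUTER); // no longer rejected
        try {
            checkSchedulable(ServerRole.WORKER);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```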
@dosu How do I know if my cluster is standalone in the code?
To determine if your HugeGraph cluster is running in standalone mode, you can check the configuration and code logic related to task scheduling and server roles.
1. Configuration check: in the rest-server.properties file, ensure that server.role is set to a role that supports standalone operation, such as computer or master for a single-node setup.
2. Code logic: in the TaskManager class, standalone mode is typically indicated by the use of StandardTaskScheduler instead of DistributedTaskScheduler. This is determined by the schedulerType in HugeGraphParams:
.case "local":
default: {
TaskScheduler scheduler = new StandardTaskScheduler(
graph,
this.taskExecutor,
this.taskDbExecutor,
this.serverInfoDbExecutor
);
this.schedulers.put(graph, scheduler);
break;
}
3. Single-node flag: the ServerInfoManager class may use a flag like onlySingleNode to indicate a single-node setup, which affects task scheduling.

By checking these configurations and code paths, you can determine if your cluster is running in standalone mode [1].
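A minimal sketch of such a single-node check, assuming the flag is derived from the list of alive servers (the class and field names here are hypothetical, not the actual ServerInfoManager implementation):

```java
import java.util.List;

public class ClusterTopology {
    private final List<String> aliveServers;

    public ClusterTopology(List<String> aliveServers) {
        this.aliveServers = aliveServers;
    }

    // Standalone mode: exactly one alive server is registered,
    // analogous to the onlySingleNode flag mentioned above.
    public boolean onlySingleNode() {
        return this.aliveServers.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(new ClusterTopology(List.of("server-1")).onlySingleNode());
    }
}
```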
@dosu If I set backend=rocksdb, is there definitely only one server node?
Using backend=rocksdb does not inherently limit you to a single server node. However, RocksDB is typically used in standalone setups because it has no built-in support for distributed storage. To run multiple nodes, you would need an additional clustering mechanism, such as Raft-based consensus, which HugeGraph supports [1].
Due to lack of activity, this issue is marked as stale and will be closed after 20 days; any update will remove the stale label.
Problem Type (问题类型)
struct / logic (架构 / 逻辑设计问题)
Before submit
Environment (环境信息)
Your Question (问题描述)
How to disable legacy master-worker mechanism in multi-server
Vertex/Edge example (问题点 / 边数据举例)
No response
Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构)
No response