Closed AndreMaz closed 2 years ago
Reopening, as 0.14.20 does not fully solve the issue.
Example that doesn't work with 0.14.20:
const ServiceBroker = require("../src/service-broker");
const Middlewares = require("../src/middlewares");

const broker1 = new ServiceBroker({
    nodeID: "node-1",
    transporter: "Redis",
    registry: {
        discoverer: "Local"
        // discoverer: "Redis"
        // discoverer: "Etcd3"
    }
});

const broker2 = new ServiceBroker({
    nodeID: "node-2",
    transporter: "Redis",
    registry: {
        discoverer: "Local"
        // discoverer: "Redis",
        // discoverer: "Etcd3"
    }
});

const locationSchema = {
    name: "location",
    // Depends on device.service located at node-2
    dependencies: ["device"],
    async started() {
        this.logger.info("Location Services started");
    }
};

const tenantSchema = {
    name: "tenant",
    actions: {
        add: {
            handler(ctx) {
                return "tenant.add action";
            }
        }
    },
    async started() {
        this.logger.info("Tenant Services started");
    }
};

// Registered under the service name "device"
const assetSchema = {
    name: "device",
    // Depends on tenant.service located at node-1
    dependencies: ["tenant"],
    async started() {
        this.logger.info("Device Services started");
        // tenant has started, but node-1 does NOT accept action calls yet
        // because location@node-1 has not started
        const result = await this.broker.call("tenant.add");
        this.logger.info("RESPONSE FROM TENANT =>", result);
    }
};

// Place location.service and tenant.service at node-1
broker1.createService(locationSchema);
broker1.createService(tenantSchema);

// Place device.service (assetSchema) at node-2
broker2.createService(assetSchema);

Promise.all([broker1.start(), broker2.start()])
    .then(() => {
        broker1.repl();
    })
    .catch(err => console.log(err));
A deadlock might occur when services are deployed in mixed mode and there's a dependency between services scattered across several nodes.
Here's a simple repro example:
Explanation
node-1 has:
location.service which depends on device.service located at node-2
tenant.service which has no dependencies
node-2 has:
device.service which depends on tenant.service located at node-1
Issue
The issue is that node-1 emits the INFO packet only after all of the services it manages have started. Here that never happens, because location.service is waiting for device.service@node-2 to start.
Meanwhile, device.service@node-2 can't start, because node-1 has not emitted an INFO packet announcing that tenant.service has already started and that only location.service is still pending.
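One way to see why this is a deadlock even though the services themselves have no circular dependency is to compare the service-level graph with the node-level one. The sketch below is a toy model, not Moleculer code: `placement`, `dependencies`, `hasCycle` and `nodeWaitGraph` are hypothetical names, and it assumes a node publishes nothing until every local service has started, so that a node effectively waits on any node hosting one of its services' dependencies.

```javascript
// Toy model (not Moleculer code): where each service lives and what it depends on.
const placement = { location: "node-1", tenant: "node-1", device: "node-2" };
const dependencies = { location: ["device"], tenant: [], device: ["tenant"] };

// Depth-first cycle check on a directed graph given as { vertex: [successors] }.
function hasCycle(graph) {
    const state = {}; // undefined = unvisited, 1 = on the current path, 2 = finished
    const visit = v => {
        if (state[v] === 1) return true;  // back edge: cycle found
        if (state[v] === 2) return false; // already explored, no cycle through here
        state[v] = 1;
        const cyclic = (graph[v] || []).some(next => visit(next));
        state[v] = 2;
        return cyclic;
    };
    return Object.keys(graph).some(v => visit(v));
}

// Node-level graph: node A waits for node B if any service on A
// depends on a service hosted on B.
function nodeWaitGraph(placement, dependencies) {
    const graph = {};
    for (const [service, deps] of Object.entries(dependencies)) {
        const from = placement[service];
        graph[from] = graph[from] || [];
        for (const dep of deps) {
            const to = placement[dep];
            if (to !== from && !graph[from].includes(to)) graph[from].push(to);
        }
    }
    return graph;
}

console.log(hasCycle(dependencies));                            // false
console.log(hasCycle(nodeWaitGraph(placement, dependencies)));  // true
```

The service graph (location -> device -> tenant) is acyclic, but once readiness is announced per node rather than per service, node-1 and node-2 end up waiting on each other.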
Possible solution
The service broker needs to broadcast INFO packets even while some of its services haven't started yet.
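The effect of such a change can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the broker's actual implementation: `simulateStartup` and the policy names `"afterAll"` and `"perService"` are made up for illustration, and a dependency counts as satisfied when it has either started on the same node or been announced to the other nodes.

```javascript
// Toy model (not Moleculer code): nodeID -> { serviceName: [dependencies] }
const nodes = {
    "node-1": { location: ["device"], tenant: [] },
    "node-2": { device: ["tenant"] }
};

// policy "afterAll":   a node announces its services only once all of them have started
// policy "perService": a node announces each service as soon as it has started
function simulateStartup(nodes, policy) {
    const started = new Map();   // serviceName -> nodeID where it has started
    const announced = new Set(); // services known to remote nodes (via an INFO-like message)

    let progress = true;
    while (progress) {
        progress = false;
        for (const [nodeID, services] of Object.entries(nodes)) {
            for (const [name, deps] of Object.entries(services)) {
                if (started.has(name)) continue;
                // A dependency is satisfied if it started locally or was announced.
                const ready = deps.every(
                    dep => started.get(dep) === nodeID || announced.has(dep)
                );
                if (!ready) continue;
                started.set(name, nodeID);
                progress = true;
                if (policy === "perService") announced.add(name);
            }
            const names = Object.keys(services);
            if (policy === "afterAll" && names.every(s => started.has(s))) {
                names.forEach(s => announced.add(s));
            }
        }
    }

    const all = Object.values(nodes).flatMap(services => Object.keys(services));
    return {
        started: [...started.keys()],
        pending: all.filter(s => !started.has(s))
    };
}

console.log(simulateStartup(nodes, "afterAll"));
// { started: [ 'tenant' ], pending: [ 'location', 'device' ] }
console.log(simulateStartup(nodes, "perService"));
// { started: [ 'tenant', 'device', 'location' ], pending: [] }
```

With the "afterAll" policy the model stalls after tenant starts, matching the behaviour described in this issue. With "perService", tenant is announced immediately, device can start, and location follows.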