kataras / go-websocket

:speaker: Deprecated. Use https://github.com/kataras/neffos instead
MIT License

Websocket hangs when calling iris.WebsocketConnection.Leave() on Disconnect (retested) #24

Closed antlaw closed 7 years ago

antlaw commented 7 years ago

platform: Windows 10

I retested the problem named in the issue title with the following program and found that it still exists in the current release and also in v0.0.3. The demo code below actually illustrates two problems. Here is a brief description of how to reproduce them:

  1. The source code below is a single program that can run as either client or server.
  2. To run it as the server, run "go run main.go server". The server accepts connections and handles requests. Here is a sample of the console output from the server:

        C:\Users\User\Projects\Go\src\gp2\src\gp2\wsserver>go run main.go server
        |_   _|    (_)
          | |  ____ _  ___
          | | | __|| |/ __|
         _| |_| |  | |\__ \
        |_____|_|  |_||___/ 4

        Mon, 23 Jan 2017 23:36:22 GMT: Running at 0.0.0.0:9090
        socket.OnConnect()
        OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
        OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
        OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
        OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
        OnDisconnect(): client disconnected!
        socket.OnConnect()
        OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
        OnDisconnect(): client disconnected!
        socket.OnConnect()
        OnJoin() time taken: 0s
        OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
        OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
        OnDisconnect(): client disconnected!

  3. To run it as the client, run "go run main.go client". It connects to the server and keeps sending and receiving messages. You can disconnect it after about 10 seconds by killing the process. The client prints something like this:

        C:\Users\User\Projects\Go\src\gp2\src\gp2\wsserver>go run main.go client
        objectupdate 1
        2017-01-23 23:37:17.9047114 +0800 CST Received: go-websocket-message:objectupdate;0;2.UsrSchedule_v1_1.1
        objectupdate 2
        2017-01-23 23:37:19.9070407 +0800 CST Received: go-websocket-message:objectupdate;0;2.UsrSchedule_v1_1.2
        exit status 2

  4. If you repeat step 3, everything is fine: the server handles the new client connection without problems, because OnDisconnect does not call c.Leave().

  5. Change line 163 from "// c.Leave("server2")" to "c.Leave("server2")" and run the server again. This time the server handles requests only for the first client. After that client disconnects, the server still accepts incoming connections but no longer responds to messages.

  6. Apart from the problem above, I also noticed that the server does not respond to a message sent immediately after the connection is established. Take a look at the following function in main.go. If you comment out line 104, "time.Sleep(time.Second)", the server does not respond to the "join" request. Messages after that are handled without problems.

    
    func ClientLoop() {
        for {
            time.Sleep(time.Second)
            err := ConnectWebSocket()
            if err != nil {
                fmt.Println("failed to connect websocket", err.Error())
                continue
            }
            time.Sleep(time.Second)
            err = SendMessage("2", "all", "join", "dummy2")
            go sendUntilErr(2)
            recvUntilErr()
            err = CloseWebSocket()
            if err != nil {
                fmt.Println("failed to close websocket", err.Error())
            }
        }
    }



```go
// main.go: the testing program. This is line 1
package main

import (
    "fmt"
    "os"
    "strings"
    "time"

    "github.com/kataras/iris"

    "golang.org/x/net/websocket"
)

// WS is the current websocket connection
var WS *websocket.Conn

func main() {
    if len(os.Args) == 2 && strings.ToLower(os.Args[1]) == "server" {
        ServerLoop()
    } else if len(os.Args) == 2 && strings.ToLower(os.Args[1]) == "client" {
        ClientLoop()
    } else {
        fmt.Println("wsserver [server|client]")
    }
}

/////////////////////////////////////////////////////////////////////////
// client side
func sendUntilErr(sendInterval int) {
    i := 1
    for {
        time.Sleep(time.Duration(sendInterval) * time.Second)
        err := SendMessage("2", "all", "objectupdate", "2.UsrSchedule_v1_1")
        if err != nil {
            fmt.Println("failed to send join message", err.Error())
            return
        }
        fmt.Println("objectupdate", i)
        i++
    }
}

func recvUntilErr() {
    var msg = make([]byte, 2048)
    var n int
    var err error
    i := 1
    for {
        if n, err = WS.Read(msg); err != nil {
            fmt.Println(err.Error())
            return
        }
        fmt.Printf("%v Received: %s.%v\n", time.Now(), string(msg[:n]), i)
        i++
    }

}

//ConnectWebSocket connect a websocket to host
func ConnectWebSocket() error {
    var origin = "http://localhost/"
    var url = "ws://localhost:9090/socket"
    var err error
    WS, err = websocket.Dial(url, "", origin)
    return err
}

// CloseWebSocket closes the current websocket connection
func CloseWebSocket() error {
    if WS != nil {
        return WS.Close()
    }
    return nil
}

// SendMessage broadcast a message to server
func SendMessage(serverID, to, method, message string) error {
    buffer := []byte(message)
    return SendtBytes(serverID, to, method, buffer)
}

// SendtBytes broadcast a message to server
func SendtBytes(serverID, to, method string, message []byte) error {
    buffer := []byte(fmt.Sprintf("go-websocket-message:%v;0;%v;%v;", method, serverID, to))
    buffer = append(buffer, message...)
    _, err := WS.Write(buffer)
    if err != nil {
        fmt.Println(err)
        return err
    }
    return nil
}

// ClientLoop connects to websocket server, the keep send and recv dataS
func ClientLoop() {
    for {
        time.Sleep(time.Second)
        err := ConnectWebSocket()
        if err != nil {
            fmt.Println("failed to connect websocket", err.Error())
            continue
        }
        time.Sleep(time.Second)
        err = SendMessage("2", "all", "join", "dummy2")
        go sendUntilErr(2)
        recvUntilErr()
        err = CloseWebSocket()
        if err != nil {
            fmt.Println("failed to close websocket", err.Error())
        }
    }

}

/////////////////////////////////////////////////////////////////////////
// server side

// OnConnect handles incoming websocket connection
func OnConnect(c iris.WebsocketConnection) {
    fmt.Println("socket.OnConnect()")
    c.On("join", func(message string) { OnJoin(message, c) })
    c.On("objectupdate", func(message string) { OnObjectUpdated(message, c) })
    c.OnDisconnect(func() { OnDisconnect(c) })

}

// ServerLoop listen and serve websocket requests
func ServerLoop() {
    // // the path which the websocket client should listen/registed to ->
    iris.Config.Websocket.Endpoint = "/socket"
    iris.Websocket.OnConnection(OnConnect)
    iris.Listen("0.0.0.0:9090")

}

// OnJoin handles Join broadcast group request
func OnJoin(message string, c iris.WebsocketConnection) {
    t := time.Now()
    c.Join("server2")
    fmt.Println("OnJoin() time taken:", time.Since(t))
}

// OnObjectUpdated broadcasts to all client an incoming message
func OnObjectUpdated(message string, c iris.WebsocketConnection) {
    t := time.Now()
    s := strings.Split(message, ";")
    if len(s) != 3 {
        fmt.Println("OnObjectUpdated() invalid message format:" + message)
        return
    }
    serverID, _, objectID := s[0], s[1], s[2]
    err := c.To("server"+serverID).Emit("objectupdate", objectID)
    if err != nil {
        fmt.Println(err, "failed to broacast object")
        return
    }
    fmt.Println(fmt.Sprintf("OnObjectUpdated() message:%v, time taken: %v", message, time.Since(t)))
}

// OnDisconnect clean up things when a client is disconnected
func OnDisconnect(c iris.WebsocketConnection) {
    // c.Leave("server2")
    fmt.Println("OnDisconnect(): client disconnected!")

}
```

ghost commented 7 years ago

As we already noted, c.OnDisconnect is meant for releasing external resources (for example, a list of connected clients), not for calling the connection's own methods such as c.Leave, because a connection automatically leaves all of its rooms when it disconnects. By the way, I tested c.Leave inside c.OnDisconnect (with a browser as the client) and it seems to work: we were still able to send messages and bring more clients online. So I think "hanging" is the wrong term for c.Leave inside c.OnDisconnect, but I'm open to looking into it further if you can share the code that you're using (the current code doesn't show me anything and I can't reproduce your issue).

BUT I will investigate this, and I will probably change the way messages and events are passed internally across the websocket server. I'll notify you when you can re-run your code to verify the results :)
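To illustrate the kind of "external resource" an OnDisconnect callback is meant to release, here is a minimal sketch of an application-level client list. The `Registry` type and its methods are hypothetical names for this example and are not part of iris or go-websocket; the idea is only that the connect handler would call `Add` and the disconnect handler would call `Remove`.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical application-level list of connected clients.
// It is NOT part of iris or kataras/go-websocket.
type Registry struct {
	mu      sync.Mutex
	clients map[string]struct{}
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{clients: make(map[string]struct{})}
}

// Add records a client; a connect handler would call this.
func (r *Registry) Add(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.clients[id] = struct{}{}
}

// Remove forgets a client; a disconnect handler would call this.
// Removing an unknown id is a no-op, so calling it twice is safe.
func (r *Registry) Remove(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.clients, id)
}

// Count reports how many clients are currently registered.
func (r *Registry) Count() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.clients)
}

func main() {
	reg := NewRegistry()
	reg.Add("client-1")
	reg.Add("client-2")
	reg.Remove("client-1") // what an OnDisconnect callback would do
	fmt.Println("connected clients:", reg.Count())
}
```

The mutex matters because connect and disconnect callbacks may run on different goroutines.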

antlaw commented 7 years ago

Thanks for your prompt reply! I wrote the source code above as a simple tester to isolate the two problems without any other library. (I just don't know why the tab formatting doesn't work here.) I can reproduce the problem with this source on Windows 10. That said, the problems may not appear on every platform, especially the one that causes message loss.

ghost commented 7 years ago

@antlaw On GitHub you have to follow the Markdown syntax. I am assuming that you are a fan of Stack Overflow, right? :)

I converted the code on my computer, but I am not pushing it to the repository yet, because I'm thinking that new Iris versions should just use socket.io instead of kataras/go-websocket, which is a websocket-only library. I created this library because in the 'old' days (Iris v2 through Iris v5) nobody else had written a websocket library that supported fasthttp. Now that Iris is based on net/http, why should I keep 're-creating the world' when other libraries for websockets (and other communication protocols) already exist and work just fine? (I have already reviewed Go's server-side socket.io library, and I approve of its usage and its code.)

antlaw commented 7 years ago

Pausing the development of go-websocket is just fine, because I agree with you that there is no point in reinventing the wheel. What concerns me, from a user's point of view, is how to make socket.io work with Iris. If there is an example showing a server that uses Iris to serve HTTP and socket.io for websockets at the same time, that should be enough for us to migrate the existing code.

(I just updated my comment above to format the source code.)

ghost commented 7 years ago

@antlaw You don't need a special example of using socket.io with the new Iris version (although I will add one to iris-contrib/examples too). Their websocket server just exports an http.Handler (as kataras/go-websocket does), which can be registered using iris.ToHandler like any other existing external net/http middleware. Look:

    package main

    import (
        "log"

        "github.com/googollee/go-socket.io" // the socket.io lib
        "github.com/kataras/iris"
    )

    func main() {
        server, err := socketio.NewServer(nil)
        if err != nil {
            log.Fatal(err)
        }
        server.On("connection", func(so socketio.Socket) {
            log.Println("on connection")
            so.Join("chat")
            so.On("chat message", func(msg string) {
                log.Println("emit:", so.Emit("chat message", msg))
                so.BroadcastTo("chat", "chat message", msg)
            })
            so.On("disconnection", func() {
                log.Println("on disconnect")
            })
        })
        server.On("error", func(so socketio.Socket, err error) {
            log.Println("error:", err)
        })

        // register the handler to the endpoint:
        // convert the websocket server (which is an http.Handler) to an iris.Handler
        // using the iris.ToHandler helper func.
        iris.Any("/my_endpoint", iris.ToHandler(server))

        iris.Listen(":8080")
    }

Test that and tell me the results. The client side is the socket.io library you already know. For more info, see https://github.com/googollee/go-socket.io .

I pushed the new changes to this repository. Could you please check whether your issue has been solved? (Use the latest go-websocket with Iris v6.1.2: go get -u github.com/kataras/iris) I can't stop the development until your issue is fixed :) The library is good and fast for websocket-only usage, and before 'pausing' the development I want to be sure that it 'just works' with its current features.

antlaw commented 7 years ago

Sure. Tested both problems. Here is the result:

  1. The problem with c.Leave() is fixed. That means the following works without problems.

        // OnDisconnect cleans up when a client is disconnected
        func OnDisconnect(c iris.WebsocketConnection) {
            c.Leave("server2")
            fmt.Println("OnDisconnect(): client disconnected!")
        }

  2. If the client sends a message to the server immediately after connecting, the server does not handle it (same as before).

One more thing: if I use Iris 5, problem 1 is still there.

ghost commented 7 years ago

I just finished the CORS middleware and a CORS router plugin, sorry for the late answer.

Yes, Iris v5 uses the kataras/go-websocket 0.0.3 tag, so you should update using: go get -u gopkg.in/kataras/go-websocket.v0

But I don't understand point 2. If the client sends a message to the server after the connection is closed, what do you expect to happen?

antlaw commented 7 years ago

I expect every message sent from the client to be received by the server, as long as there is no network problem. In the code below, the message sent by the line "err = SendMessage("2", "all", "join", "dummy2")" is not received by the server unless I uncomment the line "//time.Sleep(time.Second)".


// ClientLoop connects to the websocket server, then keeps sending and receiving data
func ClientLoop() {
    for {
        time.Sleep(time.Second)
        err := ConnectWebSocket()
        if err != nil {
            fmt.Println("failed to connect websocket", err.Error())
            continue
        }
        //time.Sleep(time.Second)
        err = SendMessage("2", "all", "join", "dummy2")
        go sendUntilErr(2)
        recvUntilErr()
        err = CloseWebSocket()
        if err != nil {
            fmt.Println("failed to close websocket", err.Error())
        }
    }

}

ghost commented 7 years ago

Here you're sending the message and then closing the connection immediately. Are you sure that this is a server-side issue and not a client-side one? We don't have a Go client-side library specifically for kataras/go-websocket, so I am assuming that you're using gorilla's?

antlaw commented 7 years ago

The function recvUntilErr() keeps receiving messages until disconnection. For the source of the whole test program, refer to the first message in this thread. Here is recvUntilErr():


func recvUntilErr() {
    var msg = make([]byte, 2048)
    var n int
    var err error
    i := 1
    for {
        if n, err = WS.Read(msg); err != nil {
            fmt.Println(err.Error())
            return
        }
        fmt.Printf("%v Received: %s.%v\n", time.Now(), string(msg[:n]), i)
        i++
    }

}

ghost commented 7 years ago

Yes, but we don't have a WS.Read function anywhere. I can see from the previous comment that you use "golang.org/x/net/websocket", but why? go-websocket is based on gorilla's implementation, not x/net/websocket's. I don't know the implementation of x/net/websocket, but I have read a lot of comments saying that it has many bugs and is not recommended, so maybe this is not a kataras/go-websocket issue.

I can't be sure what's going on if I don't have something to investigate and test.

Could you please create a private or public GitHub repository with an example (gorilla/websocket on the client side, kataras/go-websocket on the server side) that reproduces the issue, because I wasn't able to reproduce it, and share the link with me so that we have a shared code base for the problem? I promise that the issue will be solved very quickly this way. Thanks for your time!

antlaw commented 7 years ago

Yes, I cannot be 100% sure the problem is in the server, but my original client is written in C#, and that is where I first encountered the problem. I tried a couple of things and found that if the client waits 500 ms or so before sending the first message, the problem no longer occurs. Then I wrote the tester program (source code below) and found that a simple Go client hits the same problem. Therefore, I suspect that the problem is in the server.

I am not familiar with posting to a public GitHub repository, so I'd rather post it here. You can see that the client uses "golang.org/x/net/websocket".

Running the program is easy:

  1. go run main.go server // it will listen and process incoming request.
  2. go run main.go client // it will connect the server, join a broadcast group, and keep sending/receiving data.

package main

import (
    "fmt"
    "os"
    "strings"
    "time"

    "github.com/kataras/iris"

    "golang.org/x/net/websocket"
)

// WS is the current websocket connection
var WS *websocket.Conn

func main() {
    if len(os.Args) == 2 && strings.ToLower(os.Args[1]) == "server" {
        ServerLoop()
    } else if len(os.Args) == 2 && strings.ToLower(os.Args[1]) == "client" {
        ClientLoop()
    } else {
        fmt.Println("wsserver [server|client]")
    }
}

/////////////////////////////////////////////////////////////////////////
// client side
func sendUntilErr(sendInterval int) {
    i := 1
    for {
        time.Sleep(time.Duration(sendInterval) * time.Second)
        err := SendMessage("2", "all", "objectupdate", "2.UsrSchedule_v1_1")
        if err != nil {
            fmt.Println("failed to send join message", err.Error())
            return
        }
        fmt.Println("objectupdate", i)
        i++
    }
}

func recvUntilErr() {
    var msg = make([]byte, 2048)
    var n int
    var err error
    i := 1
    for {
        if n, err = WS.Read(msg); err != nil {
            fmt.Println(err.Error())
            return
        }
        fmt.Printf("%v Received: %s.%v\n", time.Now(), string(msg[:n]), i)
        i++
    }

}

//ConnectWebSocket connect a websocket to host
func ConnectWebSocket() error {
    var origin = "http://localhost/"
    var url = "ws://localhost:9090/socket"
    var err error
    WS, err = websocket.Dial(url, "", origin)
    return err
}

// CloseWebSocket closes the current websocket connection
func CloseWebSocket() error {
    if WS != nil {
        return WS.Close()
    }
    return nil
}

// SendMessage broadcast a message to server
func SendMessage(serverID, to, method, message string) error {
    buffer := []byte(message)
    return SendtBytes(serverID, to, method, buffer)
}

// SendtBytes broadcast a message to server
func SendtBytes(serverID, to, method string, message []byte) error {
    buffer := []byte(fmt.Sprintf("go-websocket-message:%v;0;%v;%v;", method, serverID, to))
    buffer = append(buffer, message...)
    _, err := WS.Write(buffer)
    if err != nil {
        fmt.Println(err)
        return err
    }
    return nil
}

// ClientLoop connects to websocket server, the keep send and recv dataS
func ClientLoop() {
    for {
        time.Sleep(time.Second)
        err := ConnectWebSocket()
        if err != nil {
            fmt.Println("failed to connect websocket", err.Error())
            continue
        }
        time.Sleep(time.Second)
        err = SendMessage("2", "all", "join", "dummy2")
        go sendUntilErr(2)
        recvUntilErr()
        err = CloseWebSocket()
        if err != nil {
            fmt.Println("failed to close websocket", err.Error())
        }
    }

}

/////////////////////////////////////////////////////////////////////////
// server side

// OnConnect handles incoming websocket connection
func OnConnect(c iris.WebsocketConnection) {
    fmt.Println("socket.OnConnect()")
    c.On("join", func(message string) { OnJoin(message, c) })
    c.On("objectupdate", func(message string) { OnObjectUpdated(message, c) })
    c.OnDisconnect(func() { OnDisconnect(c) })

}

// ServerLoop listen and serve websocket requests
func ServerLoop() {
    // // the path which the websocket client should listen/registed to ->
    iris.Config.Websocket.Endpoint = "/socket"
    iris.Websocket.OnConnection(OnConnect)
    iris.Listen("0.0.0.0:9090")

}

// OnJoin handles Join broadcast group request
func OnJoin(message string, c iris.WebsocketConnection) {
    t := time.Now()
    c.Join("server2")
    fmt.Println("OnJoin() time taken:", time.Since(t))
}

// OnObjectUpdated broadcasts to all client an incoming message
func OnObjectUpdated(message string, c iris.WebsocketConnection) {
    t := time.Now()
    s := strings.Split(message, ";")
    if len(s) != 3 {
        fmt.Println("OnObjectUpdated() invalid message format:" + message)
        return
    }
    serverID, _, objectID := s[0], s[1], s[2]
    err := c.To("server"+serverID).Emit("objectupdate", objectID)
    if err != nil {
        fmt.Println(err, "failed to broacast object")
        return
    }
    fmt.Println(fmt.Sprintf("OnObjectUpdated() message:%v, time taken: %v", message, time.Since(t)))
}

// OnDisconnect clean up things when a client is disconnected
func OnDisconnect(c iris.WebsocketConnection) {
    c.Leave("server2")
    fmt.Println("OnDisconnect(): client disconnected!")

}

ghost commented 7 years ago

OK, this is very helpful. I'll start looking into what's going on. In the meantime, please stay online because I may ask you some questions. Thanks for sharing!

ghost commented 7 years ago

It's running well. Can you tell me which code you comment out to make it fail? I tried commenting out these:

    // THIS:    time.Sleep(time.Second)
        err := ConnectWebSocket()
        if err != nil {
            fmt.Println("failed to connect websocket", err.Error())
            continue
        }
        // AND THIS time.Sleep(time.Second)

and

    i := 1
    for {
        // THIS time.Sleep(time.Duration(sendInterval) * time.Second)

and the app kept running and sending messages as expected. Maybe I didn't understand the issue. Could you explain it with comments in the code, please?

antlaw commented 7 years ago

When the client connects to the server, you can see that the server outputs something like this:

    ...
    Sat, 28 Jan 2017 01:21:21 GMT: Running at 0.0.0.0:9090
    socket.OnConnect()
    OnJoin() time taken: 0s
    OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
    OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
    OnObjectUpdated() message:2;all;2.UsrSchedule_v1_1, time taken: 0s
    OnDisconnect(): client disconnected!

If you see the line "OnJoin() time taken...", the server received the first message from the client. The "OnObjectUpdated() message:2..." lines are the subsequent messages received.

OK, when you comment out the second time.Sleep() in ClientLoop() and run the client again, the server console no longer prints "OnJoin() time taken...", yet the server still receives the subsequent messages. That means the first message is lost.


// ClientLoop connects to the websocket server, then keeps sending and receiving data
func ClientLoop() {
    for {
        // since we are in an infinite loop, wait a while here to avoid 100% CPU usage in case the server is not running
        time.Sleep(time.Second)
        err := ConnectWebSocket()
        if err != nil {
            fmt.Println("failed to connect websocket", err.Error())
            continue
        }
        // commenting out the line below triggers the bug
        //time.Sleep(time.Second)
        // this is the first message sent to the server
        err = SendMessage("2", "all", "join", "dummy2")
        // this goroutine keeps sending messages to the server until disconnection
        go sendUntilErr(2)
        recvUntilErr()
        err = CloseWebSocket()
        if err != nil {
            fmt.Println("failed to close websocket", err.Error())
        }
    }

}

ghost commented 7 years ago

Yes... I can see it now. So the join (which is also just a custom websocket message) is injected by another writer (the emitter to all), right?

antlaw commented 7 years ago

There are only two kinds of messages:

  1. Join request. It tells the server that the client wants to join a broadcast group.
  2. Object update. It tells the server "I have an updated object". The server broadcasts this message to the broadcast group. Therefore, once the client sends an object update, it receives the same message back as a broadcast from the server.

I guess you can now see the logic behind it: if the join request is somehow lost, the client never receives broadcasts from the server. On the client side, you then only see that it keeps sending messages to the server but receives nothing, because it failed to join the broadcast group.
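The two message kinds can be told apart by the event name in the frame the test client writes. The sketch below parses that frame under one assumption: the layout `go-websocket-message:<event>;<type>;<payload>` is inferred from `SendtBytes` and the server log in this thread, not taken from the library's documentation. `ParseFrame` and `ParseObjectUpdate` are hypothetical helper names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// prefix is the marker the test client puts in front of every message.
const prefix = "go-websocket-message:"

// Frame holds the event name and the payload handed to the event handler.
type Frame struct {
	Event   string
	Payload string
}

// ParseFrame splits "go-websocket-message:<event>;<type>;<payload>".
// The layout is an assumption inferred from the test program in this thread.
func ParseFrame(raw string) (Frame, error) {
	if !strings.HasPrefix(raw, prefix) {
		return Frame{}, errors.New("missing message prefix")
	}
	parts := strings.SplitN(strings.TrimPrefix(raw, prefix), ";", 3)
	if len(parts) != 3 {
		return Frame{}, errors.New("malformed frame")
	}
	return Frame{Event: parts[0], Payload: parts[2]}, nil
}

// ParseObjectUpdate splits a payload of the form "serverID;to;objectID",
// mirroring the check done in OnObjectUpdated.
func ParseObjectUpdate(payload string) (serverID, to, objectID string, err error) {
	s := strings.Split(payload, ";")
	if len(s) != 3 {
		return "", "", "", errors.New("invalid object update payload")
	}
	return s[0], s[1], s[2], nil
}

func main() {
	f, err := ParseFrame("go-websocket-message:objectupdate;0;2;all;2.UsrSchedule_v1_1")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	serverID, to, objectID, _ := ParseObjectUpdate(f.Payload)
	fmt.Println(f.Event, serverID, to, objectID)
}
```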

ghost commented 7 years ago

OK, finally. It should be fixed now. The issue was the ping handler: I now set it and changed the order of the events (both "On" and "Emit" work as before, without issue).

I tested it with 800 clients starting at exactly the same time. I also fixed a bug in the writer.

Please verify that it works as expected :)

antlaw commented 7 years ago

Thanks very much for your prompt response. Yes, I reran the test program and can confirm that both problems are fixed.

By the way, do you know how many clients a websocket server can handle concurrently? I understand that this is not a good question, as it depends on the hardware configuration of the server and also on the message rate. Put it this way: how many concurrent clients do you think it can handle? (I read a blog whose author says a single server was tested with 600K concurrent clients at 1 message/sec, and it used about 64 GB of RAM.)

ghost commented 7 years ago

Thank you for your report. You really helped me solve this; it was bigger than I originally thought!!!

As for your question, do not trust blogs about these things, because it depends on many variables in your app. There is not much of a software limit on that in kataras/go-websocket and Iris, but if you use channels in your app, it depends on the capacity of your buffers. Most Go (websocket + HTTP) apps can handle a very large number of connections without doing anything special; Go is really fast in that area compared to other solutions (I think this is the answer to your question). If you reach the point where your server can't handle more and it leaks, you have two options: scale horizontally (more machines) or vertically (same machine but better hardware). And if you think that your code is the problem, then look into it. It depends; I can't really help you without a real-world example of your live app :/
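As a small illustration of the point about channel buffer capacity, here is a sketch of a non-blocking send on a buffered channel. `trySend` is a hypothetical helper and the capacity of 2 is arbitrary; it only shows that once a buffer is full, the sender must either block or drop the message.

```go
package main

import "fmt"

// trySend queues msg on out without blocking.
// It returns false when the buffer is already full, so the caller can
// decide whether to drop the message or disconnect the slow client.
func trySend(out chan<- string, msg string) bool {
	select {
	case out <- msg:
		return true
	default:
		return false
	}
}

func main() {
	out := make(chan string, 2) // capacity 2 is an arbitrary choice for the demo
	for i := 1; i <= 3; i++ {
		ok := trySend(out, fmt.Sprintf("msg %d", i))
		fmt.Println("message", i, "queued:", ok)
	}
}
```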

antlaw commented 7 years ago

You're welcome. That's one way I can contribute to the Iris community. If you find the test program useful as an example, please feel free to add it to the Iris examples.

Correct me if I am wrong: websocket + HTTP is horizontally scalable if the solution does not require broadcast capability. Imagine that we are going to create a real-time map in a game, where a user's real-time activity is broadcast to nearby users. If we have multiple websocket servers that are not connected to each other, we will hit cases where an update cannot be broadcast to all nearby users, because they are served by different websocket servers. To solve the problem, I can only think of a two-layer architecture: create a parent websocket server and connect all the others to it. Every broadcast message is passed to the parent server, which then broadcasts it to all the other websocket servers. Such a solution is workable, but not elegant enough.
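The two-layer idea can be sketched in memory, without any networking. `Parent` and `Child` are hypothetical types for this illustration; a real setup would replace the slices with websocket connections between servers and would need locking for concurrent use.

```go
package main

import "fmt"

// Child stands in for one websocket server.
// Inbox collects what its locally connected clients would receive.
type Child struct {
	Name  string
	Inbox []string
}

// Parent relays every broadcast to all attached children.
type Parent struct {
	children []*Child
}

// Attach connects a child server to the parent.
func (p *Parent) Attach(c *Child) {
	p.children = append(p.children, c)
}

// Broadcast forwards msg to every child, including the one it came from.
func (p *Parent) Broadcast(msg string) {
	for _, c := range p.children {
		c.Inbox = append(c.Inbox, msg)
	}
}

func main() {
	parent := &Parent{}
	a := &Child{Name: "ws-a"}
	b := &Child{Name: "ws-b"}
	parent.Attach(a)
	parent.Attach(b)

	// A client on ws-a sends an update; ws-a passes it up to the parent.
	parent.Broadcast("player 7 moved")

	fmt.Println(a.Name, a.Inbox)
	fmt.Println(b.Name, b.Inbox)
}
```

Every message crosses the parent, which is why the parent becomes the bottleneck in this design.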

ghost commented 7 years ago

> If you find that the test program is useful as an example, please feel free to put it as one of the IRIS example.

I already did that on my local computer (:P). I'll commit the changes to the iris-contrib/examples repo too, with your name on it, but you can open a PR yourself if you have time.

> create a parent Websocket server and all others are connected to it.

That's one solution, but you don't gain much, because that server will carry the same traffic and processing if it contains the full logic. For something like this to work, it should have minimal code and return fast. So I think a better solution is a load-balancer websocket server that 'redirects' the user to one of the n (or the 'nearest') websocket servers and lets them do the job. That way this websocket server connects the user to the rest of the websocket servers, not the opposite. Correct?

antlaw commented 7 years ago

> a load balancer websocket server which will 'redirect' the user to one of the n (or the 'nearest' ) websocket servers and let them to do the job

Here you may be assuming that broadcast messages are sent only to a subset of users, not to all of them. The load balancer needs logic to determine which websocket server to redirect to (the nearest one). We also need logic that allows certain messages to be broadcast to everyone. Let's look at a use case:

In a multiplayer online game, each player is located in a region of a map. Players can move from one region to another. Things that happen in a region have to be broadcast to every player in that region. Also, messages from the game master have to be broadcast to every player.

In the above case, we actually have two challenges:

  1. How do we arrange player connections so that players in the same "region" are together? The load balancer needs logic to determine which websocket server a player should connect to. Moreover, if a player moves from one region to another, there should be logic to disconnect from the old websocket server and connect to the new one. Most likely, the Iris user has to implement both pieces of logic himself, as the load balancer does not have the knowledge to do so.

  2. For system messages that have to be broadcast to every player, every websocket server has to do the job.

The above use case is quite common. Say, in a stock market, an end user needs to receive real-time quotes for a particular stock symbol (say, NASDAQ:AAPL), which is a logical partition like the region mentioned above. At the same time, every end user needs to receive market index updates (say, NASDAQ and DOW changes), which are broadcast-to-all messages.

If the load-balancer solution exposes interfaces that allow the programmer to implement the two pieces of logic above, then it works nicely.
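The two pieces of logic (per-region delivery and broadcast-to-all) can be sketched with an in-memory hub. `Hub`, `Move`, `BroadcastRegion` and `BroadcastAll` are hypothetical names for this illustration, and a map stands in for real websocket connections; this is not an Iris or go-websocket API, and it is not safe for concurrent use as written.

```go
package main

import "fmt"

// Hub tracks which region each player is in and what each player received.
type Hub struct {
	regionOf map[string]string   // player -> current region
	inbox    map[string][]string // player -> delivered messages
}

// NewHub returns an empty hub.
func NewHub() *Hub {
	return &Hub{regionOf: map[string]string{}, inbox: map[string][]string{}}
}

// Move places a player in a region; it covers both joining and changing region.
func (h *Hub) Move(player, region string) {
	h.regionOf[player] = region
}

// BroadcastRegion delivers msg to every player in region and returns the count.
func (h *Hub) BroadcastRegion(region, msg string) int {
	n := 0
	for p, r := range h.regionOf {
		if r == region {
			h.inbox[p] = append(h.inbox[p], msg)
			n++
		}
	}
	return n
}

// BroadcastAll delivers msg to every known player (e.g. game master messages).
func (h *Hub) BroadcastAll(msg string) int {
	for p := range h.regionOf {
		h.inbox[p] = append(h.inbox[p], msg)
	}
	return len(h.regionOf)
}

func main() {
	h := NewHub()
	h.Move("alice", "north")
	h.Move("bob", "south")
	fmt.Println("north recipients:", h.BroadcastRegion("north", "monster spawned"))
	h.Move("bob", "north") // bob walks into alice's region
	fmt.Println("north recipients:", h.BroadcastRegion("north", "monster defeated"))
	fmt.Println("all recipients:", h.BroadcastAll("server restart in 5 minutes"))
}
```

In a multi-server deployment, the region would decide which websocket server a player connects to, and `BroadcastAll` would have to be fanned out to every server.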