yawaramin / re-web

Experimental web framework for ReasonML & OCaml
https://yawaramin.github.io/re-web/re-web/index.html
MIT License

How does the cache work? #22

Closed: tcoopman closed this issue 4 years ago

tcoopman commented 4 years ago

I'm trying to create something similar to Topic, but holding different information. Basically I want to create a cache from key -> t that also cleans up entries when there are no more users of a key. I thought that was what Cache is supposed to do.

Either I'm misunderstanding something or I'm doing something wrong, because I'm seeing behaviour that sometimes works and sometimes doesn't.

This is a summary of the relevant code:

module Cache = Cache.Ephemeral (struct
  type t = key
  let equal (Key k1) (Key k2) = k1 = k2
  let hash = Hashtbl.seeded_hash
end)

let create cache topic_name push =
  let%lwt value = Cache.find_opt cache ~key:topic_name in
  let%lwt topic =
    match value with
    | None ->
        Stdlib.print_endline "CREATING A TOPIC" ;
        let topic = ReWeb.Topic.make () in
        let%lwt () = Cache.add cache ~key:topic_name topic in
        topic |> Lwt.return
    | Some topic ->
        Stdlib.print_endline "FOUND A TOPIC" ;
        topic |> Lwt.return

When I open 2 tabs, the first tab correctly prints CREATING A TOPIC, but the second one sometimes prints FOUND A TOPIC and sometimes CREATING A TOPIC. I have no idea why that is (the websocket connection on the first tab is still open - I can still push things to it correctly).

I'm wondering if it's because of this line (the clean call in find). It's probably something else, and I haven't read up on ephemerons much, but I don't understand why you're cleaning there.

Any help would be appreciated, thanks!

yawaramin commented 4 years ago

Hi Thomas, the ephemeral cache works by retaining the key-value pair as long as the key is in scope somewhere else, but drops it at some GC run after the key goes out of scope. So based on the observed behaviour it seems like the topic_name value is sometimes going out of scope?

Btw, I think we can rule out that cleanup line in find, because it's not called for find_opt. It was more an optimization to keep the hash table size trimmed for topics. But come to think of it, I should probably do the cleanup in find_opt as well.
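
Roughly, that change could look like this (just a sketch, not the actual ReWeb source; Table stands in for the underlying ephemeron-based hash table, and the Lwt wrapping mirrors the find_opt signature you're calling above):

let find_opt table ~key =
  (* look up the value as before *)
  let value = Table.find_opt table key in
  (* also drop bindings whose keys have been collected, as [find] does *)
  Table.clean table;
  Lwt.return value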

tcoopman commented 4 years ago

Does the clean-up remove everything, or just what's no longer in scope? It's not clear to me why it's needed exactly.

Ok, I'll see if I can find out when the string goes out of scope; it's part of the websocket handler, so I assumed it would stay in scope. I also don't know enough about OCaml internals/GC to have good intuition here, so it will take some searching around.

Actually I was wondering if it would be possible to get some kind of signal when the websocket drops/is closed in re-web, because then I could do the cleaning manually.

yawaramin commented 4 years ago

It would force a cleanup of any key-value pairs whose keys are not in scope. This is done automatically when the ephemeron-based hash table is resized, but resizing is non-deterministic, so I wanted a slightly stronger guarantee that cache memory wouldn't grow too much. See https://caml.inria.fr/pub/docs/manual-ocaml/libref/Ephemeron.K1.MakeSeeded.html

The topic_name value should stay in scope as long as the WS handler passes it in to itself recursively, but not otherwise because it would go out of scope as soon as one invocation of the handler function ended and its stack got cleared out.
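
For example (sketch only, assuming pull keeps its current timeout-in-seconds argument):

let rec handler topic_name pull push =
  (* [topic_name] is an argument of every recursive call, so the key object
     stays reachable and the ephemeral cache keeps its entry... *)
  let%lwt msg = pull 30. in
  match msg with
  | Some _ -> handler topic_name pull push
  (* ...but once this invocation returns, nothing references [topic_name]
     any more and the entry can be dropped at a later GC *)
  | None -> Lwt.return_unit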

Regarding getting an indication of when the socket gets closed, that's currently not possible...

tcoopman commented 4 years ago

> The topic_name value should stay in scope as long as the WS handler passes it in to itself recursively, but not otherwise because it would go out of scope as soon as one invocation of the handler function ended and its stack got cleared out.

I've been able to make a smaller reproduction of the problem. The issue presents itself when the topic_name is part of the path, for example here: `| (GET, ["ws", name]) => socket(cache, name)`

I've created a gist with the relevant files here: https://gist.github.com/tcoopman/aa6727cef26fa8b734a150ae62c6fe74. If you use the index.js file and open a few tabs, you'll see that you sometimes get CREATING A TOPIC as output. If you change name to "a_fixed_name" here (https://gist.github.com/tcoopman/aa6727cef26fa8b734a150ae62c6fe74#file-app-re-L9), this doesn't happen.

I've also thought about passing the name to the handler as an extra argument, but that doesn't fix it.

> Regarding getting an indication of when the socket gets closed, that's currently not possible...

Is that because of the architectural design, or just something that's not yet implemented?

yawaramin commented 4 years ago

Ah, I think I understand what you're trying to do: it's like a chat server where people can connect to chat rooms by a topic name, and a room gets created automatically if it doesn't exist and deleted automatically as soon as no one is connected to it any more.

This is difficult with the current design of the ephemeral cache because it works by using the physical (heap pointer) location of the cache key to decide whether the key-value pair should be kept or removed. So in your example, each time the socket service is invoked, it allocates a key by calling Context.Key(name), but doesn't actually store it anywhere. So each client that connects allocates a new key and looks up the topic in the ephemeral cache using that key.

Because the key is defined to use string value comparison in Context.Cache, the lookup sometimes succeeds, but sometimes it fails and creates a new topic: by that time the topic's key-value pair has been cleared out, since the previously allocated key went out of scope.
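
To make that concrete, here's a tiny standalone sketch using the stdlib's Ephemeron directly (not ReWeb's Cache wrapper): the binding survives only while the exact key object that was added is still reachable, even though lookups compare strings structurally.

module Tbl = Ephemeron.K1.Make (struct
  type t = string
  let equal = String.equal
  let hash = Hashtbl.hash
end)

let () =
  let tbl = Tbl.create 16 in
  (* build the key at runtime so it's a fresh, collectable heap value
     (a literal "room" would live forever and hide the effect) *)
  let key = String.concat "" ["ro"; "om"] in
  Tbl.add tbl key "the topic";
  (* a structurally equal lookup succeeds while [key] is still reachable *)
  assert (Tbl.find_opt tbl "room" <> None);
  ignore (Sys.opaque_identity key);
  (* after [key]'s last use, a GC run may drop the binding, so this can
     print either "found" or "gone" - the same flakiness seen in the issue *)
  Gc.full_major ();
  print_endline
    (match Tbl.find_opt tbl "room" with Some _ -> "found" | None -> "gone")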

One way to get around this limitation is to have a single topic for all the 'chatrooms' and define a more structured message (maybe using a JSON encoding) to communicate with clients. The client-server WS communication can look something like:

Client: GET /ws
Server: (responds with WS)
C: { "tag": "join", "payload": { "room": "room_name" } }
S: (filters the messages it sends to the client so that it only sends messages with "room": "room_name")
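
For example, on the server side you could decode the join envelope and keep the room around to filter what you forward. A rough sketch assuming a JSON library like yojson (none of this is ReWeb API):

(* parse a client frame like {"tag": "join", "payload": {"room": "lobby"}} *)
let parse_join frame =
  match Yojson.Safe.from_string frame with
  | `Assoc fields ->
    (match List.assoc_opt "tag" fields, List.assoc_opt "payload" fields with
     | Some (`String "join"), Some (`Assoc payload) ->
       (match List.assoc_opt "room" payload with
        | Some (`String room) -> Some room
        | _ -> None)
     | _ -> None)
  | _ -> None
  | exception Yojson.Json_error _ -> None

(* forward a topic message only if it is tagged with this client's room *)
let should_forward ~joined_room msg_json =
  match Yojson.Safe.from_string msg_json with
  | `Assoc fields ->
    List.assoc_opt "room" fields = Some (`String joined_room)
  | _ -> false
  | exception Yojson.Json_error _ -> false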

To answer your last question, there's no signal when the WS gets closed. This is by design currently: see https://github.com/yawaramin/re-web/blob/0103f3155993e68de614ffe183abdf4960786fc2/ReWeb/Server.ml#L84. What happens is that when the connection is closed, the handler is spun down without any warning.

I've thought about how to give the WS handler a chance to exit gracefully. One option might be to change the return type of https://yawaramin.github.io/re-web/re-web/ReWeb/Response/index.html#type-pull from string option Lwt.t to (string, [> `Timeout | `Connection_close]) result Lwt.t, to convey what exactly happened. Would this be preferable, do you think?
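
For a rough idea of how a handler could use that (just a sketch of the proposed type, which isn't implemented yet; the timeout argument and the push call are assumed to keep their current shapes):

let rec handler pull push =
  match%lwt pull 30. with
  | Ok msg ->
    push msg;                  (* a frame arrived: handle it, keep serving *)
    handler pull push
  | Error `Timeout ->
    handler pull push          (* nothing within the timeout: keep waiting *)
  | Error `Connection_close ->
    (* the client is gone: clean up topics/cache entries here, then stop *)
    Lwt.return_unit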

tcoopman commented 4 years ago

Ok, thank you for explaining. It's now clear to me why this sometimes works and sometimes fails.

> One way to get around this limitation is to have a single topic for all the 'chatrooms' and define a more structured message (maybe using a JSON encoding) to communicate with clients.

Would the goal then be that the room_name is stored in the handler and that, when Topic.pull produces messages, only the messages for that room_name are pushed?

> Would this be preferable, do you think?

That would certainly be nice; being able to do some cleanup and detect when a client has disconnected would be useful.

yawaramin commented 4 years ago

> Would the goal then be that the room_name is stored in the handler and that, when Topic.pull produces messages, only the messages for that room_name are pushed?

Right; in fact, thinking about it, you can probably have a route like GET /ws/room_name, pass the room_name to the service, and then on to the WS handler, like:

let socket = (~room_name, _) => {
  // each client gets its own subscription to the single shared topic
  let%lwt subscription = Topic.subscribe(topic);
  let rec handler = (~room_name, pull, push) => {
    // read from [subscription] and push only the messages tagged with
    // { "room": room_name }
  };

  handler(~room_name) |> Response.of_websocket |> Lwt.return;
};

> That would certainly be nice; being able to do some cleanup and detect when a client has disconnected would be useful.

Agreed, in retrospect just having string option Lwt.t is a little too anemic. I'll make the change (but it will be a breaking change).