magpie-ea / magpie-backend

Backend for _magpie
MIT License

Use a separate process to monitor client disconnection #51

Open x-ji opened 5 years ago

x-ji commented 5 years ago

The terminate/2 callback might not be foolproof. See https://elixirforum.com/t/presence-not-picking-up-user-leave-events/19045

JannisBush commented 5 years ago

This is quite important. I already had to recreate my iterated-interactive-test-experiment several times while debugging, because a chain breaks if the server thinks there is still an active participant when that participant has already left.

x-ji commented 5 years ago

When does it happen in your test cases? The current terminate/2 callback should work when the user closes their tab or reloads the page. It is only supposed to fail when the internet connection is lost. Though maybe it is related to something else in this case.

x-ji commented 5 years ago

Are you testing on a local server or a remote one? Currently all experiment statuses should be reset to 0 on server restart, in case the disconnection detection fails. But on a remote server, of course, restarting would be more difficult.

JannisBush commented 5 years ago

I tested it on https://babe-demo.herokuapp.com/. By all experiment statuses, do you mean all statuses set to 1 (in progress)?

I think it happened most often when I had two participants in a game, reloaded one tab (disconnecting it), pressed OK on the notification in the other tab, and was then reassigned to the same game in the first tab. The participant in the second tab then causes the issue, or something similar. I don't remember the details and haven't fully figured it out, but I could try to reproduce the problem.

I'm not really sure whether this is the problem or whether it would solve it, but maybe explicitly leaving the channel after a "presence_diff" would help?

babe.gameChannel.on("presence_diff", (payload) => {
    if (babe.gameFinished == false) {
        window.alert(
            "Sorry. Somebody just left this interactive experiment halfway through and thus it can't be finished! Please contact us to still be reimbursed for your time."
        );
    }
});

x-ji commented 5 years ago

Yes, all in-progress ones should be reset.

That suggestion makes sense. Could you try it? It should be something like babe.participantChannel.leave() and babe.gameChannel.leave().

JannisBush commented 5 years ago

Yes, this was the issue.

After a participant clicked OK on the notification, they could still submit their results with the end dialogue button. Adding babe.participantChannel.leave() and babe.gameChannel.leave() prevents this, and the user can't do anything anymore. A babe.jumpToView(view_id) function or something similar would be nice for redirecting the user to an end view, but this is a front-end matter.
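
For reference, a minimal sketch of the handler with the two leave() calls added. The names babe.gameChannel, babe.participantChannel and babe.gameFinished are the ones used in this thread; the wrapper function registerDropoutHandler and its notify parameter (a stand-in for window.alert) are hypothetical and only there to keep the example self-contained.

```javascript
// Sketch only: registerDropoutHandler and notify are hypothetical helpers,
// not part of the babe front end.
function registerDropoutHandler(babe, notify) {
    babe.gameChannel.on("presence_diff", (payload) => {
        if (babe.gameFinished) {
            return; // the game ended normally, nothing to clean up
        }
        notify(
            "Sorry. Somebody just left this interactive experiment halfway through and thus it can't be finished! Please contact us to still be reimbursed for your time."
        );
        // Leave both channels explicitly so this client can no longer
        // submit results for the broken game.
        babe.participantChannel.leave();
        babe.gameChannel.leave();
    });
}
```

In the browser this would be called as registerDropoutHandler(babe, window.alert.bind(window)).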

jmadeano commented 5 years ago

I have run into similar issues with my iterated experiment. After some time, it is impossible to continue the experiment because there is a never-ending queue (due to unregistered exits). Fully restarting the Heroku server works, but that's not ideal. If there isn't a better way to track exits, it would be helpful if all connections were reset with "Toggle activation status" so there is less downtime (versus a full restart).

jmadeano commented 5 years ago

I'm not sure if this is the only source of unregistered exits, but I have noticed that this happens each time I test the experiment on MTurk Sandbox. Starting the experiment and then either returning it before finishing or refreshing the whole page reliably causes this failure (it might have to do with the fact that the experiment is rendered within a frame in MTurk).

x-ji commented 5 years ago

Thanks a lot for the report and the instructions for reproduction. I guess the way iframes work needs some special attention. Maybe I should try to implement more reliable mechanisms as described in this issue.

The suggestion sounds good. There can also be a dedicated button for resetting experiment statuses in the UI. I'll add that first. The separate monitoring process could take a while to implement.
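
To illustrate what the reset is meant to do, here is a rough sketch in JavaScript, not the backend's actual code. It assumes each status record has a numeric status field where 1 means in progress (as mentioned above) and 0 is the value it is reset to; the record shape and the name resetInProgress are made up for the example.

```javascript
// Sketch only: the record shape and function name are hypothetical.
const STATUS_RESET = 0;       // value that statuses are reset to
const STATUS_IN_PROGRESS = 1; // "in progress", as described in this thread

// Return a copy of the list in which every in-progress entry is reset.
// Entries with any other status are returned unchanged.
function resetInProgress(statuses) {
    return statuses.map((entry) =>
        entry.status === STATUS_IN_PROGRESS
            ? { ...entry, status: STATUS_RESET }
            : entry
    );
}
```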

x-ji commented 5 years ago

@jmadeano-Shell Now the toggle button should reset all experiment statuses. https://github.com/babe-project/BABE/issues/59 Please give it a try.

jmadeano commented 5 years ago

I pushed the change to Heroku and the new toggle seems to be working great. For now, I plan to simply run the experiment on MTurk in small batches and monitor/reset the connections in between batches, so this is a great help. Thanks for the quick response!

x-ji commented 5 years ago

@jmadeano-Shell I pushed a new version using a monitoring process. Not sure if this will solve the problem. Note that according to https://stackoverflow.com/questions/33934029/how-to-detect-if-a-user-left-a-phoenix-channel-due-to-a-network-disconnect it might take up to 90 seconds for the disconnect to be detected. If still nothing happens after that much time in the MTurk case, then I'd need to think of another solution.
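
To show why a lost connection is only noticed after a delay, here is a simplified timeout-based watchdog in JavaScript. This is an illustration of the general idea, not the monitoring process that was pushed; the class DisconnectWatchdog and its methods are hypothetical, and the timeout is whatever the server is configured to wait.

```javascript
// Sketch only: a hypothetical watchdog, not the backend's implementation.
class DisconnectWatchdog {
    constructor(timeoutMs) {
        this.timeoutMs = timeoutMs;
        this.lastSeen = new Map(); // participant id -> last timestamp in ms
    }

    // Record that a participant was heard from (e.g. a heartbeat arrived).
    touch(participantId, now = Date.now()) {
        this.lastSeen.set(participantId, now);
    }

    // Return, and stop tracking, every participant that has been silent
    // for longer than the timeout.
    sweep(now = Date.now()) {
        const gone = [];
        for (const [id, seenAt] of this.lastSeen) {
            if (now - seenAt > this.timeoutMs) {
                gone.push(id);
                this.lastSeen.delete(id);
            }
        }
        return gone;
    }
}
```

A participant whose network drops silently simply stops calling touch(), so they are only reported once a sweep() runs after the timeout has elapsed.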