illinois-cs241 / broadway

A distributed systems framework used running distributable workloads.
Other
18 stars 0 forks source link

Fix stats reset bug when websocket worker nodes rejoin. #18

Closed ezhang887 closed 4 years ago

ezhang887 commented 4 years ago

Addresses #17.

Checks to see if self.worker_node exists before creating a new one.

ayushr2 commented 4 years ago

Thanks for looking into this Eric! So the scenario we want to handle is: when a worker node disconnects (and its corresponding WorkerConnectionHandler object is deleted because the connection broke). But it exists in the DB still. Then when it reconnects again, we want to sustain the worker node object from the DB and just flip the worker_node.is_alive boolean to true.

In your approach you check if self.worker_node is None or not. Which will not work because we do not sustain this object across connections. You will need to see if the worker had connected before or not by querying the DB. We already attempt to do that in https://github.com/illinois-cs241/broadway/blob/master/broadway/api/handlers/worker_ws.py#L54-L77. But our implementation is buggy.

ezhang887 commented 4 years ago

Should I merge this now then?

ayushr2 commented 4 years ago

Yes go ahead!

ayushr2 commented 4 years ago

A cool feature about Github is that you could write "Resolves #issue-number" in your description and it will automatically close the issue when the PR is merged. So just replacing "Addresses" with "Resolves" in your description would have done the job for this PR.