django / channels

Developer-friendly asynchrony for Django
https://channels.readthedocs.io
BSD 3-Clause "New" or "Revised" License
best practice: communication from workers -> client #1079

Closed hyusetiawan closed 6 years ago

hyusetiawan commented 6 years ago

high level goal:

I want to run a long-running process: the browser client sends messages incrementally, the websocket consumer on the backend forwards each one to a worker consumer for processing, and once the worker finishes, the processed data is sent back to the browser client.

The way I have accomplished this is to pass self.channel_name along in the message from the websocket consumer, like so:

        async_to_sync(self.channel_layer.send)('worker', {
            'type': 'product_data',
            'channel': self.channel_name,
            'rows': [],
        })

and once the worker finishes:

        async_to_sync(self.channel_layer.send)(data['channel'], {
            'type': 'worker_data',
            'rows': [],  # already processed
        })

and in the websocket consumer, I'll have a worker_data() handler (named to match the message type) that contains:

        self.send(text_data=json.dumps({
            'message': data,
            'worker': True,
        }))

While this works, is it the best way to accomplish this? This has been asked before; I tried reading it but did not understand the answer: https://stackoverflow.com/questions/50199118/how-to-use-channelnamerouter-to-communicate-between-worker-and-websocket-django

Another way I can think of is to create a group, but I don't know how to make the processed data return to that specific group, and if I have to pass the group name to the worker, I might as well pass the channel name like I did above.

Thanks!
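
One detail worth checking in the snippets above: Channels routes a channel-layer message to the consumer method named after its 'type' (dots become underscores), so the reply sent with 'type': 'worker_data' needs a handler method called worker_data on the websocket consumer. The sketch below is a simplified, pure-Python stand-in for that dispatch rule, not the Channels source; `handler_name_for`, `dispatch`, and `FakeConsumer` are hypothetical names used only for illustration.

```python
def handler_name_for(message):
    """Return the method name a message would be routed to (simplified rule)."""
    if "type" not in message:
        raise ValueError("message has no 'type' key")
    name = message["type"].replace(".", "_")
    if name.startswith("_"):
        raise ValueError("message type must not start with an underscore")
    return name


class FakeConsumer:
    """Hypothetical stand-in for the websocket consumer; records what it would send."""

    def __init__(self):
        self.sent = []

    def worker_data(self, event):
        # Mirrors the handler body from the question, minus the real socket.
        self.sent.append({"message": event["rows"], "worker": True})


def dispatch(consumer, message):
    """Look up the handler by name and call it with the message."""
    handler = getattr(consumer, handler_name_for(message), None)
    if handler is None:
        raise ValueError("no handler for message type %r" % message["type"])
    handler(message)
```

With this rule, a message of type 'worker_data' (or 'worker.data') reaches worker_data(), while a consumer that only defines a differently named method would raise instead of replying to the browser.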

andrewgodwin commented 6 years ago

I'm afraid I can't give out much architectural/programming advice here - Stack Overflow is the better place for that - but I'll give a few quick comments.

What you're doing will work, and using a group would actually be worse as you can't tie those to workers (and you want to return your data to a specific connection, so using a direct channel makes more sense).

However, I would encourage you to think about what happens if connections drop or browsers reconnect - do you want the result to still come back? In that case, you need an identifier that is not tied to a single socket connection (like a user ID), and then using a group for that makes sense.
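
A minimal sketch of the user-ID approach, assuming an authenticated user is available on the connection. The helper name `user_group_name` and the "user-results" prefix are hypothetical; the validation reflects the Channels rule that group names contain only ASCII alphanumerics, hyphens, underscores, or periods and stay under 100 characters.

```python
import re

# Characters Channels accepts in a group name.
_GROUP_NAME_RE = re.compile(r"[a-zA-Z0-9_.-]+")


def user_group_name(user_id, prefix="user-results"):
    """Build a per-user group name that survives reconnects (hypothetical helper)."""
    name = "%s.%s" % (prefix, user_id)
    if len(name) >= 100 or not _GROUP_NAME_RE.fullmatch(name):
        raise ValueError("invalid group name: %r" % name)
    return name


# Intended use inside the consumers (not executed here):
#   connect():    channel_layer.group_add(user_group_name(user.pk), self.channel_name)
#   disconnect(): channel_layer.group_discard(user_group_name(user.pk), self.channel_name)
#   worker:       channel_layer.group_send(data['group'], {'type': 'worker_data', 'rows': rows})
```

The websocket consumer would then pass the group name to the worker instead of self.channel_name, so a browser that reconnects as the same user rejoins the group and still receives the result.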

hyusetiawan commented 6 years ago

Thank you for the quick comments. Should I post on Stack Overflow to get more in-depth feedback? If so, what extra information would you need in that post?

After rethinking this, I might end up having to introduce a full-on job scheduler like Celery or Dramatiq.

andrewgodwin commented 6 years ago

I don't read Stack Overflow, so I don't know what you might need - I don't generally go around trying to give everyone technical help as otherwise it would take all of my time, unfortunately.

There's nothing wrong with using a "proper" job scheduler, either. They have a lot more features you might want.

hyusetiawan commented 6 years ago

I have opted to use the group name so that it's more persistent. One last question, if I may: how can I detect that the group is empty? That way I can stop the worker from doing unnecessary work.

andrewgodwin commented 6 years ago

You can't detect if a group is empty or not. You need to use a separate datastore to track presence.
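
As a rough illustration of tracking presence separately, here is a sketch that records which channels have joined each group. `PresenceTracker` is a hypothetical class, and the in-memory dict is only a stand-in: workers and websocket consumers usually run in different processes, so a real setup would need a shared store such as a database or Redis.

```python
from collections import defaultdict


class PresenceTracker:
    """Hypothetical presence store; the dict stands in for a shared datastore."""

    def __init__(self):
        self._members = defaultdict(set)

    def join(self, group, channel_name):
        """Record that a connection joined the group (call next to group_add)."""
        self._members[group].add(channel_name)

    def leave(self, group, channel_name):
        """Record that a connection left the group (call next to group_discard)."""
        members = self._members.get(group)
        if members is None:
            return
        members.discard(channel_name)
        if not members:
            del self._members[group]

    def is_empty(self, group):
        """Return True when no known connection is in the group."""
        return not self._members.get(group)
```

The worker could check is_empty() before each chunk of work and stop early when nobody is listening. Since a disconnect handler may never run if a server process dies, entries in a real store would probably also need an expiry.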