Open samliew opened 3 years ago
I am thinking about having a single main instance with a message queue that scrapes all available elections (as per #60), stores the results in-memory (we could leverage Redis, come to think of it), and then distributes incoming messages to worker processes.
I think this will allow us to avoid spinning up multiple Heroku instances (and thus eating up dyno hours) unless strictly necessary (i.e. if for some reason SE decides to have more than 3-4 elections at once [I assume 4 as a standard number we can spawn at any given time, but I am not sure what Heroku offers in terms of available cores]).
The main process can then check where the message comes from, get the corresponding election data, and put the response into the message queue. In the meantime, the worker processes would simply be responsible for responding to rooms while keeping the load balanced.
What do you think about this setup?
This can also address #58 if we determine that a single account for multiple elections is not viable, because each worker could be logged into a different account and respond accordingly.
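To make the proposal a bit more concrete, here is a minimal sketch of the main process, assuming an in-memory store keyed by room id and round-robin assignment of replies to workers. All names (`MainProcess`, `RoomMessage`, `QueuedReply`, `ElectionData`) are hypothetical and not taken from the bot's codebase; a real setup would use actual worker processes and possibly Redis instead of plain objects.

```typescript
// Hypothetical shapes for illustration only.
interface ElectionData { siteUrl: string; phase: string; }
interface RoomMessage { roomId: number; text: string; }
interface QueuedReply { roomId: number; text: string; workerId: number; }

class MainProcess {
    // In-memory election store keyed by chat room id (could be swapped for Redis).
    private readonly elections: Record<number, ElectionData> = {};
    private queue: QueuedReply[] = [];
    private nextWorker = 0;
    private readonly workerCount: number;

    constructor(workerCount: number) {
        this.workerCount = workerCount;
    }

    registerElection(roomId: number, data: ElectionData): void {
        this.elections[roomId] = data;
    }

    // Look up the election for the originating room and enqueue a reply,
    // assigning it to workers in round-robin order to balance the load.
    handle(message: RoomMessage): boolean {
        const election = this.elections[message.roomId];
        if (!election) return false; // no election is tracked for this room

        const workerId = this.nextWorker;
        this.nextWorker = (this.nextWorker + 1) % this.workerCount;
        this.queue.push({
            roomId: message.roomId,
            text: `${election.siteUrl} is in the ${election.phase} phase`,
            workerId,
        });
        return true;
    }

    // A worker pulls only the replies assigned to it.
    drainFor(workerId: number): QueuedReply[] {
        const mine = this.queue.filter((reply) => reply.workerId === workerId);
        this.queue = this.queue.filter((reply) => reply.workerId !== workerId);
        return mine;
    }
}
```

If each worker is logged into a different account, `workerId` would effectively select the account that posts the reply.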
> What do you think about this setup?
Sounds viable, just have to confirm whether the limits for chat messages for the single account are suitable for this purpose before you start building it.
Taken from: https://meta.stackexchange.com/questions/164899/the-complete-rate-limiting-guide/164900#chat
> whether the limits for chat messages for the single account are suitable for this purpose
Yeah, I considered the limit - it's quite restrictive, and there are some things I want to clarify about the throttling before going ahead with the implementation. For example, unless I am missing something, it is unclear whether the limit is: (1) network-wide; (2) per chat server; (3) per room.
The worst-case scenario is (1), which would make the single-account solution much less attractive (that said, another Meta post confirms that you can post 2 messages in quick succession before the 1 second throttle kicks in, so given the default throttle of 2 seconds the bot currently has between messages, it should be enough to cover several moderately active election rooms without a significant delay). On the other hand, scenarios (2) and (3), although unlikely, would be a blessing for the solution.
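As a rough way to reason about the delay under scenario (1), here is a small sketch that computes when each reply would actually go out if one account serves every room with a fixed minimum interval between messages. `scheduleSends` is a hypothetical helper, and the model deliberately ignores the "2 messages in quick succession" allowance, so it is a pessimistic estimate.

```typescript
// Given the times (in ms) at which replies become ready and the minimum
// interval between posts, return the time at which each reply is sent.
function scheduleSends(arrivalsMs: number[], minIntervalMs: number): number[] {
    const sorted = [...arrivalsMs].sort((a, b) => a - b);
    const sendTimes: number[] = [];
    let earliestNext = -Infinity; // nothing has been sent yet

    for (const arrival of sorted) {
        // A reply goes out when it is ready AND the throttle has elapsed.
        const sendAt = Math.max(arrival, earliestNext);
        sendTimes.push(sendAt);
        earliestNext = sendAt + minIntervalMs;
    }
    return sendTimes;
}

// Three rooms trigger the bot at the same moment with a 2 second throttle:
const sends = scheduleSends([0, 0, 0], 2000); // [0, 2000, 4000]
```

So with three simultaneous triggers the last room waits about 4 seconds, which seems tolerable for moderately active rooms but grows linearly with the number of rooms.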
Likely it's per-server. That said, if the bot is running for three concurrent elections on Chat.SE, that might already be pushing it. Not to forget elections run for about 3 weeks each so there may be overlaps (not sure if we had more than 3 at the same time previously?)
> Likely it's per-server
That's encouraging (although we need to confirm that), as it would give us the ability to separate out SO elections and the rest. But yeah, that's not that much of an improvement. I am going to make an analysis of the previous elections a bit later on to see how many we realistically have to support at the same time - if it is 3 or less, we should be all good with a single instance.
But if not - well, nothing stops us from taking the best of both worlds and spawning extra instances as needed :) Granted, it may require persistent storage as referred to by #62, but I am unsure if we even have to share election scraping data between instances - it seems like there's no downside to just rescraping all the elections for every spawned instance (or is there?)
One more point to consider when a single chat account is in multiple rooms on Chat.SE: the bot needs to know which room sent the triggering message and respond in that room only.
> One more point to consider when a single chat account is in multiple rooms on Chat.SE: the bot needs to know which room sent the triggering message and respond in that room only.
Yeah, I think the new version of ChatExchange should be able to provide the necessary distinction. I am not sure why there is an issue #73 with the current version of the package, but the unpublished one uses the room_id field sent by the socket to set the roomId on the message that's passed to subscribers. It is likely we will be able to just reapply 0b95950 after upgrading.
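For illustration, a minimal sketch of the room distinction described above, assuming the socket event carries a `room_id` field that gets mapped to `roomId` on the message. The event shape beyond `room_id` (`user_id`, `content`) and the helpers `toChatMessage` and `replyTo` are hypothetical and are not ChatExchange's actual API.

```typescript
// Hypothetical raw event shape; only `room_id` is taken from the discussion.
interface RawChatEvent { room_id: number; user_id: number; content: string; }
interface ChatMessage { roomId: number; userId: number; content: string; }
interface Reply { roomId: number; text: string; }

// Map the socket's snake_case fields onto the message passed to subscribers.
function toChatMessage(event: RawChatEvent): ChatMessage {
    return { roomId: event.room_id, userId: event.user_id, content: event.content };
}

// Build a reply addressed to the room the trigger came from,
// or null if the bot is not serving that room.
function replyTo(event: RawChatEvent, watchedRoomIds: number[]): Reply | null {
    const message = toChatMessage(event);
    if (watchedRoomIds.indexOf(message.roomId) === -1) return null;
    return {
        roomId: message.roomId, // respond in the originating room only
        text: `Replying to user ${message.userId}: received "${message.content}"`,
    };
}
```

The point is simply that the reply carries the same `roomId` as the trigger, so a single account sitting in several rooms never answers in the wrong one.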
Following an implementation of #60 (Create a separate program/process to scrape election status on all network sites), we could have another "main" process that spins up more instances/processes for each election when an election is detected, and terminates them automatically N days after it ends or is cancelled.
This main process then could also "own" the development chatroom test instance if started by a dev.
We then may need to create a dev-only UI/API to manually start/stop instances, as well as override variables for each election instance.
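The start/stop decision of such a "main" process could be sketched roughly as below, assuming the scraper from #60 yields a list of elections with an optional end date. `ElectionStatus`, `InstancePlan`, `planInstances` and the `graceDays` parameter (the "N days" above) are hypothetical names; actually spawning and killing processes is left out.

```typescript
// Hypothetical scraper output: endedAt is null while the election is running
// and set once it has ended or been cancelled.
interface ElectionStatus { siteUrl: string; endedAt: Date | null; }
interface InstancePlan { start: string[]; stop: string[]; }

const DAY_MS = 24 * 60 * 60 * 1000;

// Decide which per-election instances to start and which to terminate.
function planInstances(
    elections: ElectionStatus[],
    running: string[],
    now: Date,
    graceDays: number
): InstancePlan {
    // An instance is wanted while the election runs and for graceDays afterwards.
    const wanted = elections
        .filter((e) => e.endedAt === null || now.getTime() - e.endedAt.getTime() < graceDays * DAY_MS)
        .map((e) => e.siteUrl);

    return {
        start: wanted.filter((url) => running.indexOf(url) === -1),
        stop: running.filter((url) => wanted.indexOf(url) === -1),
    };
}
```

A dev-only UI/API could then feed manual overrides into the same plan, for example by forcing a site into or out of the `wanted` list.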