Open maddi4u opened 2 years ago
Nah. That way you'd have to screen-record a virtual browser, which is resource-heavy and lossy. (Guess why I chose this name :) ) But this is a good feature, and I had already thought about it. My plan is to listen to the chat messages and include them as subtitles. The only thing I don't know yet is how best to display them.
Can we look at how Stripchat or Instagram Live does it? I mean, how do they add comments to the stream as shown above? Perhaps the browser's page inspector (Inspect Element) would be of help.
https://github.com/lay295/TwitchDownloader
That's the proof of concept for Twitch (the GUI is Windows-only). Capturing, rendering and putting everything together is a complex task, and in general it's more of a post-processing task. TwitchAutomator also does this in post-processing: https://github.com/MrBrax/TwitchAutomator
For that it uses several tools composed together. E.g. for VODs they use https://github.com/PetterKraabol/Twitch-Chat-Downloader to download the chat. Keep in mind that we need to do it live, because most sites have no replay.
To capture messages it helps to check the developer docs. A few APIs might provide access to the chat (since sites have bots which need this too). Whether this is documented is a separate question. If there is no documentation, it's mostly a matter of reverse engineering the obfuscated JavaScript, which is often a horror. It can also be necessary to have code which is able to parse JavaScript.
If a browser is needed, check the sadly dead PhantomJS (https://github.com/ariya/phantomjs) and Splash (https://github.com/scrapinghub/splash), which are both made for such tasks: scraping content which relies on JavaScript. They are essentially standardized, JavaScript-enabled browsers with a small footprint and the core features; Splash also has an HTTP API to drive it.
Some research and, if needed, bringing everything together in a stack would make this possible. Keep in mind that all of this brings heavy complexity, which in my opinion would make better modularisation necessary.
Best Regards
The most common case is that you can join a websocket endpoint and receive the messages in real time. The approaches differ from site to site, but most are similar.
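To illustrate the idea, here is a rough sketch of such a listener. It is not code from this project: the site names, the JSON payload shapes and the `listen` helper are all made up for the example, and the frames come from any async iterator so that the sketch runs without a websocket library. In practice the iterator would be the connection object of whatever websocket client is used.

```python
import asyncio
import json
import time
from typing import AsyncIterator, Optional


# Hypothetical payload shapes. Every real site has its own format, which has
# to be found with the network inspector first.
def parse_site_a(frame: str) -> Optional[dict]:
    data = json.loads(frame)
    if data.get("type") != "chat":
        return None  # ignore pings, joins and other non-chat frames
    return {"user": data["from"], "text": data["body"]}


def parse_site_b(frame: str) -> Optional[dict]:
    data = json.loads(frame)
    msg = data.get("message")
    if msg is None:
        return None
    return {"user": msg["username"], "text": msg["content"]}


# One parser per site; the listener itself stays site-independent.
PARSERS = {"site_a": parse_site_a, "site_b": parse_site_b}


async def listen(frames: AsyncIterator[str], site: str, started: float) -> list:
    """Collect chat messages, stamping each with its offset into the recording."""
    parse = PARSERS[site]
    messages = []
    async for frame in frames:
        parsed = parse(frame)
        if parsed is not None:
            parsed["offset"] = time.monotonic() - started
            messages.append(parsed)
    return messages


async def fake_socket():
    """Stand-in for a real websocket connection (assumption for the demo)."""
    yield json.dumps({"type": "ping"})
    yield json.dumps({"type": "chat", "from": "alice", "body": "hi"})


if __name__ == "__main__":
    print(asyncio.run(listen(fake_socket(), "site_a", time.monotonic())))
```

The offset is taken relative to the start of the recording because that is what a subtitle track needs later on.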
To me it seems easier to implement than you think, @DerBunteBall; it just requires time, which I will rarely have in the next few weeks. Also, I don't like this JS magic. I usually use the network inspector in the browser and try to figure out the requests the sites send. And what's wrong with the current modularisation? I know there are some quirks, but I think it's good enough for the job.
My plan is the following: I have to implement a function for every site that listens to the messages and converts them to a common format. The next step is to create subtitles, and the last is to mux them into the video. I am thinking about making a renderer which converts the chat messages to images. After that I can make bitmap subtitles that can be embedded in the output in a tidy way, like on Twitch. Then you can enable them in the video player if you want, and they won't be burned into the picture. One open question is whether the muxing will require post-processing or can be done simultaneously with the video.
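A minimal sketch of the last two steps, under simplifying assumptions: it writes plain-text SRT cues instead of the bitmap subtitles described above, and it muxes in post-processing rather than live. `ChatMessage`, `to_srt` and `mux_command` are names invented for this example; the ffmpeg options used (`-i`, `-map`, `-c copy`) are standard, and a Matroska (`.mkv`) output is assumed because it can hold an SRT track without re-encoding.

```python
from dataclasses import dataclass


@dataclass
class ChatMessage:
    """Common format every site-specific listener would convert to."""
    offset: float  # seconds since the recording started
    user: str
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(messages, display_seconds: float = 4.0) -> str:
    """Turn chat messages into SRT cues, each shown for display_seconds."""
    cues = []
    ordered = sorted(messages, key=lambda m: m.offset)
    for index, msg in enumerate(ordered, start=1):
        start = srt_timestamp(msg.offset)
        end = srt_timestamp(msg.offset + display_seconds)
        cues.append(f"{index}\n{start} --> {end}\n{msg.user}: {msg.text}\n")
    return "\n".join(cues)


def mux_command(video: str, subtitles: str, output: str) -> list:
    """Build an ffmpeg call that copies all streams and adds the subtitle file.

    The output is expected to be a .mkv file (assumption of this sketch).
    """
    return ["ffmpeg", "-i", video, "-i", subtitles,
            "-map", "0", "-map", "1", "-c", "copy", output]


if __name__ == "__main__":
    chat = [ChatMessage(2.0, "bob", "hello"), ChatMessage(0.5, "alice", "hi")]
    print(to_srt(chat))
    print(" ".join(mux_command("stream.mkv", "chat.srt", "stream_chat.mkv")))
```

Because the track is a separate stream, the player can switch it on and off. Overlapping messages would simply overlap as cues here; a chat-window look like on Twitch would still need the image renderer.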
> To me it seems easier to implement than you think, @DerBunteBall; it just requires time, which I will rarely have in the next few weeks. Also, I don't like this JS magic. I usually use the network inspector in the browser and try to figure out the requests the sites send. And what's wrong with the current modularisation? I know there are some quirks, but I think it's good enough for the job.
As I said, the main task is reverse engineering the website code. When the site uses websockets, it will be easy to capture the data. But when they use other methods, it can come down to parsing JavaScript. E.g. youtube-dl and yt-dlp also need this from time to time to extract a video. I think the post-processing is a bit more complex. Handling colors, emoticons and so on in the subtitles, for example, is something which doesn't happen by itself, even when everything is delivered over the websocket. For the code design/modularization (just a few ideas):
Just a few thoughts.
Best Regards
Save the stream together with the chat, so that we can also experience and enjoy what's happening in the room.
Maybe like this (I have blurred the image):
https://ibb.co/b2S1jDk
Upper part without blur Lower part without blur
The original video looks like this, with the chat on the left.