ajayyy / DeArrow

Crowdsourcing better titles and thumbnails on YouTube
https://dearrow.ajay.app
GNU General Public License v3.0
1.24k stars, 33 forks

Idea: Download YouTube subtitles and ask chatGPT to come up with a title name #110

Open UltraHDR opened 11 months ago

UltraHDR commented 11 months ago

Hi, thanks for this addon.

It would be cool to automate the following:

  1. Download a video's subtitles from a service like https://downsub.com/
  2. Paste that text into ChatGPT and ask it to generate a title.
  3. Upload that title to a server so others can benefit from it.

Thanks
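
For readers who want to try the three steps above, here is a rough Python sketch of step 1 plus the text clean-up. Using `yt-dlp` instead of downsub.com is an assumption, and `build_subtitle_command` and `vtt_to_text` are hypothetical helper names, not part of DeArrow.

```python
import re


def build_subtitle_command(video_url: str, lang: str = "en") -> list[str]:
    """Return a yt-dlp command line that fetches subtitles only (no video)."""
    return [
        "yt-dlp",
        "--skip-download",    # do not download the video itself
        "--write-subs",       # uploader-provided subtitles, if any
        "--write-auto-subs",  # also accept auto-generated captions
        "--sub-langs", lang,
        "--sub-format", "vtt",
        "-o", "%(id)s.%(ext)s",
        video_url,
    ]


# Cue timing lines look like "00:00:01.000 --> 00:00:04.000 ..."
_TIMING = re.compile(r"^\d{2}:\d{2}(:\d{2})?\.\d{3} --> ")
_TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Strip WebVTT headers, cue timings and inline tags; drop repeated lines."""
    kept: list[str] = []
    previous = ""
    for raw in vtt.splitlines():
        line = _TAG.sub("", raw).strip()
        if not line or line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE")):
            continue
        if _TIMING.match(line):
            continue
        if line == previous:  # auto captions often repeat the previous line
            continue
        previous = line
        kept.append(line)
    return " ".join(kept)
```

The command list can be passed to `subprocess.run`; the resulting `.vtt` file is then fed through `vtt_to_text` before being sent to a model.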

leumasme commented 11 months ago

Using yt-dlp to download the subtitles and the gpt3.5 API, this is pretty trivial. In fact, I already have a project that is capable of doing this, which I host as a private Discord bot for my friends.
The problem is the cost: using the gpt3.5-16k API, a 30-minute video costs about 2 cents. This is effectively nothing for on-demand private usage, but when scaling it up to a public service, or to run on all videos in just your homepage suggestions, it would get very expensive very quickly, even when only running it on popular channels.
Combine that with the fact that this is a free community project, and I don't see this happening without a free way to invoke an LLM or many users donating (which I doubt will happen). Not to mention a relatively large delay of more than ~15 s and the possibly mediocre quality of the generated titles.
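
To put rough numbers on the scaling concern, here is a back-of-the-envelope sketch. The only figure taken from this thread is the roughly 2 cents per 30-minute video; the daily video count is a made-up placeholder, not a measurement.

```python
COST_PER_VIDEO_USD = 0.02  # from the comment above: about 2 cents per 30-minute video


def monthly_cost_usd(videos_per_day: int,
                     cost_per_video: float = COST_PER_VIDEO_USD,
                     days: int = 30) -> float:
    """Total API spend if every video is titled exactly once."""
    return videos_per_day * cost_per_video * days


# Hypothetical load: 10,000 previously unseen videos per day across all users.
print(f"${monthly_cost_usd(10_000):,.2f} per month")
```

Even with caching so that each video is only processed once, the bill grows linearly with the number of distinct videos users encounter.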

ajayyy commented 11 months ago

While an interesting concept, these LLMs do not currently seem to like being concise, which makes the titles pretty bad.

FireMasterK commented 11 months ago

I think the conciseness problem can be easily solved by fine-tuning the right prompt format.

Here's a prompt that I used:

``` The following transcript is from a YouTube video titled: GRANDMASTER GOTHAM!!!! You must suggest a concise, sanitized, and descriptive title for this video. ladies and gentlemen I make a lot of Chess content and I play a lot of chess games and one of the most common questions that I receive is why are you not a chess Grand Master the chess Grand Master title is the highest that you can achieve in the Chess World and I'm an international Master which is the second highest title that you can achieve in the Chess World I've made many many videos about this but in this video I'm going to give you a bit of an update uh on that Grand Master title also say hello to Benji in the background I'm going to also show you three games that I played today against Grand Masters in a chess.com tournament oh and the last game in particular is one of the most depressing chess games that you will ever see so please do stick around for all the games or jump ahead to the last one for tremendous heartbreak in any case let's jump into the games Billy Kimball I've played this dude before uh this is uh now I also understand his username by the way so that was the Dutch all right I'm playing my my sneaky Queen D4 Sicilian dude Bishop F4 Castle okay this is some approach I I beat like the quick A6 B5 stuff it's not bad it's very very risky yeah but against this like I think I can play a very quick E5 I think uh Knight H5 um maybe just something like Queen E3 is fine or G3 yeah maybe maybe bishop G3 so that if you take this Billy Kimbo was a crappy character I I enjoyed peaky blinders a lot I really did not enjoy the Russian season and I really did not enjoy I mean I kind of didn't enjoy season six the Italian season was and the first two seasons were great the first two seasons were great um so there's Queen G5 here but but I could play F4 and after take I can play Queen E3 okay that's of course a very natural move uh if Bishop so the idea is just to play B4 which I kind of I mean I think 
I have to respect I don't think I really have a choice okay he plays B4 anyway I'm a little bit I don't quite understand he might be getting a bit greedy here um I lose the pawn no matter what but maybe I just go for Pure development with something like HG yeah like I'm I'm I'm aware I'm giving him the pawn but maybe Knight D6 take take looks kind of nice so Pawn down but kind of cooking maybe on the dark squares F4 with tempo right here oh yeah no for sure yeah absolutely F4 with Tempo this is this is cool why did I enjoy Russian season it was just very very very boring it was just very slow and at the end I didn't even understand what the point was I watched the whole season and I was like okay the Italian season was nice I really like Luca chongrata I think he's the guy that plays him I forgot his name um no I actually quite like this position I think I think it's actually very difficult for black to play uh I don't know I don't know what he does not my responsibility though would a 1500 benefit from the book no not really you should not buy my book at that level if you're looking to improve after buying my book I mean it's it's a little bit more like a souvenir item I mean it's mostly like a merch purchase or something like it's just a cool thing to have and I'll sign it for you one day I think this is a good move just developing the night I don't see anything I could have done more forcefully and I don't think there was any reason to put my knight on this and then try to do that but maybe to H5 he's really thinking actually I think he really doesn't like his position he might play here and here I'm not really sure he could do anything else there it is um I can play Knight E5 so he's going to go for D5 right so let's go here I mean I think castling for him is way too dangerous unless I'm an idiot uh someone's calling our apartment I don't know for what oh that might have been a mistake it's okay oh hello all right we played the time Gambit against the Grand 
Master I'm not scared though I guess I should have put my Bishop on D3 originally to prevent castling all right hi Benji boy it's kind of a wild position was he done hi you're gonna take I go here I'm gonna win all his pieces momentarily his Rook is completely trapped nice foreign I'll take on D7 he can go here oh [ __ ] how did I blunder that oh my god what have I what oh I won anyway that was so bad how the [ __ ] did I lose my piece oh my God that was that vicious Rook C4 oh my God I mean I was just completely winning yeah I just that was awful okay so that game was good until it became a total disaster and then it was good again for a second the chess God smiled upon me this person actually very very good this is Maxim matlakov he's a top Russian Grandmaster and I've played him before without knowing what his username meant AKA Billy Kimba um but now I do now I know what it means uh and it's after I watch peaky blinders so in this game against him I played uh something that I've been playing recently Benji Benji stop licking your paw there's a lot of paw licking it gets all red anyway I played this variation uh I made a course on this recently and I I got a position very early where I played very aggressively and he won my Central Pawn but the cost of winning my Central Pawn was the following position where I really thought I had a great thing going on here and generally you know I'm an attacking player I understand Dynamic positions very very well uh and in this game I even called it I I said I think he has to do this I I don't think he has anything better and I wasn't overthinking and I was up 40 seconds on the clock and around here my uh the the the the phone rang uh and I had to go get it and I you know I lost a little bit of clock time but Bishop C4 was a lazy move I should have anticipated that he was going to Castle Benji Benji I've already asked you to stop stop thank you you guys can't see it you think I'm talk but it's very loud for those of you that 
have a pet you know that the pet sits there and goes and it's you know it's very distracting and also it's bad for his paws stop do not lick yourself pause or otherwise um Bishop to C4 um and you know the position that I got to around here I I was very I was actually quite confident because I had baited him into into castling and you'll notice that I'm down to 25 seconds that's because right around here I had to go get the door uh but I still was full of confidence like I thought I'm gonna play G5 and I'm gonna get a great position and right here I I thought I'm in cruise control I'm very confident I've got a great position and when I play confidently against these players good things happen to me okay um he went 95 and suddenly I was like wait his Rook is trapped and and he can't play D5 because I win his Queen this was a disaster of a move after something like Knight takes G4 I was even thinking to go here but I can just take because at the end of it I have another Fork so so I'm I'm just completely winning now I get a good position against the Grand Master Sky starts falling for me my heart starts pounding dude my hand this heavy on the mouse and I start realizing he's gonna give away the bishop and then right here I'm like I I shouldn't take the pawn because he's gonna get a big attack which is deranged thinking like that's just a free Pawn so I'm like okay I'm gonna bring my Bishop back and be safe I'm gonna trade the Queens not the best move you know now he gets this and so here I'm like I want to take but I can't do that because then he goes here oh but I can't do that because then I go here like just literally just just defend the pawn like you're just up a night but I got eight seconds and I panic and I immediately lose my advantage I immediately start throwing the game and by the way even here it's equal even here like I can move my Bishop somewhere that's not there like here and and and and you know I can go after this pawn and like the game is still 
going on instead I completely throw it I mean I play A3 which is just an insane move I lose that pawn I go here and then by a miracle of God my opponent Flags uh like I I could have very easily lost this game so in this game we got the Gotham special we got me playing an aggressive opening against the Grand Master getting to a position that I like and I and I think I have a feel for getting a fantastic situation on the board even a bit of a Time Advantage proceed to completely lose the time Advantage still get into a completely winning position panic and lose the position uh advantage and then I still manage to win somehow so I managed to win this game against like a former number 20 ranked player in the world the next two games that you are going to watch uh are are heartbreaking um in different ways uh let's start with the first one I'm playing Bach I've played Bach before no what what happened did you disconnect I see completion Page Six Bishop H6 is the professional variation um now E5 rookie eight uh I think against this I can just play A4 but King B1 was also fine this is all okay he's blitzing all his moves which is annoying but should be fine Knight F3 now let's go Knight F3 yeah we had this line but it was slightly different he didn't play Bishop H6 in that game um maybe I'll just play it aggressively like I I don't know maybe I'll go here he never played E5 which is like I've never seen this before which I guess happens when you play Grand Masters they play things you haven't seen before ig5 is interesting take Queen takes can't I also just go for the yeah I think this is fine very cautiously I think he can go here and then this then I have this take he has to take take it doesn't look like that works he does it but I don't what don't I just have Queen D4 what am I missing or Bishop D4 no but Queen D4 threatens mate so take okay so his idea is Bishop G7 isn't he just a piece down I'll just go here I guess Queen C6 wow I'm missing a lot of little things ah 
oh my God I have to get rid of that Knight am I just gonna be down two pawns at the end of all this like is that what's happening Jesus that whole combo must have worked tough chest is hard I just have nothing it's completely lost um so this game does this game doesn't actually need much of an analysis uh I played again this same exact Sicilian but in a slightly different move order uh and my opponent Benjamin Bach very good player by the way I mean he's 3000 rated like that's no joke same with matlakov by the way so the the reason I was struggling against Bach is he never played E5 like everything I know about this system involves black playing E5 he just never played it so we got to a position where you know like I I'm not an idiot I can make all of these moves he goes Bishop G4 and then essentially here I had this moment uh where I thought should I be aggressive and that's the best move according to the computer for example Bishop C4 Rook a C8 is block played and then I would play H3 Rook F1 so do you know why I didn't play Bishop C4 I will tell you the thought that went through my head many of you have similar thoughts like this when you play people who are stronger than you I swear to you I swear to you I did not play Bishop C4 because I in my mind I went he's gonna go here and in some positions my king is going to be vulnerable when he sacrifices Like Mike what huh what the hell am I talking about like what am I even thinking about now the funny thing is after Bishop C4 Rook C8 Rook C4 is the top computer move and then black gets some attack and white is defending um so I played passively because I'm playing a good player and then here I literally anticipate this move I saw all of this was coming and I miscalculate and and and you know uh he goes here and I go here and and I see all of this and he does this and and and of course I miss King takes I can I can apparently play King takes sack my queen and win and you know win a bunch of material but I didn't see 
that uh you know I I calculated this far and he calculated better than me because he's a better player and I'm just completely lost um Benjamin Bach plates and I resigned here because uh it's not a matter of losing this it's a matter of I'm just down three pawns after so I'm gonna lose now Benjamin Bach in this game played with an accuracy of 98.9 when I first looked 99.8 he actually played 99.8 he got two brilliant moves which was this and this and his ELO and chess.com is estimated at 3650. yeah you know what I can't even be mad about that all right I got slaughtered and this game is a fantastic example of why I am not a GM uh he played an opening that he knew slightly more new ones than I did maybe uh and then uh and then I didn't play you know the the best move because I was trying to be passive because I was playing a good player uh and then I just miscalculated we went down a five move sequence he saw six moves and after this entire sequence of moves he is just winning can't do anything about that but if you made it this far in the video uh the last game is depressing I'm going to show you the last game now please watch it until the end I don't even yeah just watch four like Caro he plays Cairo another guy who played Knight C3 by the way the last guy the last guy also played um played uh something like that uh he plays F3 so now we basically have a fantasy variation uh I can just play Knight F6 I can also play Queen A5 Magnus recently played this queen A5 move I'm gonna play Queen A5 Magnus played uh Queen F5 check I mean A6 is not correct in magnus's line but if Magnus played Queen A5 against shimanov he would win so have I beat a grand master before no never no it's been a dream of mine to be the Grand Master but I I've never defeated uh I've never defeated a grand master they're they're just too strong this might be this might be my first one though so um yeah I mean I'm hoping to beat my first one hey please Bishop B3 uh I have six maybe I can take take 
Knight F6 kinda don't mind that uh this is the idea and then also this is the idea it says seven seven yeah yeah it's yeah yeah seven seven yes that's the score between me and this guy yes would be cool to add A6 to the cat you just want more free stuff that's all you want you want free stuff and you want 50 off constantly that's what you guys want you want me to add things to my courses lifetime like forever and you also want to pay a massive discounted price that's because that's just what you guys are you know I be Hikaru once um no I didn't what are you talking about and a hug you guys are like where's your next Shale speaking of which sales are going away as as intense as they meaning the degree of which that we that we run them will be lessening um I've talked about this recently maybe I don't remember but we're not just going to run violent amounts of sales promotions all the time we are no I did not beat Hikaru once I don't know why you just yeah I mean I I appreciate you guys trying to make things up to make me feel nice but that never happened Free Pawn Free Pawn and I think he's just pretending like it wasn't free like I think he's just he's just playing a Gambit that's what he's doing he's just like yup I lo I lost the pawn yep that's that was me um he's just he's just like oh God yeah part of my opening uh actually his position is not is not so bad I don't think I mean it's gonna take me a while to get to safety uh do I go for G6 what what do I go for here oh it's a stressful position D6 looks stressful by Castle long yes this is very dangerous and very unnecessary yeah I can come back like he's gonna play a3b4 and stuff will he hang a bishop that would be kind of cool I don't think he's gonna hang a bishop but that's like one of the only ways I can beat good players is if they just hang all their pieces I really want to go E6 to free up my position but I but I oh this is interesting this is a really interesting move I really want to play this I'm 
gonna play it I'm gonna Gambit my Pawn to open up my G file he takes because you know he was down upon and now he's not that makes sense now let's play F6 I think maybe he'll go back and uh and we're just gonna I guess we're just gonna go for it that's what we're gonna do in this game going forward involves making a move though so I should probably I should probably get started with that uh night takes F3 Knight E5 you realize I was being sarcastic about what Grandmasters I wasn't being sarcastic you guys always accuse me of the weirdest things sarcasm I don't remember ever beating a Grandmaster apparently I didn't I don't know couldn't be me if this can I just take it's just free right like one of the ways of playing against like players is just to take free stuff and maybe that's what I'll do in G4 oh then there's Rook G3 Queen H4 this looks active I'm gonna go here it's a very professional move [Music] pork E6 I'm very close to blundering like the entire game away but I'm gonna do my best not to do that six Rook G8 take take I'm gonna go here oh here he's gonna take I'm gonna take he's gonna play Queen E5 and he's going to be a pawn up but ah what okay clearly he liked this or else he wouldn't have gone for it I don't really like this for him but let's anchor on the dark squares F7 just something like H4 what 55 I can take what I would do to have my diagonal open for my Bishop oh my goodness his Knight is hanging I hung my Rook oh I got too nervous and I hung my Rook oh my God oh my God oh my God that's this game broke me um I I ended the stream immediately and I and I just took Benji for a walk for like an hour uh because alternatively I probably would have set fire to my own house um or my computers at least I I I I mean there's not much to summarize in this game I mean it was a good fight it was a very interesting and imbalanced game and I've played shimanov before I've even beaten him before and you know it I I freestyled the opening he sacrificed the pawn I 
was playing you know Loosely and freely and I played this interesting move G5 which was a very interesting idea to create some sort of you know imbalance in the position and and I was worse like I I definitely felt like I had a worse position but I was fighting and I was keeping even on time we both had 20 seconds you know the game kind of uh devolved into total chaos here when you know I'm just trying to hang on and then we get into a crazy scramble right like right here and suddenly I'm like I'm winning uh I'm winning this game I give a check I shuffle the pieces and he initiates an exchange here which loses him a knight and I am not exaggerating when I couldn't feel my body like the overwhelming pressure of knowing that I was going to win this game I couldn't move the anxiety that I felt is unlike it's almost as bad as anxiety of when I'm about to get a a blood draw I am deathly terrified of needles uh I will faint like 20 of the time the depressing reality of just scrambling through this end game and barely being able to make my moves and then you know why I did all of this the psychological fear that if I don't have all my pieces defended I'm going to lose them that's why I did that because it felt safe it's so crazy how nervous I am this is an unfathomably unusable position you can't lose this position unless you get so nervous you just pick up the closest piece and move it to a square and now I'm just lost um there is a way to defend this position probably like if I play actively with my king because my Bishop defends everything um but I immediately let in his King and now this is just and and I just resigned and I um so I almost did this exact same thing in the very first game uh in the very first game if my opponent hadn't lost on time he would have won the game because I would have panicked and I would have lost uh and and this is just like a an insane sobering reality every time I play this tournament that I am just not good enough and I don't know if I 
ever will be uh and and when I choose to come back to con and this justifies why I retired and I'm not playing in tournaments the pain that I feel from these games and and it's just it's just it's too much um so that's the GM update for now and I thank you for making it this far in the video um wish I could have better news but I for now get out of here Suggested title: ```

Output: "Chess Grandmaster Tournament Recap: Wins, Losses, and Heartbreak"

The results were quite good for this specific video, considering I found thinking of a title for this video quite difficult. :sweat_smile:
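
The prompt above follows a simple template: original title, instruction, transcript, then a trailing `Suggested title:` cue. Here is a minimal sketch of assembling it programmatically. `build_title_prompt` and the `max_chars` truncation are hypothetical additions, not something used in the test above.

```python
def build_title_prompt(original_title: str, transcript: str,
                       max_chars: int = 40_000) -> str:
    """Assemble a title-suggestion prompt in the same layout as the example above."""
    transcript = " ".join(transcript.split())  # collapse newlines and repeated spaces
    if len(transcript) > max_chars:
        # Crude cut at a word boundary; an assumption, not part of the original prompt.
        transcript = transcript[:max_chars].rsplit(" ", 1)[0]
    return (
        f"The following transcript is from a YouTube video titled: {original_title}\n"
        "You must suggest a concise, sanitized, and descriptive title for this video.\n\n"
        f"{transcript}\n\n"
        "Suggested title:"
    )
```

Ending the prompt with `Suggested title:` nudges the model to answer with the title alone rather than a longer explanation.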

I also think the API charges can be entirely avoided if you spoof requests to ChatGPT's endpoint from the extension if the user is logged in to OpenAI. (Although, I'm not sure what context length the free version has) Other LLMs, such as Bard, Claude, and Huggingface chat could be considered too.

This could be helpful when suggesting a title to submit I suppose.

leumasme commented 11 months ago

> I also think the API charges can be entirely avoided if you spoof requests to ChatGPT's endpoint from the extension if the user is logged in to OpenAI.

The ChatGPT UI is limited to a much smaller per-message size than the API - just try pasting your prompt into there. I expect that this will work for none but the shortest of videos. It also requires more extension permissions, adds rather significant complexity (I'd expect), and enters the fight with OpenAI, who will most likely not like extensions doing this.

neoOpus commented 9 months ago

I was just about to suggest this

neoOpus commented 9 months ago

> > I also think the API charges can be entirely avoided if you spoof requests to ChatGPT's endpoint from the extension if the user is logged in to OpenAI.
>
> The ChatGPT UI is limited to a much smaller per-message size than the API - just try pasting your prompt into there. I expect that this will work for none but the shortest of videos. It also requires more extension permissions, adds rather significant complexity (I'd expect), and enters the fight with OpenAI, who will most likely not like extensions doing this.

There are alternative LLMs with more tokens. It is also possible to distribute the task by dividing it into small chunks and, of course, to target just the popular videos to begin with. There is a great margin for optimization here, and using the right APIs and hacks can get things done properly.

leumasme commented 9 months ago

> There are alternative LLMs with more tokens

I have tested Claude, which prominently advertises its 100k token limit.
It fails to answer rather basic questions about an uploaded ~35k token script when the answer to those questions is near the middle of the script, strongly displaying the "Lost in the Middle" effect that is already seen in gpt3.5.
Of course you can split up the script to create independent summaries and then generate a title based on those, but quality would likely suffer further from this. I would say it remains to be tested, but even if this worked well, it would not really be feasible considering the monetary or resource cost attached to invoking an LLM, especially at this scale and even when only processing the most popular videos.
Unless an organization appears that would like to support this project by providing GPU servers or free access to an LLM API, I doubt looking further into this makes any sense. I strongly doubt that a corporation like OpenAI would stand by idly if you were to abuse the free version of ChatGPT through extension users at this scale.

And even with all this, alternate video titles are only half of what the extension provides.
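
For anyone who wants to test the split-and-summarize idea mentioned above, a rough sketch follows. The `summarize` and `make_title` callables stand in for whatever LLM call is used (hypothetical; no real API is assumed), and the word-based chunk size is only a stand-in for a proper token count.

```python
from typing import Callable


def split_words(text: str, chunk_words: int = 3000, overlap: int = 200) -> list[str]:
    """Split text into overlapping word windows of at most chunk_words words."""
    if not 0 <= overlap < chunk_words:
        raise ValueError("overlap must be non-negative and smaller than chunk_words")
    words = text.split()
    step = chunk_words - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def title_from_chunks(transcript: str,
                      summarize: Callable[[str], str],
                      make_title: Callable[[str], str],
                      chunk_words: int = 3000,
                      overlap: int = 200) -> str:
    """Summarize each chunk independently, then ask for one title from the summaries."""
    summaries = [summarize(c) for c in split_words(transcript, chunk_words, overlap)]
    return make_title("\n".join(summaries))
```

Note that this makes one model call per chunk plus one for the title, so it does nothing to reduce the cost concern; it only works around the context length.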

MasterKia commented 9 months ago

> Combine that with the fact that this is a free community project, I don't see this happening without a free way to invoke a LLM

See: https://github.com/xtekky/gpt4free/issues/40 https://github.com/xtekky/gpt4free/issues/802

https://github.com/xtekky/gpt4free

neoOpus commented 9 months ago

This prompt could maybe make this possible

Auto Split Prompt

The splitter prompt is the text that will be used when the user prompt is divided into chunks due to the character limit.

Act like a document/text loader until you load and remember the content of the next text/s or document/s.
There might be multiple files, each file is marked by name in the format ### DOCUMENT NAME.
I will send them to you in chunks. The start of each chunk will be noted as [START CHUNK x/TOTAL], and the end of this chunk will be noted as [END CHUNK x/TOTAL], where x is the number of the current chunk, and TOTAL is the number of all chunks I will send you.
I will split the message in chunks, and send them to you one by one. For each message follow the instructions at the end of the message.
Let's begin:

Auto Split Chunk Prompt

The chunk prompt is the text that will be added to the end of each chunk. It can be used to summarize the previous chunk or do other things.

Reply with OK: [CHUNK x/TOTAL]
Don't reply with anything else!

Borrowed from Superpower ChatGPT
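
Wrapping a transcript in these markers is mechanical. Here is a sketch, assuming simple character-based chunks. `wrap_chunks` is a hypothetical helper, and the marker strings are copied from the prompt above.

```python
CHUNK_PROMPT = "Reply with OK: [CHUNK {x}/{total}]\nDon't reply with anything else!"


def wrap_chunks(text: str, max_chars: int = 8000) -> list[str]:
    """Cut text into pieces of at most max_chars and add START/END markers."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    pieces = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    total = len(pieces)
    messages: list[str] = []
    for x, piece in enumerate(pieces, start=1):
        messages.append(
            f"[START CHUNK {x}/{total}]\n{piece}\n[END CHUNK {x}/{total}]\n"
            + CHUNK_PROMPT.format(x=x, total=total)
        )
    return messages
```

The splitter prompt would be sent first, followed by each returned message in order, and finally the actual title request.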

TomLucidor commented 7 months ago

Can subtitle download and LLM processing be done locally? For the document length issue, are there RAG-like alternatives?