josStorer / chatGPTBox

Integrating ChatGPT into your browser deeply, everything you need is here
MIT License
9.94k stars 749 forks

On YouTube we use low-quality subtitles, not the high-quality transcript #800

Open wassname opened 6 days ago

wassname commented 6 days ago

Describe the bug

On YouTube, this extension uses the subtitles, not the transcript. The subtitles are terrible and lead to the LLM giving poor output.

To Reproduce

  1. Go to https://www.youtube.com/watch?v=IYaNscnE7rc&t=556s
  2. Run ChatGPTBox
  3. Look at the summary
  4. Open the summary in a separate window
  5. Look at the inputs to the summary
  6. Go back to the video, open the transcript, and compare

It seems that this extension is using the subtitles, not the transcript. The subtitles often come from a much poorer transcription model, and uncommon words are missed entirely.

For example, for this video

Expected behavior

This is part of the transcript available in the UI

it is now a matter of public record that under pompeo's explicit Direction the CIA Drew up plans to kidnap and to assassinate me within the Ecuadorian Embassy in London and authorized going after my European colleagues subjecting us to theft hacking attacks and the planting of false information my wife and my infant son were also targeted a CIA asset was permanently assigned to track my wife and instructions were given to obtain DNA from my six month-old son's nappy

And this is the subtitle information received in ChatGPTBox

it is now a matter of public,kidnap and to assassinate me within the,hacking attacks and the planting of,assigned to track my wife and,nappy

As you can see, it's a poor source of information.

Please complete the following information:

wassname commented 6 days ago

related: https://github.com/josStorer/chatGPTBox/issues/679

Mohamed3nan commented 5 days ago

I believe there is a bug related to retrieving the model's context window length and some of the calculations based on it. For example, I used the web version of ChatGPT-4o, which probably has an 8k context window, but I received a maxLength of only 900.

image

I’m trying to understand how it works, but it’s quite complex. https://github.com/josStorer/chatGPTBox/blob/0ee357d03e779a467a9ff5bd472f99aae04cc309/src/utils/crop-text.mjs#L31-L101

@josStorer Perhaps we should simplify it by creating a new key, such as length https://github.com/josStorer/chatGPTBox/blob/0ee357d03e779a467a9ff5bd472f99aae04cc309/src/config/index.mjs#L166-L268

wassname commented 5 days ago

Side note: I think there are two separate bugs? There's the one where it uses the subtitles rather than the transcript (I have a small example of it above; the length AND contents are different), and the one you are investigating, where it wrongly clips the transcript.

Mohamed3nan commented 5 days ago

Side note: I think there are two separate bugs? There's the one where it uses the subtitles rather than the transcript (I have a small example of it above; the length AND contents are different), and the one you are investigating, where it wrongly clips the transcript.

The function cropText is cropping the transcript based on the model's context length. From what I understand, if the transcript is too long, it takes a portion from the start and a portion from the end, then performs some calculations to find a balance in between. This is likely why the transcript feels short.
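To illustrate why this kind of selection reads as choppy, here is a minimal sketch. It is not the extension's cropText: the function name sampleSegments, the comma delimiter (taken from the debugger output in this thread), and the use of a segment count instead of a token budget are all assumptions made for illustration. It always keeps the first and last segment and spreads the remaining picks evenly in between, which drops whole phrases from the middle of sentences.

```javascript
// Hypothetical illustration only; this is NOT the real cropText.
// Assumes comma-separated segments and a budget counted in segments,
// whereas the real function works from the model's context length.
function sampleSegments(text, maxSegments) {
  const segments = text.split(',')
  if (segments.length <= maxSegments) return text
  if (maxSegments < 2) return segments[0]

  const kept = []
  for (let i = 0; i < maxSegments; i++) {
    // Spread the picks evenly between the first and the last segment.
    const index = Math.round((i * (segments.length - 1)) / (maxSegments - 1))
    kept.push(segments[index])
  }
  return kept.join(',')
}

console.log(sampleSegments('a,b,c,d,e,f,g', 3))
```

Each kept segment is intact, but its neighbours are gone, so the result no longer forms sentences.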

Mohamed3nan commented 5 days ago

By the way, if you want a temporary fix for only YouTube without considering the consequences, you can simply return the prompt here without the cropText function

https://github.com/josStorer/chatGPTBox/blob/0ee357d03e779a467a9ff5bd472f99aae04cc309/src/content-script/site-adapters/youtube/index.mjs#L56-L59

wassname commented 5 days ago

Hmm let me look in the debugger... yes I see what you mean

An extract from subtitleContent in the debugger, before cropText:

it is now a matter of public,record that under pompeo's explicit,Direction the CIA Drew up plans to,kidnap and to assassinate me within the,Ecuadorian Embassy in London and,authorized going after my European,colleagues subjecting us to theft,hacking attacks and the planting of,false information,my wife and my infant son were also,targeted a CIA asset was permanently,assigned to track my wife and,instructions were given to obtain DNA,from my six-month-old son's,nappy

and croppedText (after applying cropText):

it is now a matter of public,kidnap and to assassinate me within the,hacking attacks and the planting of,assigned to track my wife and,nappy

If anyone would like to reproduce this, here are the full arguments to cropText

wassname commented 5 days ago

I guess the cropping is a wider issue. The ideal way to crop can't be to skip random parts of sentences; that leads to incoherent text. A better approach is to chunk the text (https://js.langchain.com/v0.1/docs/modules/data_connection/document_transformers/), perhaps keeping the beginning and end, and to tell the model that the text is incomplete so it doesn't misrepresent this to the user.
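A minimal sketch of that idea, under stated assumptions: the name cropKeepingEnds, the OMISSION_NOTE wording, the headRatio parameter, and a budget measured in characters rather than tokens are all hypothetical and not part of the extension. It keeps a contiguous run of whole segments from the start and from the end, and puts an explicit marker where the middle was dropped.

```javascript
// Hypothetical helper; names and the character-based budget are assumptions.
const OMISSION_NOTE = '[... transcript truncated: middle section omitted ...]'

function cropKeepingEnds(segments, maxChars, headRatio = 0.5) {
  const full = segments.join(' ')
  if (full.length <= maxChars) return { text: full, truncated: false }

  // Reserve room for the marker and the two spaces around it.
  // If maxChars is smaller than the marker itself, only the marker is returned.
  const budget = Math.max(0, maxChars - OMISSION_NOTE.length - 2)
  const headBudget = Math.floor(budget * headRatio)
  const tailBudget = budget - headBudget

  // Take whole segments from the start until the head budget is used up.
  const head = []
  let used = 0
  let i = 0
  while (i < segments.length && used + segments[i].length + 1 <= headBudget) {
    used += segments[i].length + 1
    head.push(segments[i])
    i++
  }

  // Take whole segments from the end, never overlapping the head.
  const tail = []
  used = 0
  let j = segments.length - 1
  while (j >= i && used + segments[j].length + 1 <= tailBudget) {
    used += segments[j].length + 1
    tail.unshift(segments[j])
    j--
  }

  const text = [head.join(' '), OMISSION_NOTE, tail.join(' ')]
    .filter(Boolean)
    .join(' ')
  return { text, truncated: true }
}
```

When truncated is true, the prompt could also say in plain words that the transcript is incomplete, so the model does not present a summary of the kept parts as a summary of the whole video.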