ZachSaucier / Just-Read

A customizable read mode web extension.
https://justread.link/

Add ChatGPT summarization on click #424

Closed jensolafkoch closed 11 months ago

jensolafkoch commented 1 year ago

Is your feature request related to a problem? Please describe.

I have to read an entire article when I just want a summary. There are AI or ChatGPT browser extensions which can help, but I want it right there in the place where I try to minimize reading time and paper usage when printing.

Describe the solution you'd like

A setting to add a private API key for ChatGPT. Some guiding settings for how long the summary should be and in which style and language. A button to start it. You already have all the stripped-down bare text for the ChatGPT context window at hand, and the user can strip it down further. Perfect!

Describe alternatives you've considered

Full-blown ChatGPT extensions. Direct use of ChatGPT Plus plugin mode with LinkReader and co.

Additional context

None.

ZachSaucier commented 1 year ago

Hey Jens, thanks for posting. This is an interesting idea. I've never integrated with OpenAI or any other ML language model before. Does this sound like a reasonable approach to adding this feature?

The potential issue with this approach is that it might expose users' API keys. Obviously they'd be exposed to me, and I'm not malicious and wouldn't keep track of them, but I'm not sure whether they'd also be exposed in a request to the Just Read server. Any ideas?

EDIT: This is incorrect as to the actual approach. See below for the actual approach.

jensolafkoch commented 1 year ago

I think that sounds like a reasonable approach. You have to choose the appropriate model in the API call, and I would suggest using a system prompt to describe the role the LLM should follow. You also have to think about text length and context window size. But these will grow in the future: Anthropic's LLM Claude gets a 100K-token context window, and context window sizes at OpenAI are growing a bit as well (mostly between 4K and 32K depending on the model).
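As a rough illustration of the text-length point, trimming the article to fit a context window could be sketched like this. The 4-characters-per-token ratio is only a rule of thumb, and the token numbers are illustrative assumptions, not Just Read's actual values:

```javascript
// Sketch: crudely trim article text so it fits a model's context window,
// leaving room for the model's answer. A real implementation would use a
// proper tokenizer; a character budget is just a cheap approximation.
function trimToContextWindow(articleText, maxTokens = 4096, reservedForAnswer = 512) {
  // Roughly 4 characters per token for English text (rule of thumb).
  const charBudget = (maxTokens - reservedForAnswer) * 4;
  return articleText.length <= charBudget
    ? articleText
    : articleText.slice(0, charBudget);
}
```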

Well, I also asked ChatGPT (anonymously, of course) about such an approach, as I have never programmed anything similar myself. Here's the answer:

*

Your approach to integrating an ML language model like OpenAI's into a browser extension is generally sound. However, you're correct to be concerned about the potential security risks, especially regarding the exposure of users' API keys. Here are a few points to consider:

Storing API Keys: Storing users' API keys on the client side (browser extension settings) is risky. If a user's machine is compromised, their API key could be stolen. Moreover, sending API keys over the network to your server also poses a risk, even if the connection is secure (HTTPS), as they could potentially be intercepted.

Server-side Requests: Making requests to OpenAI from your server is a good practice, as it allows you to control the rate of requests, handle errors centrally, and add an additional layer of security. However, using individual users' API keys for this purpose is not ideal. Instead, consider using a single API key (your own) to make requests to OpenAI. This way, users' API keys are not exposed.

User Customization: If you want to allow users to customize the prompt, you can still do so without requiring their API keys. The custom prompt can be sent from the client to your server, and your server can use this prompt when making the request to OpenAI.

Privacy and Trust: Even if you're not malicious and have no intention of misusing users' API keys, it's important to remember that users have to trust you with their keys. This can be a significant barrier to user adoption, especially for security-conscious users.

Rate Limiting: If you use a single API key, you'll need to be mindful of OpenAI's rate limits. If your app becomes popular, you might hit these limits quickly. You might need to implement some form of queueing or rate-limiting on your server to manage this.
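The queueing idea mentioned here could be sketched as follows. This is a generic illustration, not code from Just Read, and the request spacing is an arbitrary assumption rather than OpenAI's real limit:

```javascript
// Sketch of a minimal server-side queue that runs tasks one at a time and
// spaces them out, so a single shared API key stays under a rate limit.
class RequestQueue {
  constructor(minIntervalMs = 250) {
    this.minIntervalMs = minIntervalMs; // illustrative spacing between requests
    this.pending = [];
    this.draining = false;
  }

  // Enqueue an async task; resolves with the task's result once it has run.
  enqueue(task) {
    return new Promise((resolve, reject) => {
      this.pending.push({ task, resolve, reject });
      this.drain();
    });
  }

  // Run queued tasks sequentially, waiting minIntervalMs between them.
  async drain() {
    if (this.draining) return;
    this.draining = true;
    while (this.pending.length > 0) {
      const { task, resolve, reject } = this.pending.shift();
      try {
        resolve(await task());
      } catch (err) {
        reject(err);
      }
      await new Promise((r) => setTimeout(r, this.minIntervalMs));
    }
    this.draining = false;
  }
}
```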

Terms of Use: Be sure to review OpenAI's terms of use. They may have specific rules about how their API can be used, and whether API keys can be used in the way you're proposing.

In summary, while your approach is technically feasible, it's important to handle API keys with care to protect your users. Consider using a server-side approach with a single API key to mitigate these risks.

*

The problem with using your own API key server-side instead of user keys would of course be the costs... but people who have their own API account usually know that there is some risk involved. Since you can set a usage limit at OpenAI, the financial risk can be mitigated. And because one can generate different API keys for different apps, you can revoke individual keys without compromising your other usage.

I could also imagine, once such an offer is part of Just Read, letting the user choose the output language as well (a setting with options: use the source language, plus a list of other languages the LLM knows). And some parameters like style and length would be useful. (Maybe later, in a second step.)

There is a nice app called AudioPen which does a lot of similar things with recorded audio notes. Maybe worth checking out.

ZachSaucier commented 1 year ago

Thanks for the input.

I'm quite hesitant to use a single API key because:

  1. There's additional financial risk for me since it's my API key but other people are using it. If there were a bad actor who was able to get around the protections I put in place they could potentially rack up big charges. Frankly I don't earn very much from Just Read so having a risk of a big charge is probably not worth it for me.
  2. If I used a single API key I'd have to make it a premium feature and potentially charge more because of the additional cost to me. If I make users provide a key, the cost to me is significantly lower because it's just receiving a request, sending it out, and then passing the response back to the client.
  3. While it's less secure for users to store their API key client-side and send it to the Just Read server, the risk of exposure is quite low. I believe an attacker would either need access to their device, in which case they could probably get much more lucrative information, or access to the Just Read server, which is securely protected. Reading more about transport to the Just Read server, there seems to be a very low chance of the key being exposed in the request since it uses HTTPS. The main risk for users sharing their API key is trusting me, aka Just Read, with it, since I would have access to all of them. That's a trust I'm willing to ask people to extend. Plus, users should only use an API key that they generate just for Just Read, so it shouldn't put any of their other API keys at risk.

EDIT: This is incorrect as to the actual approach. See below for the actual approach.

As for adding translations, I think that's beyond the scope of Just Read because you can already translate a page before using Just Read and Just Read will use the translated content.

jensolafkoch commented 1 year ago

Absolutely agree.

Maybe I wasn't clear enough, but that's what I meant here: every user would, should, and could use their own API key, as opposed to the financial risk of you using your own API key.

... but people who have their own API account usually know that there is some risk involved. Since you can set a Usage Limit at OpenAI the financial risk can be mitigated. And because one can generate different API keys for different apps you can revoke individual keys without compromising your other usages.

ZachSaucier commented 12 months ago

It turns out I don't need to involve my server at all! So that's better for security and people don't have to trust me with their key.

I'll update this soon once I deploy the changes to the extension to add this functionality. In the meantime, does this doc make sense? I am also going to update the README a bit, but this is the main place for info about this feature.

The actual approach ended up being:
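The details of that approach aren't reproduced here, but under the assumption that the extension calls OpenAI's public Chat Completions endpoint directly from the browser with the user's own key (so the key never touches the Just Read server), a minimal sketch might look like this. The model name and prompt wiring are illustrative assumptions, not Just Read's actual code:

```javascript
// Sketch only: a direct browser-side call to OpenAI's Chat Completions API
// using the user's own key.
async function summarize(apiKey, articleText, systemPrompt) {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // The key goes straight to OpenAI and never touches any middleman server.
      "Authorization": `Bearer ${apiKey}`
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo", // illustrative model choice
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: articleText }
      ],
      temperature: 0
    })
  });
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Error handling and prompt customization are simplified; only the endpoint and payload shape follow OpenAI's public API.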

ZachSaucier commented 12 months ago

Actually, if you wouldn't mind testing it out by loading Just Read from a local version I would really appreciate it! Any feedback you have about the functionality or unclear parts of the documentation would be greatly appreciated.

jensolafkoch commented 12 months ago

Installing in Firefox gives a corrupt file warning:

File F:\Downloads\Just-Read-master.zip does not contain a valid manifest

Chrome gives an error like:

“Manifest version 2 is deprecated, and support will be removed in 2023. See https://developer.chrome.com/blog/mv2-transition/ for more details.”

jensolafkoch commented 12 months ago

Hi Zach, the docs sound good to me. And it works fine.

Are you setting a default value for max_tokens? The docs say nothing about a default here. (You could also add an example of 0.2 for temperature so that everyone knows how to format the numbers in between.)

Interestingly to get a German answer I had to explicitly ask for that in the prompt. ChatGPT usually picks up that information from the prompt language in most cases. Probably this behaviour is a consequence of temperature = 0.

I would also add a note to the prompt customization on how to get the summary in any language. But if you do so, you should have the model translate "Summary" into the target language as well. It looks odd otherwise.

Then I strongly believe there should be a Boolean option for whether one wants the summary to replace the current content or to be placed at the top (as of now). I often print things and I don't want to delete all the original stuff manually. Without that replacing option that would be stopping midway ... ?!

ZachSaucier commented 12 months ago

Thanks for all of the comments! Very helpful.

Manifest version 2 is deprecated

Back when Chrome announced the migration to manifest v3, I updated Just Read on a separate branch. However, they didn't actually support v3 at the time 🙃, so I left the branch stale for the time being. Now I'll update the current version to v3.

Installing in Firefox gives a corrupt file warning

Yeah, Firefox requires a slightly different version of the manifest file, so when I upload to Firefox I have to change a thing or two.

Are you setting a default value for max_tokens? Docs say nothing about a default here?

No, not currently. The maximum is just the model's maximum. I can note that in the doc.

add an example of 0.2 for temperature so that everyone knows how to format the numbers in between

Good idea.

Interestingly to get a German answer I had to explicitly ask for that in the prompt. ChatGPT usually picks up that information from the prompt language in most cases. Probably this behaviour is a consequence of temperature = 0.

Hmm. I will have to investigate this more deeply.

I would also add a note to the prompt customization on how to get the summary in any language. But if you do so, you should have the model translate "Summary" into the target language as well. It looks odd otherwise.

Yeah, that's a good idea.

Translating "Summary" is actually more difficult than it seems because it may require a separate request or adding to the prompt that someone provides. Neither option is great... Let me try to think of alternatives.

I strongly believe there should be a Boolean option for whether one wants the summary to replace the current content or to be placed at the top (as of now).

This is a good idea, I'll add it.

Without that replacing option that would be stopping midway ... ?!

What do you mean by this?

jensolafkoch commented 12 months ago

Re translating "Summary": Maybe you can use the system prompt to add an instruction that the completion should always be topped by the translation of the word "Summary" into whatever language the main completion is provided in, enclosed by some special characters/string which you can use to extract it.
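That extraction step could look something like the following. The «» markers and the function shape are illustrative assumptions, not anything Just Read actually uses:

```javascript
// Sketch: pull a marker-delimited translated heading off the top of the
// model's completion, falling back to plain "Summary" if no markers appear.
function extractHeading(completion) {
  const match = completion.match(/«(.*?)»/);
  if (!match) {
    return { heading: "Summary", body: completion.trim() };
  }
  return {
    heading: match[1].trim(),
    body: completion.replace(match[0], "").trim()
  };
}
```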

Please forget about "Without that replacing option that would be stopping midway ... ?!" – it was just an afterthought to the Boolean toggle I was talking about. :-)

ZachSaucier commented 12 months ago

I added the replace checkbox, updated the summarizer doc page, and upgraded to manifest v3. I will probably publicly release this later today. Thanks again for the help!

jensolafkoch commented 12 months ago

Looking forward to the release! :-)

jensolafkoch commented 11 months ago

Just re another idea: Is there any chance to add a toggle to strip down any kind of image like the Reader View in Firefox can do?

ZachSaucier commented 11 months ago

Is there any chance to add a toggle to strip down any kind of image like the Reader View in Firefox can do?

Do you mean remove all images like this? https://github.com/ZachSaucier/Just-Read/issues/222

What's the use case for having a toggle to remove the images?

jensolafkoch commented 11 months ago

Yes, like that.

I usually use tools like Just Read to minimize the amount of paper used for printing. I hate reading longer articles online, but I very often (though not always) don't need to print the images (which saves a lot of toner as well).

Adding CSS won’t do the trick as I need a toggle to switch the view.

ZachSaucier commented 11 months ago

I'm pretty hesitant to add a toggle just for this purpose. You can already:

  1. Create a theme with `img { display: none; }` (or just for print via `@media print { img { display: none; } }`) and switch to that theme.
  2. Use Just Read's deletion mode to delete elements from the page that you don't want.

With that being said, maybe there's a way for me to add all of the themes someone has to the style panel (what opens when you click the paint brush icon) instead of just the two default ones? That would solve this need for you, right? You could create a theme called "print-without-images" or something and just select that when you want the images to be removed.

ZachSaucier commented 11 months ago

Oh, it looks like it can already do that 😄

So yeah, I recommend just creating an extra theme that hides what you want when you want to print and going to that theme using the style panel when you need to remove the images (or whatever else you want to hide).

jensolafkoch commented 11 months ago

Yes, that would work, of course. I already made my own new default theme, but these changes are unrelated to images.

I still need to switch between two themes via the style panel, and that nevertheless requires 3 clicks instead of 1. If you just want to check whether that's the better option, it's 6 instead of 2. That's the UI beauty of toggles... :-) Doing that very often is quite tedious.

Why not have two different prioritized themes ("favorites", so to say) with whatever different styling, and an icon to switch between them? The icon could just show a big "1" or a big "2"...

ZachSaucier commented 11 months ago

To me, the tradeoff of the additional UI complexity that a toggle like this would add is not worth the benefit. Most users don't need to toggle between themes, to my knowledge. If they do, using the style panel to do so doesn't take very long.

You're welcome to make a separate feature request for this toggle to try and gauge interest from other users but I am not inclined to add that feature at this time.

ZachSaucier commented 11 months ago

I have started the publishing process for the build with this feature. I'm going to mark this request as complete. Let me know if you encounter any issues.

Thanks again for suggesting the feature!

jensolafkoch commented 11 months ago

No problem, I’ll see how often I have to switch, and will then decide re another feature request.
