polywock / ask-screenshot

Issue Tracker for GPT Screenshot extension.

Add text extraction? #5

Open oaustegard opened 2 weeks ago

oaustegard commented 2 weeks ago

While this is really neat and a quick shortcut, it would be even better if you could add a text extractor, since Claude's OCR doesn't seem to capture all the text in a screenshot. Today I use a bookmarklet for this (it converts the page to Markdown using turndown.js), but an extension like yours would definitely be more convenient.

polywock commented 2 weeks ago

Hello. Seems like a useful feature. How about a mode that includes document.body.innerHTML as an attached file? Is conversion to markdown necessary?

oaustegard commented 2 weeks ago

The challenge with the full HTML is that it can get very lengthy and include a lot of CSS and JavaScript that is just noise to an LLM. innerText could work, perhaps.
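
To illustrate the noise problem, here is a minimal sketch of stripping script and style blocks from an HTML string before handing it to an LLM. The function names `stripNoise` and `roughText` are hypothetical, and the regexes are a crude stand-in for a real HTML parser, so treat this as an illustration rather than what either the bookmarklet or the extension actually does.

```javascript
// Hypothetical helpers, not part of the extension or the bookmarklet.
// Regex-based stripping is approximate: it does not handle every edge case
// a real HTML parser would (e.g. "</script>" inside a string literal).

// Remove <script>, <style>, <noscript> blocks and HTML comments.
function stripNoise(html) {
  return html
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    .replace(/<!--[\s\S]*?-->/g, "");
}

// Reduce the remaining markup to whitespace-normalized plain text.
function roughText(html) {
  return stripNoise(html)
    .replace(/<[^>]+>/g, " ") // replace each remaining tag with a space
    .replace(/\s+/g, " ")     // collapse runs of whitespace
    .trim();
}

const sample =
  "<style>p{color:red}</style><p>Hello <b>world</b></p><script>alert(1)</script>";
console.log(roughText(sample));
```

Inside a page, reading document.body.innerText is likely the simpler route, since it works from the rendered text; the string-based version above is mainly useful when only the raw HTML is available.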


polywock commented 2 weeks ago

I was worried that including Turndown would balloon the extension's size too much, but it's very small.

In this update, I've included a new mode called "Page data". Assuming you're on Chrome/Edge, you can try it out as follows:

  1. Extract packed.zip into a folder.
  2. Go to chrome://extensions
  3. Enable Developer mode.
  4. Click "Load unpacked" and load the extracted folder.

packed.zip

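A few issues

  • Sometimes the page data is too large and doesn't fit Claude's context limit. How do you get around this?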

oaustegard commented 2 weeks ago

My bookmarklet is far less advanced than your extension: it merely opens the text in a new window for me to copy and paste, which I can then do selectively.

I also do some guesswork to extract only the body/main content of the page; it doesn't always work, though. See remEls in https://github.com/oaustegard/bookmarklets/blob/main/markdown_body.js for the logic. It could certainly be improved.

(Most of these bookmarklets were generated with ChatGPT or Claude)

On Wed, Jun 26, 2024 at 2:14 AM polywock wrote:

A few issues

  • Sometimes the page data is too large and doesn't fit Claude's context limit. How do you get around this?
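
Besides copying and pasting selectively, one possible way around the context-limit question quoted above is to split the page text into pieces that each stay under a size budget. The sketch below is only an assumption about how that could look: `chunkText` is a hypothetical helper, and it counts characters as a rough proxy for tokens, since the real limit is measured in tokens and depends on the model.

```javascript
// Hypothetical helper, not part of the extension.
// Splits text into chunks of at most maxChars characters, preferring
// paragraph boundaries (blank lines) and hard-splitting only when a
// single paragraph is larger than the budget.
function chunkText(text, maxChars) {
  if (!Number.isInteger(maxChars) || maxChars <= 0) {
    throw new RangeError("maxChars must be a positive integer");
  }
  const chunks = [];
  let current = "";
  for (const para of text.split(/\n{2,}/)) {
    let rest = para;
    // A paragraph longer than the budget is cut into fixed-size pieces.
    while (rest.length > maxChars) {
      if (current) {
        chunks.push(current);
        current = "";
      }
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    if (rest === "") continue;
    // Try to append the paragraph to the chunk being built.
    const candidate = current ? current + "\n\n" + rest : rest;
    if (candidate.length > maxChars) {
      chunks.push(current);
      current = rest;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Example: send only the first chunk, or let the user pick which to attach.
const parts = chunkText("First paragraph.\n\nSecond paragraph.", 20);
console.log(parts.length);
```

Keeping whole paragraphs together tends to preserve more meaning per chunk than cutting at a fixed offset, at the cost of slightly uneven chunk sizes.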