lindylearn / unclutter

A modern reader mode and article library for your browser.
https://unclutter.lindylearn.io
GNU Affero General Public License v3.0

(python) API #332

Open ddddavidee opened 2 years ago

ddddavidee commented 2 years ago

Hi, would it be possible to use this extension programmatically? I really like how pages are rendered, but I have a fairly long list of URLs that I would like to save in rendered form, and I would like to fetch them with a script (ideally in Python). Is that possible?

phgn0 commented 2 years ago

@ddddavidee sorry, I only just saw this issue.

To clarify, you want to download the HTML (or maybe PDF) of the "uncluttered" article pages? Why not open the links in your browser (maybe it should work offline)? Where do you currently save the list of URLs?

ddddavidee commented 2 years ago

Sometimes when reading a blog, I'd like to save that page and all the links it references. I can open and save them one by one, but sometimes there are dozens of them, and scripting the whole thing would be nicer.
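
As a rough sketch of the "page plus all referenced links" step, the links of an already-downloaded page could be collected with the Python standard library alone. The names `LinkCollector` and `collect_links` are made up for illustration and are not part of Unclutter; the sketch only gathers URLs and does not render or unclutter anything.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        # Keep only web links, in first-seen order, without duplicates.
        if absolute.startswith(("http://", "https://")) and absolute not in self.links:
            self.links.append(absolute)


def collect_links(html, base_url):
    """Return the unique http(s) links found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

The resulting list could then be fed to whatever script saves the pages, so a blog post and the articles it links to end up in the same archive.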

phgn0 commented 2 years ago

I'm curious, why do you want to save articles as HTML or PDF files instead of saving just the links?

Right now Unclutter doesn't work outside the browser. But you can control Chrome programmatically and unclutter pages that way, even if it takes some effort to set up. I use this method to capture screenshots of a few hundred articles to test the extension. It is written in TypeScript though, and only creates a screenshot of the first article page: https://github.com/lindylearn/unclutter/blob/main/serverless-screenshots/cloudrun/src/index.ts
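
For a Python version of the same idea, a minimal sketch could look like the following. It assumes the Playwright package for Python and a locally unpacked build of the extension; it is not a port of the linked TypeScript file. The function names, the directory arguments, and the `activate` hook are hypothetical: how to trigger Unclutter on a loaded page is not covered here, so that step is left to a callback the caller has to supply.

```python
import hashlib
from pathlib import Path


def chrome_extension_args(extension_dir):
    """Chromium flags that load one unpacked extension and disable all others."""
    path = str(extension_dir)
    return [f"--disable-extensions-except={path}", f"--load-extension={path}"]


def output_path(url, out_dir):
    """One stable file name per URL, so re-running overwrites instead of duplicating."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return Path(out_dir) / f"{digest}.html"


def save_pages(urls, extension_dir, out_dir, profile_dir, activate=None):
    """Open each URL in Chromium with the extension loaded and save the page HTML."""
    # Imported here so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        # Extensions need a persistent context; headed mode is the cautious default.
        context = p.chromium.launch_persistent_context(
            str(profile_dir),
            headless=False,
            args=chrome_extension_args(extension_dir),
        )
        try:
            page = context.new_page()
            for url in urls:
                page.goto(url, wait_until="load")
                if activate is not None:
                    # Hypothetical hook: the caller triggers Unclutter on the page here.
                    activate(page)
                output_path(url, out_dir).write_text(page.content(), encoding="utf-8")
        finally:
            context.close()
```

Without an `activate` callback this only saves the original page HTML, so the interesting part is still the step that switches the extension on for each page.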

ddddavidee commented 2 years ago

I would like to store some pages (in case they are ever removed from the internet). I also usually unclutter the articles I study, so saving the HTML on disk lets me build archives on a particular topic for later review.

Moreover, with the HTML I can create a booklet and print it to PDF or on paper.

phgn0 commented 1 year ago

A related issue for saving individual PDFs natively: https://github.com/lindylearn/unclutter/issues/704

Still not a lot of progress on exposing the uncluttering algorithm as a package :(