gildas-lormeau / SingleFile

Web Extension for saving a faithful copy of a complete web page in a single HTML file
GNU Affero General Public License v3.0
14.3k stars, 945 forks

Save a part of the page identified by tag and/or id #1391

Open frkd-dev opened 4 months ago

frkd-dev commented 4 months ago

Is your feature request related to a problem? Please describe.

I usually archive articles from news and research websites to use later in my research or just for reference. On such websites I often don't need the whole page with its headers, footers, menus, etc., only the content of the article. When I visit a new article or post and find it worth saving, I select the very same part of the page that I have selected many times before. I find myself repeating identical steps on identical pages again and again. It is a waste of time, especially when the page is long enough that scrolling to the end is necessary.

A good example from today: Intel recently handed over a whole product line to another company. The new company doesn't publish support articles for my product as Intel did, and I believe Intel is going to remove its own articles soon too. Now I'm opening every article on Intel's website, selecting the same part of the page with the valuable content, and saving the selection. I repeat this for dozens of pages.

Describe the solution you'd like

It would be great to have an option for saving only the part of the page identified by an HTML tag and/or element ID. For instance, the CSS notation "tag#id" could be used. If the option is set, the extension saves only the specified part of the page; otherwise it falls back to saving the entire page.
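The requested behaviour could be sketched roughly as follows. This is only an illustration of the idea, not SingleFile code: `selectCaptureRoot` and its `selector` argument are hypothetical names, and the function assumes it receives an object with the standard `querySelector` method and `documentElement` property of a DOM `Document`.

```javascript
// Hypothetical helper (not part of SingleFile): pick the element to save.
// `doc` is assumed to behave like a DOM Document.
// `selector` is the hypothetical option value, e.g. "div#article".
function selectCaptureRoot(doc, selector) {
    // Option not set: save the entire page.
    if (!selector) {
        return doc.documentElement;
    }
    // querySelector accepts the "tag#id" CSS notation directly.
    const match = doc.querySelector(selector);
    // Option set but nothing matched: fall back to the entire page.
    return match || doc.documentElement;
}
```

In a page this might be called as `selectCaptureRoot(document, "div#article")`, where the selector value is purely illustrative.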

In conjunction with the Profiles and Auto-settings rules, this would help a great deal in day-to-day work on frequently visited websites.

gildas-lormeau commented 4 months ago

Thank you for the suggestion. I'll have to think about it. The problem from my point of view is that if it's an option then the user risks creating a lot of profiles or changing the value of this option very often.

An interesting alternative for this use case is integration with userscripts. By enabling the hidden option userScriptEnabled, see https://github.com/gildas-lormeau/SingleFile/wiki/Hidden-options, you can create a simple userscript that will be adapted to the page to be saved.

Below is an example of such a script, designed to save pages on "Le Monde" such as https://www.lemonde.fr/international/article/2024/02/20/julian-assange-tente-d-obtenir-un-dernier-recours-contre-son-extradition-vers-les-etats-unis_6217460_3210.html

// ==UserScript==
// @name         New Userscript
// @version      2024-02-20
// @author       Gildas
// @match        https://www.lemonde.fr/international/article/*
// @grant        none
// ==/UserScript==

(() => {
    "use strict";

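    // Announce this user script to SingleFile so that it dispatches the capture events.
    dispatchEvent(new CustomEvent("single-file-user-script-init"));

    // Runs just before SingleFile captures the page.
    addEventListener("single-file-on-before-capture-request", () => {
        const element = document.querySelector(".zone--article");
        // Leave the page untouched if the article container is missing.
        if (element) {
            isolateElement(element);
        }
    });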

    function isolateElement(element) {
        const parentNode = element.parentNode;
        Array.from(parentNode.childNodes).forEach(node => {
            if (node != element) {
                node.remove();
            }
        });
        if (parentNode != document.body) {
            isolateElement(parentNode);
        }
    }
})();
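
As a possible variant of the `isolateElement` function above, the same pruning can be written as a loop instead of a recursion. This is a sketch under assumptions rather than code from SingleFile: `isolateElementWithin` and its `stopNode` parameter are hypothetical names, and the function does nothing when the element is missing or detached.

```javascript
// Hypothetical iterative variant of isolateElement (not from SingleFile).
// Removes every sibling of `element` and of each of its ancestors,
// stopping once `stopNode` is reached (document.body in a real page).
function isolateElementWithin(element, stopNode) {
    let current = element;
    while (current && current !== stopNode && current.parentNode) {
        const parentNode = current.parentNode;
        // Copy childNodes first: removing nodes mutates the live list.
        Array.from(parentNode.childNodes).forEach(node => {
            if (node !== current) {
                node.remove();
            }
        });
        current = parentNode;
    }
}
```

In the script above it would be called as `isolateElementWithin(element, document.body)` inside the event listener.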

frkd-dev commented 4 months ago

I had the idea of using userscripts to prepare the page when I click a button, but custom events from SingleFile are even better, as no extra steps are required! I wasn't aware of custom events and hidden options. This covers my needs for now. Thanks for sharing! 👍

Regarding the script: where can I read why the extension needs a dispatchEvent() from the user script?

UPD: found the answer here.