gildas-lormeau / SingleFile

Web Extension for saving a faithful copy of a complete web page in a single HTML file
GNU Affero General Public License v3.0

Is there really no way for SF to open a file from the filesystem or read its content? #720

Open user0022 opened 3 years ago

user0022 commented 3 years ago

Hello!

Browser: Firefox. OS: Lubuntu 18.04.

Browser limitation problem: SF cannot access the filesystem? Recently I read here in the SF issues the developer saying "it's technically impossible for an extension to read the content of a file from the filesystem". This was given as a negative answer and explanation to the feature request "Indicate that a page has already been saved". Or "it's impossible, from a technical point of view, to open the saved file in a tab. Extensions can't open tabs with file:/// URIs.", answered in "(...) close the (...) tab, then open the file saved by singlefile".

Features it could maybe permit. I was disappointed, for I had been hoping for certain features in this addon that in the end relied on this access, which seems to be impossible. Like the 2 just mentioned:

And other ones, like:

Solutions I tried to find (maybe it's 4/ that would work). It's too hard for me to believe that an addon can't perform such elementary tasks, like knowing the addresses of the pages it has itself saved! So I'd just like to try some suggestions; as I'm zero-level, excuse me if the ideas are too stupid. But who knows. I just searched for where or how the URI or the content of a file (URL, last modification date or something else) could be accessible to SF.

1/ To access page content/elements: in Firefox "itself"? I just had the idea (??) that a place in the Firefox "body" (interface?) could be reachable by SF to write and read the URLs or else.

2/ To access page content/elements: in a session manager or blacklist addon way? Addons that save browsing sessions, if I understand correctly, succeed in saving URLs somewhere "that can be read"? (Like MySessions, which chooses to do it in Bookmarks, but others don't.) But I feel that I'm totally wrong. And maybe also, displaying them is one thing, but comparing them and triggering an action after that is something else. Maybe addons that can clean up Bookmarks, e.g. by removing duplicates, can help? It's a bit the same thing with the blacklist addons. And probably other kinds of addons which can save and read URLs. So is it possible to copy the way they work?

3/ To access page content/elements or for URIs opening: remotely?

4/ To access page content/elements or for URIs opening: with additional software? Maybe an optional additional script or program (to download separately?) could read the saved pages and communicate with SF, bypassing Firefox? Or, e.g., detect that a page has been saved and closed by SF and open its saved version. In this case my hypothesis (??) is that although an addon cannot read the user's "local content", an installed program can do the inverse, that is, talk to the addon. If SF is still prevented from communicating with this program because it is installed "locally", maybe the communication between them can be done remotely? (web etc.). Working this way could avoid storing lists like in the previous proposals: the program could just send a signal "already saved" or "same last modification date". I just saw the existence of the "Companion" program. I didn't try it, but could it be the "external software" I thought of? Later addition: does an addon that works this way already exist? I came across the Local Filesystem Links addon, which seems to have done this, adding a program to access the local files. I have the impression that if this access to the filesystem is ever implemented, it will be like in this addon.
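A minimal sketch of what such a helper program could look like, assuming it talks to the extension through the browser's native messaging mechanism, where each message is UTF-8 JSON preceded by a 4-byte length in native byte order. The request shape (`{"path": ...}`), the reply fields and the function names are hypothetical, chosen only for illustration; this is not how SingleFile, the Companion or Local Filesystem Links are actually implemented.

```python
import json
import os
import struct


def encode_message(payload):
    """Frame a payload: 4-byte native-order length, then UTF-8 encoded JSON."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("=I", len(body)) + body


def read_message(stream):
    """Read one framed message from a binary stream; return None at end of input."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)
    return json.loads(stream.read(length).decode("utf-8"))


def handle_request(request):
    """Hypothetical request {"path": "..."}: report whether that file exists on disk."""
    path = request.get("path", "")
    if os.path.isfile(path):
        # The modification time lets the caller compare against a fresh save.
        return {"saved": True, "mtime": os.path.getmtime(path)}
    return {"saved": False}


def serve(stdin, stdout):
    """Answer requests until the browser closes the input stream."""
    while True:
        request = read_message(stdin)
        if request is None:
            break
        stdout.write(encode_message(handle_request(request)))
        stdout.flush()
```

In real use the program would be registered with the browser and would call `serve(sys.stdin.buffer, sys.stdout.buffer)`; the extension side would only ever see the small JSON replies, never the filesystem itself.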

Conclusion. Conclusion is......... suspense.... 0! Nothing works thank you. Go to bed now. :/

user0022 commented 3 years ago

Not sure if this overly long post was worth it. But I edited it to make it clearer, and I added a link to the "Local Filesystem Links" addon at the end, which seems to do what I was talking about!

gildas-lormeau commented 3 years ago

Thank you @user0022, I promise I'll read it conscientiously soon ;)

unphased commented 2 years ago

So, first off, I think the Companion would certainly be able to read the files that it's responsible for saving, being a node script.

I think that means that if Companion-based saving is enabled, it would definitely be capable of providing the feature of not auto-saving indiscriminately, for example by preserving only the most recent version (or the n most recent versions) when re-navigating to the same page. I'll probably implement that at some point, so I'll be sure to make a PR for it.
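A rough sketch of the "keep only the n most recent versions" idea, written in Python for brevity rather than as Companion code. It assumes saved files are named `<title> (<timestamp>).html` so that versions of the same page share a title prefix; that naming scheme and the names `page_key` and `prune_old_versions` are assumptions for illustration only.

```python
import os
from collections import defaultdict


def page_key(filename):
    """Assumed naming scheme '<title> (<timestamp>).html'; the key is the title part."""
    stem, _ = os.path.splitext(filename)
    return stem.rsplit(" (", 1)[0]


def prune_old_versions(directory, keep=1):
    """Delete all but the `keep` most recently modified .html files per page key.

    Returns the list of removed paths.
    """
    groups = defaultdict(list)
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.endswith(".html"):
            groups[page_key(name)].append(path)

    removed = []
    for paths in groups.values():
        # Newest first, so everything past index `keep` is stale.
        paths.sort(key=os.path.getmtime, reverse=True)
        for stale in paths[keep:]:
            os.remove(stale)
            removed.append(stale)
    return removed
```

Grouping by filename is the weakest part of this: keying on the saved page's URL would be more reliable, but that requires reading it out of each file or tracking it separately.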

I haven't tested Companion yet, but I've read the description of how it works and I'm looking at the code now... It looks like it's possible to make it do what I want, which would be to get the savepage content generated by the extension running on the actually-being-used browser, sent to the Companion, which will just shove the content into the target file on disk.

To me, the companion is clearly the way to go because it's rather absurd to have to deal with an endless stream of SingleFile page archives in your browser's download history and download bar. I'm not going to just disable the download bar because I want the regular experience of having that bar when a normal file is downloaded.

I'm not sure if it's possible to get companion set up without requiring any browser backend dependencies, but I'll figure that out.

Sorry for hijacking your issue, although my particular concerns do seem related enough to your questions. Now I think the externalSave approach from the linked code is the behavior I was worried it would use. It is the less-than-ideal one, where a headless backend is used to fetch the content. That's fine in theory for most sites, but any content that requires cookies or login state (or whatever other context) to load wouldn't work in a private headless browser backend instance.

I'll pick this apart more soon once I get into spinning up this automation.

Huge kudos to @gildas-lormeau for this tool and also already having implemented like 95% of the features that I wanted to build for myself. This is going to be a huge game changer.