danny0838 / firefox-scrapbook

ScrapBook X – a legacy Firefox add-on that captures web pages to a local device for future retrieval, organization, annotation, and editing.
Mozilla Public License 2.0
323 stars 65 forks

[Req] (kind of) symbolic link from an item or sub-tree to the filesystem-tree #138

Open pascallothar opened 8 years ago

pascallothar commented 8 years ago

IMHO, this enhancement is the most important one. It could propel ScrapBook X to the zenith :-)


How can I explain the context and the philosophy? Hmmm! Here is the story. It is perhaps a bit long, but I did not find a better way to explain it clearly.


Like many people, I get most of my documents from the Web (I don't say the Internet, but more precisely the Web, i.e. http://), so ScrapBook X is the right tool for me. But before coming across ScrapBook X, I was using multi-file web page archives (a foobar_files directory + a foobar.html file), then later single-file archives (.war from Konqueror, .webarchive from Safari, .mht and later .maff from Firefox).

How can I avoid having two systems and two locations where my documents and non-documents are classified?

Separating files by types (Documents, Images, Music, Video, etc) or by the tool I used to save them (ScrapBook X, wget, Heritrix or other tools) is not an option for me. I want to classify them by topics.

It would not be easy to transfer the large number of documents I gathered previously into the ScrapBook X tree. Furthermore, they are not all HTML. Many are PDF, .doc, .djvu, pictures, etc.; I don't even see a direct way to put those types into the ScrapBook tree (I could create a dummy item, change the content of its directory, edit index.html, index.dat and scrapbook.rdf, but imagine!!). Others are .exe, .zip, .mpeg, etc. So transferring the filesystem-tree to the ScrapBook tree is not conceivable. Actually, ScrapBook was not created for this purpose.

So I have to go the other way: having everything stored and classified in the filesystem-tree. I thought of transferring sub-trees of documents gathered by ScrapBook X to the filesystem-tree, and of using "MAF create" to do that. But there are some limitations and problems (see [https://github.com/danny0838/firefox-scrapbook-maf-creator/issues/3]). There is also a problem when you want to add a small file to an already MAFFed bunch of items, or to edit one of the items inside this bunch.

So the idea of a kind of sym-link came up. I thought of something based on the output of "Copy Page Info". But after a while, I saw that the solution ... already existed: the "HTML tree"! So obvious that I didn't see it!!!

There is no longer any need to archive the items (and to delete them from ScrapBook afterwards so that they do not take up twice the gigabytes). Furthermore, you can move around the 'sym-link', but ALSO the target: even if you move the items around in the tree, their ScrapBook-ID will not change and, because the directory containing an item is named after its ScrapBook-ID, the resources will not move. If you add some items, you will likely find them in a sub-tree built from items already indexed in the 'sym-link' ("HTML tree" capture). So there are lots of advantages.
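The point about stable IDs can be sketched in a few lines. This is only an illustration, not ScrapBook X code: the function name `item_index` is hypothetical, and the layout `<root>/data/<ID>/index.html` is an assumption based on the statement above that an item's directory is named after its ScrapBook-ID.

```python
from pathlib import Path


def item_index(scrapbook_root: str, item_id: str) -> Path:
    """Return the assumed location of an item's index.html.

    Assumption: each item lives in <scrapbook_root>/data/<item_id>/.
    The path depends only on the ID, never on the folder in which the
    item sits in the ScrapBook tree, so moving the item inside the
    tree does not change it.
    """
    if not item_id or "/" in item_id or "\\" in item_id:
        raise ValueError("item_id must be a plain directory name")
    return Path(scrapbook_root) / "data" / item_id / "index.html"


# Hypothetical root and ID, for illustration only.
path = item_index("/home/user/ScrapBook", "20160101120000")
print(path)
```

Under that assumption, a link that stores only the ID keeps working however the item is reclassified in the Sidebar.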

But the process is tedious:

For a single standard item, you first need to "Convert it to Folder". And for this single item ... the process SEEMS even more tedious than going through it for a big sub-tree.

But the result is quite awesome. When you open the 'sym-link' (say "FooBar - ScrapBooklink.maff" or "FooBar--SB-lnk.maff"), you have access not only to the root(s) of the sub-tree(s) (as with a simpler 'sym-link'), but ALSO to all the items (leaves) that were inside it at the moment of the capture. Now, if you want to do some work on an item (editing, highlighting, annotating, etc.), you can open it from the left frame in a NEW TAB. After having "Located" it, you are able, from the ScrapBook X Sidebar, to access its "Properties" (and the "Comments"), but ALSO the items that were ADDED to the sub-tree(s) AFTER the time of the capture.


To make this process less tedious, could you implement some enhancements?

The simplest would be to add a check-box "Create a SB-link" in the Folder picker window (which shows up when clicking "HTML tree ...") that would trigger a transparent 'piping' to the process normally used by "Create MAF".

And also a text field to pass a custom "Title" argument to "Create MAF".

Adding the possibility to choose the output directory would be even better.

If you want to enhance even more, you could add an entry "Create HTML-tree or SB-link" in the contextual menu of the "Manage" window (and of the Sidebar in "Manage" mode). So, instead of picking the Folders from the usual dialog of "HTML tree ...", one could select the Items, Folders, Non-Folder-type-Folders, Note-Pages, Bookmarks, even Separators, etc in the "Manage" window or Sidebar, then Right-Click and choose "Create HTML-tree or SB-link" that would open a dialog with a text field to pass the custom "Title" argument (and possibly the output directory) to the process, plus a radio-button to choose between "Create HTML-tree" and "Create SB-link". (IMHO, it would be possible to do this by reusing (and rewriting a little bit of course) the code of the function which is triggered by the entry "Tools" -> "Move..." of the contextual menu on a SELECTION of Items, Folders, etc of the Sidebar in "Manage" mode. Also, the code of the 'partially' disabled "Copy...".)


Another possibility, which would be more portable (because a single-file .html is portable, .maff much less so), is to replace the line:

in the tedious process described above by the following lines:

jdunn0 commented 8 years ago

I have read your post a few times and I'm not quite sure what you are talking about. There are two things it appears it might be:

  1. Creating a tree.maff file that just contains the ScrapBook tree only and it links to the ScrapBook items in the file system. I have no idea why you would need this as you could just use the existing tree html file the same way.
  2. Creating an everything.maff file that contains the tree html file along with all the ScrapBook items. I'm not sure why you would need this either as the ScrapBook being in a folder instead of one file works just fine.

Are either of those what you are suggesting here?

pascallothar commented 8 years ago

@jdunn0 ,

It seems that my post was not as clear as I thought :-) . At least for you. But then, probably also for other people. I am a French speaker, so that could be part of the reason.


I suggest that you first follow the 'process' I have described, so you can see its result. That will help you understand, and you will see that it cannot be option 2. Actually, option 2 is already implemented by:

Why is that useful? For example, for backing up. Or for people who want to share with people who do not have ScrapBook installed. Or ... ? I don't know the real reasons that led the developers to implement this feature.


  1. Creating a tree.maff file that just contains the ScrapBook tree only and it links to the ScrapBook items in the file system.

Actually, 'everything' is in a filesystem, whether it is a disk filesystem, a virtual filesystem or even a network filesystem, etc. Perhaps you could suggest a better word for 'filesystem-TREE'.

What I mean by 'filesystem-TREE' (vs 'ScrapBook-TREE') is the usual directory hierarchy. For example, inside /users/ you will have, among others, /users/jdunn0/, which will contain /users/jdunn0/MAIN_classification/, which will contain, among others, the 'DIRECTORIES' /users/jdunn0/MAIN_classification/maths/ and /users/jdunn0/MAIN_classification/informatics/. The latter will contain /users/jdunn0/MAIN_classification/informatics/C++/ (containing executables and documentation in .doc, .html and .pdf) and /users/jdunn0/MAIN_classification/informatics/about_Firefox/add-ons/ScrapBook_X/, which will contain documentation in .pdf and .html, but also .xpi files you want to keep for one reason or another.

In the ScrapBook X Sidebar, you have the 'FOLDER' KnowledgeInformaticsProgrammationC++ containing plenty of 'Standard Items', Separators, Bookmarks and Notes, but also sub-Folders. This is what I call a 'sub-tree'.

When you browse your 'filesystem-TREE' and you are in /users/jdunn0/MAIN_classification/informatics/C++/, you would like to have 'something' referencing the 'sub-tree' C++ (KnowledgeInformaticsProgrammationC++) that is in your 'ScrapBook-TREE'. This is what I call a '(kind of) sym-link'.

I have no idea why you would need this as you could just use the existing tree html file the same way.

While writing, ... at this precise moment ... , I suddenly feel stupid :-( . Every time it happens, I am distressed by the fact that it can happen to me :-) .

Since I came across ScrapBook X, I have almost forgotten that I can directly save a page with the usual "Save As ..." of Firefox (vs the ScrapBook X "Capture As ...") !!! Yes, you are right, I can save the output of "HTML tree" with the Firefox add-on "Mozilla Archive Format, with MHT and Faithful Save" [https://addons.mozilla.org/fr/firefox/addon/mozilla-archive-format/]. That makes my request less relevant!!!

But, anyway, the result is not optimal, because this add-on disables JavaScript, which makes the Folders non-expandable/collapsible. So you need to expand the whole "HTML tree" before saving it. Not very bad, but not very good.

Also, the .maff format is not portable. So the end of my post remains relevant.

danny0838 commented 8 years ago

When the tree is output in frame mode and you have selected an item to show it in the right frame, the hash of the item is appended to the page URL. You can use that URL to link to the frame page with the item opened directly. You can create a link in the filesystem using that URL to point to the specific SB item.
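One way to create such a link in the filesystem is a tiny HTML file that redirects to the frame URL. The following is a minimal sketch under stated assumptions, not part of ScrapBook X: the function name `write_link_file`, the file names, and the example URL (including its hash) are all hypothetical; the real URL should be copied from the browser's address bar after selecting the item.

```python
import html
from pathlib import Path


def write_link_file(target_url: str, link_path: str,
                    title: str = "ScrapBook link") -> Path:
    """Write a small HTML page that redirects to target_url.

    The page uses a meta refresh plus a plain link as a fallback,
    so it can be opened from any directory of the filesystem-tree.
    """
    url = html.escape(target_url, quote=True)
    label = html.escape(title)
    content = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">\n'
        f"<title>{label}</title>\n"
        f'<meta http-equiv="refresh" content="0; url={url}">\n'
        "</head><body>\n"
        f'<p><a href="{url}">{label}</a></p>\n'
        "</body></html>\n"
    )
    path = Path(link_path)
    path.write_text(content, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical URL: copy the real one, hash included, from the
    # address bar of the frame-mode tree page.
    write_link_file(
        "file:///home/user/ScrapBook/tree/frame.html#example-item",
        "C++--SB-lnk.html",
        title="C++ (ScrapBook sub-tree)",
    )
```

Because the link file is plain HTML, it stays readable on machines without the MAFF add-on, though whether the browser follows a redirect to a local file may depend on its security settings.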

On the other hand, to import files from the OS into ScrapBook, you can use the "convert HTML and files to ScrapBook data" mode in the SBX Converter.

pascallothar commented 8 years ago

It is not what I was talking about.

danny0838 commented 8 years ago

What is the difference?