danny0838 / firefox-scrapbook-maf-creator

GNU Lesser General Public License v2.1

[Req] Improving the "Multiple items MAFF creation" behavior #3

Open pascallothar opened 8 years ago

pascallothar commented 8 years ago

The .maff format is, in my humble opinion, the best choice for a one-file archiving format for web pages (even though it is not yet very widespread), because if you rename its extension to .zip, you will always be able to access it with any browser. And I am sure that with a little scripting, we could get every browser to read it (creating is perhaps more difficult to achieve than reading). Anyway.
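As a small illustration of the "little scripting" idea, here is a minimal sketch in Python that lists what is stored in a .maff file. It relies only on the fact that a .maff is a ZIP container; the function name `list_maff_entries` is made up for this example.

```python
import zipfile


def list_maff_entries(maff_path):
    """Return the names of all files stored in a .maff archive."""
    # A .maff file is an ordinary ZIP container, so it can be opened
    # directly, without renaming the extension first.
    with zipfile.ZipFile(maff_path) as archive:
        return archive.namelist()
```

Calling `list_maff_entries("page.maff")` on an archive would typically show one top-level directory per saved page, with its index.html and index.rdf inside.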

I have analyzed the different features of "Scrapbook X" and of the add-ons "MAF Creator", "File Converter" and even "CopyPageInfo". They are great, but none of these features gives me total satisfaction. With a few small tweaks, though, they would be awesome.




It would also be great to be able to access the "Properties" (including the "Comments") from the "Output Tree as HTML..." or from its equivalent that is integrated in the .maff (for the moment, only the source is accessible via the green arrows).

These "Properties" are saved in the index.dat of each of the saved items inside the .maff, but are not accessible.

The "Properties" of the Separators, of the Bookmarks and of the Folders (sub-folders) are not saved in the .maff created by right-clicking a folder and "Create MAF", so, especially, the Comments will be lost. Neither are the "Properties" of the Simple Notes.


When you capture the output of "Output Tree as HTML...", the folder_open.png icon doesn't show up, making the indentation difficult to follow (EDIT: This behaviour doesn't appear with "Convert current Scrapbook to .maff").

Also a different color for open_dir and close_dir would make the "reading" better.



Of course, one needs to store the properties in an index.dat file to be compatible with legacy Scrapbook. On the other hand, the .maff format stores some properties in an index.rdf file. So I was wondering whether, to make things consistent and easy to manage, it would be a good idea to duplicate the metadata stored in the index.dat of each item in an index.rdf inside each "Scrapbook ID" directory (those named like 20160703235959), using "private" ScrapbookX xml tags. Then one (or a process) could zip one of those "Scrapbook ID" directories, change its extension from .zip to .maff, and immediately and easily have all the metadata and properties saved in it.
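To make the idea more concrete, here is a rough sketch in Python of such a duplication. It is only an illustration under assumptions: it assumes index.dat holds one `key<TAB>value` pair per line, the `SBX` namespace URI is a placeholder I made up (not an existing schema), and the attribute style simply imitates the `<MAF:title RDF:resource="..."/>` form found in MAF's index.rdf.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace for the "private" ScrapbookX tags (placeholder only).
SBX_NS = "http://example.org/scrapbookx-private#"


def parse_index_dat(text):
    """Parse index.dat, assumed to hold one 'key<TAB>value' pair per line."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep and key:  # skip blank or malformed lines
            props[key] = value
    return props


def props_to_rdf(props):
    """Serialize the properties as RDF/XML, one SBX element per property."""
    ET.register_namespace("RDF", RDF_NS)
    ET.register_namespace("SBX", SBX_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": "urn:root"}
    )
    for key, value in props.items():
        # Store each value in an RDF:resource attribute, like the MAF tags do.
        ET.SubElement(desc, f"{{{SBX_NS}}}{key}", {f"{{{RDF_NS}}}resource": value})
    return ET.tostring(root, encoding="unicode")
```

A process could run `props_to_rdf(parse_index_dat(...))` on each item's index.dat and write the result next to it; merging with an index.rdf that MAF already generated is not handled here.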

If it is possible (probably not) to create inside Scrapbook/data a "Scrapbook ID" directory for each of the Separators, Bookmarks or Folders, could you put an index.rdf inside with the "Properties" (copying them from ScrapBook/scrapbook.rdf)? If that is not possible, could you create a "Scrapbook ID" .rdf (something like 20160703235959_separator_index.rdf, 20160703235959_bookmark_index.rdf, 20160703235959_folder_index.rdf)?

Would it make "Mark" and "Lock" for Folders, Bookmarks, Separators, Notes and Note Pages easier to implement (for the moment, it is not implemented)? By the way, would it be difficult to change the bold to, for example, bold italic for the marked items (for readability)?

So, by reading those index.rdf (with a proper .xsl), one could access the "Properties" (especially the Comments) from inside the .maff archive without Scrapbook installed (and by unzipping the .maff, even without Firefox). And so, it would be easy to share.

And (I don't know whether it would be efficient or not) why not split the central, global cache.rdf into a set of cache.rdf files, one per item? Would it be a better way to keep the cache updated? It would, of course, make the .maff bigger, but also searchable with the ScrapBook/search.html file saved inside the .maff (and one could actually choose, for example with a checkbox, whether or not to save the cache and the search function inside the .maff).


Which kind of display for the "Properties"? Banner on the top of the right frame or on the top of a new tab? Tooltips? Pop-ups? In the full right frame? In a third frame? What is important is that you would be able to select the text in it, especially a part of the Comment, to copy it in the clipboard.



++++++++++++++++++++++++++++++++++++++++++++



Now I will describe a procedure I tried, which works even if it is "dirty and ugly". Perhaps it could give you some ideas, even though I think you don't need me for that :-) .



"Output Tree as HTML..." -> Selecting a Folder -> checking "Output with frame" and "When the process done, open the HTML file", pressing Start.

"Capture as ..." (with "Scripts" checked) the resulting page.

In the Scrapbook X Sidebar, moving the newly captured item in the formerly chosen Folder.

Checking "Store multiple sites in one archive" in "MAF Creator Options ..."

Selecting this Folder in the Sidebar -> "create MAF"

Unzipping the created .maff

Creating directories to_be_ZIPped/ , to_be_ZIPped/data/ , to_be_ZIPped/tree/ , to_be_ZIPped/data_MAFFed/

Moving the content of the directory with biggest "Scrapbook ID" (because it is the date of the last capture, meaning the capture of "Output Tree as HTML...") into to_be_ZIPped/tree/

Moving all the other "Scrapbook ID" directories into to_be_ZIPped/data/
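The directory shuffling described so far could be scripted roughly like this in Python. It is a sketch under assumptions: the unzipped .maff is expected to contain only 14-digit "Scrapbook ID" directories at its top level, and the function name `stage_unzipped_maff` is invented for the example.

```python
import re
import shutil
from pathlib import Path

# "Scrapbook ID" directories are named like 20160703235959 (14 digits).
ID_PATTERN = re.compile(r"\d{14}")


def stage_unzipped_maff(unzipped_dir, staging_dir):
    """Sort the ID directories of an unzipped .maff into tree/ and data/."""
    unzipped, staging = Path(unzipped_dir), Path(staging_dir)
    for name in ("data", "tree", "data_MAFFed"):
        (staging / name).mkdir(parents=True, exist_ok=True)

    ids = sorted(
        p for p in unzipped.iterdir()
        if p.is_dir() and ID_PATTERN.fullmatch(p.name)
    )
    if not ids:
        raise ValueError("no Scrapbook ID directories found")

    # The biggest ID is the last capture, i.e. the captured tree output.
    newest = ids.pop()
    for child in list(newest.iterdir()):
        shutil.move(str(child), str(staging / "tree" / child.name))
    # Every other ID directory goes into data/ unchanged.
    for item in ids:
        shutil.move(str(item), str(staging / "data" / item.name))
    return newest.name
```

The function returns the ID that was treated as the tree capture, so it can be double-checked by hand before continuing.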

(The following is for experimenting with the behaviour of .maff files archived INSIDE another .maff.)

Putting the icons document_properties.png and MAFF.png inside to_be_ZIPped/tree/

Copying ScrapBook/tree/folder_open.png and ScrapBook/tree/index.css to to_be_ZIPped/tree/folder_open.png and to_be_ZIPped/tree/index.css

Copying to_be_ZIPped/tree/index.rdf to to_be_ZIPped/index.rdf

Inside to_be_ZIPped/index.rdf, editing the xml tag <MAF:title RDF:resource="blablabla"/>

Creating to_be_ZIPped/index.html with the following content:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <a href="tree/index.html">
    HTML tree view of<br>
    <i><b>The Folder from which you have chosen to create a MAF</b></i><br>
    saved as "Scrapbook ID" directories<br>
    and also saved as .maff archives,<br>
    with index.dat, index.rdf<br>
    and clickable URL of the Source page.
    </a>
  </body>
</html>

Editing to_be_ZIPped/tree/index_1.html with TextWrangler:

In TextWrangler, "Search" -> "Find..." -> "Grep" checkbox checked

Find: file:///Users/pascallothar/Library/Application%20Support/Firefox/Profiles/foo123bar\.default/ScrapBook/data/(\d\d\d\d\d\d\d\d\d\d\d\d\d\d)/index\.html" target="main"(.+)</a> <a
Replace: ../data/\1/index.html" target="main"\2</a> <a href="../data/\1/index.dat" target="main" title="index.dat"><img src="document_properties.png" alt="" height="16" width="16">index.dat</a> <a href="../data/\1/index.rdf" target="main" title="index.rdf"><img src="document_properties.png" alt="" height="16" width="16">index.rdf</a> <a href="../data_MAFFed/\1.maff" target="main" title="MAFF file"><img src="MAFF.png" alt="" height="16" width="16">.maff</a> <a

Then,
Find: file:///Users/pascallothar/Library/Application%20Support/Firefox/Profiles/foo123bar\.default/ScrapBook/data/(\d\d\d\d\d\d\d\d\d\d\d\d\d\d)/index\.html" target="main"(.+)</a>
Replace: ../data/\1/index.html" target="main"\2</a> <a href="../data/\1/index.dat" target="main" title="index.dat"><img src="document_properties.png" alt="" height="16" width="16">index.dat</a> <a href="../data/\1/index.rdf" target="main" title="index.rdf"><img src="document_properties.png" alt="" height="16" width="16">index.rdf</a> <a href="../data_MAFFed/\1.maff" target="main" title="MAFF file"><img src="MAFF.png" alt="" height="16" width="16">.maff</a>

Then,
Find: <a href="file:///Users/pascallothar/Library/Application%20Support/Firefox/Profiles/foo123bar\.default/ScrapBook/search.html"><img src="search.png" alt="" height="12" width="18"></a>
Replace: <a href="file:///Users/pascallothar/Library/Application%20Support/Firefox/Profiles/foo123bar\.default/ScrapBook/search.html"><img src="search.png" alt="" height="12" width="18"></a> <a href="../search.html"><img src="search.png" alt="" height="12" width="18"></a>
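For anyone without TextWrangler, the item-link rewrite could be approximated in Python as below. This is a simplified sketch, not the exact replacements above: it does a single pass with a non-greedy match, leaves out the icon `<img>` tags, and takes the profile-specific `file:///.../ScrapBook/data/` prefix as a parameter. The function name `relativize_links` is made up.

```python
import re


def relativize_links(html, data_url_prefix):
    """Rewrite absolute item links to ../data/<ID>/ and append metadata links."""
    pattern = re.compile(
        re.escape(data_url_prefix)
        + r'(\d{14})/index\.html" target="main"(.+?)</a>'
    )
    # \1 is the 14-digit Scrapbook ID, \2 is the rest of the original link.
    replacement = (
        r'../data/\1/index.html" target="main"\2</a>'
        r' <a href="../data/\1/index.dat" target="main" title="index.dat">index.dat</a>'
        r' <a href="../data/\1/index.rdf" target="main" title="index.rdf">index.rdf</a>'
        r' <a href="../data_MAFFed/\1.maff" target="main" title="MAFF file">.maff</a>'
    )
    return pattern.sub(replacement, html)
```

It would be applied to the text of to_be_ZIPped/tree/index_1.html, with the result written back to the same file.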

"Update Search for Full Text Search", then copying ScrapBook/cache.rdf , ScrapBook/scrapbook.rdf and ScrapBook/search.html to to_be_ZIPped/cache.rdf , to_be_ZIPped/scrapbook.rdf and to_be_ZIPped/search.html (The full cache will be copied, but this is only for testing purposes!).



Opening to_be_ZIPped/index.html in a browser, not necessarily Firefox.

Now you can search using the magnifying glass on the right. Of course, most of the results will lead to a "file not found" (because it is searching in a copy of the full cache), but the results corresponding to the items actually saved by "Create MAF" will point to the corresponding item in ScrapBook/data/ (not in to_be_ZIPped/data/, which is what I would have implemented if I were more skillful). (EDIT: I see only now that you have implemented this in "Convert current Scrapbook to .maff").

Now try to open in new tabs: an item, its index.dat, its index.rdf, its .maff and its Source URL. Then open them in the right frame (the Source URL will actually open in a new window). You will notice that when you open a .maff link in a new tab, the usual MAF banner is displayed.


Zipping to_be_ZIPped , renaming to_be_ZIPped.zip to to_be_ZIPped.maff

Opening to_be_ZIPped.maff with Firefox. It will take more time to load, but it works the same.
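The zip-and-rename step can also be done in one go. Here is a minimal Python sketch, assuming the staging folder should end up as the single top-level directory inside the archive (which is what zipping the folder by hand gives); the function name `zip_as_maff` is invented for this example.

```python
import zipfile
from pathlib import Path


def zip_as_maff(source_dir, maff_path):
    """Zip source_dir (including its top-level folder name) into maff_path."""
    source = Path(source_dir)
    with zipfile.ZipFile(maff_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Keep the folder itself as the archive's top-level entry.
                archive.write(path, path.relative_to(source.parent).as_posix())
    return maff_path
```

For example, `zip_as_maff("to_be_ZIPped", "to_be_ZIPped.maff")` writes the archive directly with the .maff extension, so no renaming is needed afterwards.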



My "test" doesn't address the problem of the properties of the Bookmarks, Folders, Separators and Notes. And, as I said above, it is "dirty" and "ugly" :-)