jarun / buku

:bookmark: Personal mini-web in text
GNU General Public License v3.0

Multiple URLs per bookmark / Alternative URLs for bookmarks #720

Open piegamesde opened 8 months ago

piegamesde commented 8 months ago

Sometimes an internet resource is not unique and can be reached in different ways. Decentralized resources in particular tend to have multiple mirrors or entry points. This could also be used to provide links to archived versions of web pages.

I know I could always put the links into the free-form description field*, but I'd like to explore whether more structured approaches with this feature as a first-class citizen would be possible.

One question that would need to be decided in that context is whether the other URLs should be of equal importance, or whether there should be one "primary" URL with "mirrors". This decision probably also depends on the details of the database format and backwards compatibility considerations.

* Hacky implementation:

buku --nostdin --print --json | jq --raw-output '.[] | [ .uri, (.description | split("\r\n") | .[] | scan("^mirror: (.*)$")) ] | flatten | .[]'

This prints all regular URIs, plus the URL from any description line starting with "mirror: ".
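For readers who would rather avoid jq, the same idea can be sketched in Python. This is only an illustration of the workaround above, assuming the JSON records carry the `uri` and `description` keys that the jq filter reads; the function name `urls_with_mirrors` and the sample data are made up for this example.

```python
import json
import re
from typing import List

# Matches a description line of the form "mirror: <url>", as in the jq filter.
MIRROR_LINE = re.compile(r"^mirror: (.*)$")


def urls_with_mirrors(buku_json: str) -> List[str]:
    """Return each bookmark URI followed by the URLs from its "mirror: " lines."""
    urls = []
    for record in json.loads(buku_json):
        urls.append(record["uri"])
        # splitlines() accepts "\r\n" as well as "\n"; the jq filter splits on "\r\n" only.
        for line in (record.get("description") or "").splitlines():
            match = MIRROR_LINE.match(line)
            if match:
                urls.append(match.group(1))
    return urls


# Made-up sample in the shape the jq filter expects (not real buku output).
sample = json.dumps([
    {"uri": "https://example.org/a",
     "description": "notes\r\nmirror: https://mirror.example.net/a"},
    {"uri": "https://example.org/b", "description": ""},
])
print("\n".join(urls_with_mirrors(sample)))
```

To use it on real data, the text passed to `urls_with_mirrors` would be the output of `buku --nostdin --print --json` instead of the sample.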

LeXofLeviafan commented 8 months ago

One question that would need to be decided in that context is whether the other URLs should be of equal importance, or whether there should be one "primary" URL with "mirrors". This decision probably also depends on the details of the database format and backwards compatibility considerations.

Buku uses an SQLite database (and therefore a fixed set of columns for storing data), with a uniqueness constraint on the url column; furthermore, it rearranges records on deletion, making URLs the only reliable way of identifying records (…as long as you do not explicitly modify the URLs, that is). Therefore, having a single record with multiple "primary" (same-importance) URLs is not really an option.


…I've had some thoughts about implementing support for “extending” records with customizable extra fields, but those would be stored in a separate table (likely requiring another table for schemas), and I can't promise they'd be as easily accessible as the regular (“static”) fields (meaning, no guarantees that these fields would be searchable or that such a search would be performant). Though at the level of printing out existing records, it should still be possible :thinking:

For the time being, for personal use, your hacky workaround should work just fine I'd say. Though it's certainly more convenient to place it into a script within $PATH.

sjehuda commented 6 months ago

On such occasions, I set the same tags.

1. Arav's dwelling / Article / How to move a root from SD card to external drive on Raspberry Pi [152317]
   > http://arav.i2p/stuff/article/rpi_root_on_external_drive
   # tutorial:partition,tutorial:raspberry-pi,tutorial:usb

2. Arav's dwelling / Article / How to move a root from SD card to external drive on Raspberry Pi [152332]
   > https://arav.top/stuff/article/rpi_root_on_external_drive
   # tutorial:partition,tutorial:raspberry-pi,tutorial:usb

3. Arav's dwelling / Article / How to move a root from SD card to external drive on Raspberry Pi [161690]
   > https://5.227.208.129/stuff/article/rpi_root_on_external_drive
   # tutorial:partition,tutorial:raspberry-pi,tutorial:usb

Buku uses SQLite database (and therefore a fixed set of columns for storing data), with uniqueness constraint on the url column;

You could use a separate table for URLs, with each row linked to the ID of an entry in the entries table.

Then again, an entry is represented by its URL, so this might puzzle people who want to understand the buku database; on the other hand, that can be covered in the buku documentation.

In one of my syndication projects I use a table entries_properties_links for links, in which each link is connected to its respective entry in entries_properties; the same goes for the tables entries_properties_authors and entries_properties_contents, because these properties might have multiple values. The main link, title, type and other properties that are unique are stored in entries_properties.
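The layout described above could look roughly like the following. The table names `entries_properties` and `entries_properties_links` come from the comment; every column name and all sample data are assumptions made for this sketch, not the actual schema of that project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Column names are hypothetical; only the table names appear in the thread.
conn.executescript("""
CREATE TABLE entries_properties (
    id    INTEGER PRIMARY KEY,
    link  TEXT NOT NULL,   -- the main link of the entry
    title TEXT
);
CREATE TABLE entries_properties_links (
    id       INTEGER PRIMARY KEY,
    entry_id INTEGER NOT NULL
             REFERENCES entries_properties(id) ON DELETE CASCADE,
    url      TEXT NOT NULL  -- one additional link per row
);
""")

entry_id = conn.execute(
    "INSERT INTO entries_properties (link, title) VALUES (?, ?)",
    ("https://example.org/post", "Example post"),
).lastrowid
conn.executemany(
    "INSERT INTO entries_properties_links (entry_id, url) VALUES (?, ?)",
    [
        (entry_id, "https://mirror-1.example.net/post"),
        (entry_id, "https://mirror-2.example.net/post"),
    ],
)

# All additional links that belong to one entry, in insertion order.
links = [
    row[0]
    for row in conn.execute(
        "SELECT url FROM entries_properties_links WHERE entry_id = ? ORDER BY id",
        (entry_id,),
    )
]
print(links)
```

Note that this variant relies on the entry ID being stable, which is the point questioned in the replies below.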

@piegamesde if your proposal is implemented, then what data would be the unique data for an entry?

The only unique data that I can think of would be the entry ID, which is set by buku.


In XMPP, there is XEP-0209: Metacontacts, which covers the case where an XMPP contact has additional contact information such as Email, IRC, TEL (for whoever still uses telephony), or even further XMPP addresses (in case the contact holds more than one XMPP account).

So we might learn a way of doing this by looking into XEP-0209.

LeXofLeviafan commented 6 months ago

@sjehuda As I said, the URL is the only reliable identifier of a record here. The "ID" is nothing more than an incidental ordering index (which is not an essential part of the data and can change at any time).

For all intents and purposes, URL is the identifier of a record here.

sjehuda commented 6 months ago

I agree.

piegamesde commented 6 months ago

@piegamesde if your proposal is implemented, then what data would be the unique data for an entry?

After learning how the database works, I'd propose keeping one URL as the primary key and storing the others separately as "mirrors".
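One way this proposal could be modelled is sketched below with Python's `sqlite3` module. This is not Buku's schema or a planned implementation; the `bookmarks` and `mirrors` tables, their columns, and the `resolve` helper are all hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascades below

# Hypothetical schema: the primary URL identifies the record,
# and each mirror row points back at it.
conn.executescript("""
CREATE TABLE bookmarks (
    url   TEXT PRIMARY KEY,
    title TEXT
);
CREATE TABLE mirrors (
    mirror_url  TEXT PRIMARY KEY,
    primary_url TEXT NOT NULL
                REFERENCES bookmarks(url)
                ON UPDATE CASCADE ON DELETE CASCADE
);
""")

conn.execute(
    "INSERT INTO bookmarks (url, title) VALUES (?, ?)",
    ("https://example.org/article", "Example article"),
)
conn.execute(
    "INSERT INTO mirrors (mirror_url, primary_url) VALUES (?, ?)",
    ("https://mirror.example.net/article", "https://example.org/article"),
)


def resolve(url):
    """Map a primary or mirror URL to the primary URL of its record, or None."""
    row = conn.execute(
        "SELECT url FROM bookmarks WHERE url = ?"
        " UNION SELECT primary_url FROM mirrors WHERE mirror_url = ?",
        (url, url),
    ).fetchone()
    return row[0] if row else None


print(resolve("https://mirror.example.net/article"))
print(resolve("https://example.org/article"))
print(resolve("https://unknown.example/"))
```

In this sketch the uniqueness of the primary URL is preserved, and `ON UPDATE CASCADE` keeps mirror rows attached if the primary URL is edited.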

sjehuda commented 3 months ago

We might want to imitate the "rel" attribute of the Atom Syndication Format.

4.2.7.2.  The "rel" Attribute

   atom:link elements MAY have a "rel" attribute that indicates the link
   relation type.  If the "rel" attribute is not present, the link
   element MUST be interpreted as if the link relation type is
   "alternate".

   The value of "rel" MUST be a string that is non-empty and matches
   either the "isegment-nz-nc" or the "IRI" production in [RFC3987].
   Note that use of a relative reference other than a simple name is not
   allowed.  If a name is given, implementations MUST consider the link
   relation type equivalent to the same name registered within the IANA