qutebrowser / qutebrowser

A keyboard-driven, vim-like browser based on Python and Qt.
https://www.qutebrowser.org/
GNU General Public License v3.0

Merge quickmarks and bookmarks / making bookmarks more powerful #882

Open The-Compiler opened 9 years ago

The-Compiler commented 9 years ago

See #840 - the difference between quickmarks and bookmarks also is confusing for users sometimes, so it might make sense to merge them into one. Basically, simply having bookmarks with an optional name/label, and when that is set, they're "quickmarks".

edit: The new command should also make sure the URL actually is a URL.

ghost commented 9 years ago

Please do not :-) Quickmarks work well with multiple characters, as in dwb: B + QM = very fast. The only problem with bookmarks is that they cannot be searched. They also do not need a shortcut. I think keeping the division, as in Vimperator or Dactyl, is a good approach.

lamarpavel commented 9 years ago

@sdoubleyou I don't understand the problem: if bookmarks and quickmarks were merged, the resulting marks would be searchable and could optionally have a keyword just like a quickmark. You wouldn't lose anything.

The-Compiler commented 9 years ago

Just as @lamarpavel said, this is mainly about the way marks are stored on disk - I intend the user-interface to stay the same more or less, except for :open optionally also opening marks as explained in #840.

What I'm a bit worried about is that the format most likely has to change, as I'm not 100% happy with the directory format I proposed for bookmarks, and the <label> <url> (quickmarks) and <url> <title> formats are far too inflexible to add new information...

What about using yaml, just like the session files? This would mean something like this:

- url: http://www.qutebrowser.org/
  title: qutebrowser
  label: git
- url: http://www.example.org/
  title: Example Page
...

And could also be easily changed to allow subfolders later:

- projects:
  - url: http://www.qutebrowser.org/
    title: qutebrowser
    label: git
  - url: http://www.herbstluftwm.org/
    title: herbstluftwm
- url: http://www.example.org/
  title: Example Page
...

However, it clearly loses the "very easy to write by hand, modify by a shell script, etc." property of the old format...

@Emdek also proposed using xbel which is a standard format for bookmarks, but meh... it's XML :wink:

What do you think?

The-Compiler commented 9 years ago

Oh - @antoyo, what's your take on this? I know I was the one who proposed the folder format, but now I'm thinking this would be better in the long run.

antoyo commented 9 years ago

While the YAML format is more flexible, it is not compatible with the format used by dwb for bookmarks. But the compatibility will probably be broken anyway when support for folders is added. And, if needed, we could add a way to import bookmarks from the old (dwb) format. Moreover, we'll need a way to migrate from the old format if we change.

arp242 commented 9 years ago

For what it's worth, I often just edit my quickmarks file directly. I like having it in a separate file for the same reason I like having plain text rather than (for example) sqlite (even though sqlite has some things going for it): it enhances what I call the unixability, that means, editing it without frills with normal unix tools (grep, sed, vi, etc.).

lamarpavel commented 9 years ago

@Carpetsmoker I'm with you on this, especially since qute makes it so easy to bind editing one of the text files to a key (eg spawn $TERM -e 'nvim ~/.config/qutebrowser/quickmarks.conf').

I imagine that when merging quickmarks and bookmarks we would have the same syntax for both in a text file and that former quickmarks are bookmarks with an appendix such as --quick <keyword>. Such a format would still be easily editable in a Unix friendly manner, no?

The-Compiler commented 9 years ago

That would work - but what when you want to add something like labels then? Or have subfolders? Or store a title with quickmarks as well? Having all that in a single line sounds like a very inflexible format...

I'm not sure what to do - I can see both sides. I'd like to be able to easily extend the format in the future without worrying about backwards compatibility (especially for bookmarks), but I'd also like to have a format with much "unixability" (I like that term!)

lamarpavel commented 9 years ago

Quick idea: Have a folder marks in which there is one text file for each bookmark (with the option for categorizing them in nested subfolders). Each file could have an easily parseable format with one option per line:

~/.config/qutebrowser/marks/qute-repo
url: https://github.com/The-Compiler/qutebrowser
tags: dev,code
key: qb

Here the name of the bookmark would be the file name, url and tags should be clear and key is the quickmark string. Few things are "more Unix" than using the filesystem in clever ways, right? Performance-wise it should be fine since fs-tree is being cached by the OS after first access, searching through cached file system hierarchies is something many Unix tools rely on and support and it allows little room for ambiguity.
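A minimal Python sketch of how such a per-mark file could be read, assuming exactly the layout shown above (one "key: value" pair per line, comma-separated tags, file name as bookmark name). The function name parse_mark_file is hypothetical and not part of qutebrowser:

```python
def parse_mark_file(name, text):
    """Parse the body of one mark file; `name` is the file name (hypothetical helper)."""
    mark = {"name": name}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so the "https://" in the URL survives.
        key, _, value = line.partition(":")
        mark[key.strip()] = value.strip()
    # Tags are comma-separated in the example above; a missing line means no tags.
    mark["tags"] = [t.strip() for t in mark.get("tags", "").split(",") if t.strip()]
    return mark


example = parse_mark_file(
    "qute-repo",
    "url: https://github.com/The-Compiler/qutebrowser\ntags: dev,code\nkey: qb\n",
)
```

Reading a whole marks folder would then be one such call per file, which is where the question of scaling to many files comes in.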

Thoughts?

The-Compiler commented 9 years ago

I have to think about this - I had a similar idea in the discussion with @antoyo in #681, but with only one marks file in every directory.

As a bonus, this would even be valid YAML - and if you use tags: [dev, code] it'd even be a real yaml list.

Some points I'll have to think about:

One possibility would be to have a name: ... line in the file, and then only have some kind of slug of the URL (or name?) as filename (plus numbers if there are conflicts), but ignore the filename entirely when reading.

claudehohl commented 9 years ago

How does this scale to thousands of bookmarks? I remember @claudehohl probably has something around this number.

That obviously doesn't scale. Chromium has one Bookmarks file, a JSON. Load it into RAM for full speed ;) And I have ~2700 bookmarks (never delete any, it builds up year after year)... How does Python scale in reading 100k files? Quickly forget about that idea of having one file per bookmark. Bookmark added --> minus one inode, 4k storage wasted. Who wants to edit bookmarks via shell script anyway? I prefer the browsing experience, speed and efficiency. For the rest, there are importers and Python scripts.

bb010g commented 9 years ago

Looking at what @claudehohl has, as weird as it can be, SQLite may be our best choice here. It can deal with tons of data and won't kill the file system.

claudehohl commented 9 years ago

(and now everybody): "SQLite! SQLite! SQLite!"

The-Compiler commented 9 years ago

Just had a quick discussion about it in IRC:

20:15 <Emdek> The-Compiler: someone suggested SQLite as storage, I'm currently using it for history and I would like to kill it
20:21 <The-Compiler> Emdek: why so?
20:21 <The-Compiler> (and bb010g did)
20:24 <Emdek> extra dependency, journal file causes issues on some file systems, some users report performance issues (rare)
20:25 <iggy> extra dep?
20:25 <The-Compiler> no extra dependency with python :P
20:26 <Emdek> yes, extra dependency, requires Qt module and driver
20:29 <The-Compiler> https://docs.python.org/3/library/sqlite3.html in my case - but I'd like to keep it plain text anyways
20:42 <bb010g> Emdek: Could you elaborate on the journal file issues?
20:43 <Emdek> sorry, I have beta to release... there was some ticket in issue tracker of Otter where someone was complaining about it
20:44 <bb010g> The-Compiler: I'd like to do YAML too, but scaling flat files is pretty dang hard.
20:45 <The-Compiler> bb010g: I dunno, it works pretty well for 80'000 history entries here
20:47 <Emdek> SQLite needs to load file into memory anyway

tex commented 9 years ago

Please leave bookmarks on disk in format editable in vim :-)

Ambrevar commented 8 years ago

I believe we should stick to a one-liner format. Look at dwb quickmarks: it has a URI, a keyword and a name.

vyp commented 8 years ago

I don't mind YAML, but I'm a -1 on SQLite. Plain text bookmarks have the huge advantage (imo) of being able to be tracked via git. And as @Ambrevar says even 10'000 items should not be a problem to search, right?

lamarpavel commented 8 years ago

Having had some time to think about it I agree that one file per mark is not a good way to go about it. For the reasons mentioned several times here I am also against any form of locked in format such as SQL.

YAML or JSON is kind of okay, but I would much prefer formats that are concise and that can be easily parsed with grep, awk, ag etc.

I suggest taking a look at the format of ctags/exuberant tags for inspiration.

The-Compiler commented 8 years ago

I also think either YAML or one-liners are the way to go.

As for the information possibly required:

tex commented 8 years ago

@The-Compiler tags are very important for me, description is handy

bb010g commented 8 years ago

Don't forget shyaml (for YAML) and the excellent jq (for JSON, but pair it with yaml2json or something similar) for command line work.

fiete201 commented 8 years ago

I think YAML looks good. As for the format @The-Compiler showed, where every bookmark takes 4 or 5 lines: I see no problem in telling awk to start at the second line and print every 5th line after that, e.g. to get the URLs and sort them. Or just print every fourth and fifth line to get "url" "tag" pairs again and sort those. One-liners for Unix tools are probably needed less often than the bookmarks are used inside the browser, so one extra step to construct the one-liner format should be OK. At least for me.

sudo-nice commented 8 years ago

I need tags too, please keep it.

cwmke commented 8 years ago

A third for keeping tags. When you start to get into hundreds or thousands of bookmarks, they make it very simple to find related items quickly. Folders, I guess, can serve a similar purpose although I find tags tend to be quicker.

antoyo commented 8 years ago

I'm starting to see the need for tags too, as I get more bookmarks. Also, it would be nice to be able to filter bookmarks by tags in :open with something like:

:open [my-tag1][my-tag2]

to see only bookmarks with some tags.

lucc commented 8 years ago

One question: What are you referring to as tags? I know them from Firefox but don't see something similar being mentioned in the docs of QB. As far as I understand it there is only a URL and some text (the title) available for each bookmark.

An idea for the file format: Keep it in a line oriented format. You can also define a strict syntax for it, for example tab delimited fields with tab chars not allowed inside fields to remove the need for quoting. This would also be extendable for future fields. And it is very unix tools friendly and very similar to system files on unix: /etc/{fs,cron,crypt}tab /etc/{passwd,group,shadow}.

An idea for tags folders and completion: If we make the completion "fuzzy" in the sense that it does not care for the order of input words (so completion on "foo bar" will give the same results as completion on "bar foo") we actually should already have folders and tags implicitly. The idea is that user can edit the "title" in the bookmarks/urls file to include any text they like and think of these titles as paths or tags or titles. I could for example have this urls file:

http://example.com some website
http://homepage.net /private/homepage My homepage
https://github.com/The-Compiler/qutebrowser coding python github webbrowser

For the browser these are all just urls with titles. But if the completion is word-order-insensitive I can think of these as titles, paths/folders or tags however I like.
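To make the word-order-insensitive idea concrete, here is a small Python sketch using the three example lines above. It is only an illustration of the matching rule, not qutebrowser's completion code; the names matches, complete and MARKS are made up for the example:

```python
# (url, title) pairs taken from the example urls file above.
MARKS = [
    ("http://example.com", "some website"),
    ("http://homepage.net", "/private/homepage My homepage"),
    ("https://github.com/The-Compiler/qutebrowser", "coding python github webbrowser"),
]


def matches(query, url, title):
    """True if every word of the query occurs somewhere in the URL or title."""
    haystack = f"{url} {title}".lower()
    return all(word in haystack for word in query.lower().split())


def complete(query):
    """Return the URLs of all marks matching the query, in file order."""
    return [url for url, title in MARKS if matches(query, url, title)]
```

With this rule, complete("python github") and complete("github python") return the same list, so the words in the title can be read as tags, path components or plain title text.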

I would also suggest to allow an argument to bookmark-add in order to specify the title for a bookmark.

Oh, and "Yes" I think we can merge bookmarks and quickmarks.

JonathanReeve commented 7 years ago

@lucc's suggestion almost works, but writing tags in the title field means that you can't use the title field, which can be very useful. Consider the case of vague URLs, where the site title isn't clear from the URL. Something like:

http://fjfjfjf.com/sdfjasdf0998 Now Here's a Very Descriptive Title for the Website

Replacing the title with tags will get you something very unclear--you'll know, for instance, that you're opening a page with the tag foo, but you won't know what page, unless the URL happens to give it away.

One possible solution would be to just store all the bookmarks in tab-delimited lines in the form URL <tab> title <tab> tags, like this:

https://github.com/The-Compiler/qutebrowser     Qutebrowser     coding python github webbrowser

To integrate quickmarks, it'd be as easy as adding a fourth field, which would represent the quickmark name, so it'd be: URL <tab> title <tab> tags <tab> name:

https://github.com/The-Compiler/qutebrowser     Qutebrowser     coding python github webbrowser    qb

This style is fairly backwards-compatible, requires a minimal amount of changes to the codebase, and is very UNIX- and VIM-compatible, IMO.
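A rough Python sketch of reading and writing that four-field line, assuming real tab characters between fields, space-separated tags, and trailing fields being optional. Mark, parse_line and format_line are hypothetical names used only for this illustration:

```python
from typing import NamedTuple


class Mark(NamedTuple):
    url: str
    title: str = ""
    tags: tuple = ()
    name: str = ""  # optional quickmark name (the proposed fourth field)


def parse_line(line):
    """Split one tab-delimited line into URL, title, tags and quickmark name."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (4 - len(fields))  # pad missing trailing fields
    url, title, tags, name = fields[:4]
    return Mark(url, title, tuple(tags.split()), name)


def format_line(mark):
    """Inverse of parse_line; empty trailing fields are dropped."""
    fields = [mark.url, mark.title, " ".join(mark.tags), mark.name]
    return "\t".join(fields).rstrip("\t")
```

Old three-field lines keep parsing unchanged under this scheme, which is the backwards-compatibility point made above. The sketch does not handle a tab inside a title.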

The-Compiler commented 7 years ago

It's been quite a while, but let's reiterate over whether sqlite would be an option in #2340. Some new completion changes would at least require the history to be sqlite on-disk (as it takes a minute with 250k items to build an in-RAM sqlite database from it), and I'm still not sure on the marks. Please participate there if you want to be heard :wink:

pkillnine commented 7 years ago

Folders for bookmarks could be implemented using actual folders/directories on disk.

fiete201 commented 7 years ago

I am fine with sqlite

The-Compiler commented 7 years ago

We've already had this discussion here (about the general file structure) and #2340 (about sqlite), I'd rather not reiterate over the same things again :wink:

rcorre commented 6 years ago

I've had this on my radar for a while. Now that I've whittled down the completion issues, I might take a stab at it.

Here's what I want:

After reading the discussion again, here's my proposal:

Note that this proposal could support directories as well as tags by supporting multiple bookmark files under a folder. However, I think tags provide a superset of folder functionality, so I'd rather not do this.

Also, I feel like editing tab-separated text files can be a bit awkward, so here are a few alternatives:

A.

example.com/0 [tag1,tag2] This is a title
example.com/1 [] This is a title for an item with no tags

B.

example.com/0 +tag1 +tag2 This is a title
example.com/1 \+This title starts with a + so it is escaped

C.

example.com/0 tag1 tag2 "Title goes in quotes. Tags should all be single words"

Anyone care to tear this apart?

In the meanwhile, I'll work on #1596, because I had to write this whole thing twice :rage:

noctuid commented 6 years ago

To me, the whole point of a quickmark is to be able to open a link with a short, unique tag/key without manual confirmation (#711). Qutebrowser doesn't have this functionality, and since its quickmarks aren't really distinct from bookmarks it makes sense to get rid of quickmarks. However, the quickmark "tag" I described is not the same as a bookmark tag. Even if #711 was implemented for bookmark tags, this would not allow for the quickmark behavior of other vim-like browsers/plugins. For example, if I wanted to be able to open github with 'g, this wouldn't be possible if I had another tag git since g would not be a unique match. Pentadactyl, for example, has a separate concept of quickmarks because its quickmarks are composed of only a url and a single letter/number for opening it with no tags or extra information. I think it's fine to store everything as a bookmark, but I think it would be preferable to be able to store a unique keystring that can be used to open the bookmark in addition to tags.

Edit: To further clarify, these quickmark keystrings would not be completable from :open and would only be usable with :quickmark-load or a specific flag if only :bookmark-load is kept (e.g. :bookmark-load -q).

The-Compiler commented 6 years ago

Sounds good to me overall, but I currently don't have the time to think about it in detail. FWIW I'll be quite busy with exams and stuff until early February, so I won't be able to think things through or review pull requests for this until then.

The-Compiler commented 6 years ago

Oh, as for formats: Option C seems most painless to me, but the question is what more information we want to store there for the future. Since the motivation for the format mostly seems to be tracking things in git, programmatically doing edits, and maybe small manual edits once in a while: What about JSON lines, i.e. one line of JSON with arbitrary fields per bookmark?

{"url": "example.com/0", "tags": ["tag1", "tag2"], "title": "Hello World"}

It seems still human-readable and -editable enough, is extensible, and we won't run into funny situations with escaping and all that because we aren't inventing our own format.
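For illustration, a JSON-lines file like this can be read and written with nothing but Python's standard json module. This is a sketch, not a proposed implementation; load_marks and dump_marks are hypothetical names:

```python
import json


def load_marks(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def dump_marks(marks):
    """Write one JSON object per line; unknown fields are kept as-is."""
    return "".join(json.dumps(mark, ensure_ascii=False) + "\n" for mark in marks)


text = '{"url": "example.com/0", "tags": ["tag1", "tag2"], "title": "Hello World"}\n'
marks = load_marks(text)
```

Because every bookmark stays on its own line, deleting one is still a single-line edit, and fields added later simply round-trip through load_marks and dump_marks.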

Another option would be YAML, but IIRC I've discussed that with people in the past, and it's harder to e.g. easily delete some bookmarks via sed.

rcorre commented 6 years ago

@noctuid:

I think it would be preferable to be able to store a unique keystring that can be used to open the bookmark in addition to tags

From my proposal:

  • :bookmark-load foo opens the first bookmark with the tag foo
    • this allows for "old-style" quickmarks by using a unique tag

So if you tag your github mark with g, then bookmark-load g will open github. The "tags" I'm proposing provide a superset of the quickmark behavior you describe.

noctuid commented 6 years ago

So if you tag your github mark with g, then bookmark-load g will open github. The "tags" I'm proposing provide a superset of the quickmark behavior you describe.

This requires hitting enter; I was specifically referring to tags not being sufficient with #711 implemented.

The-Compiler commented 6 years ago

Why not, though? Something like the :bookmark-load -q you proposed (or a keymode for that or whatever) would still work, and @rcorre doesn't want to get rid of :bookmark-load.

noctuid commented 6 years ago

I have single letter keys for all my quickmarks; these are not meaningful words like tags generally are. If I have any tag that starts with the same letter as one of my quickmark tags, then that single letter tag would no longer be a unique match. I guess there could also be an option to navigate as soon as any tag is fully matched instead of requiring a unique match, but I still don't really like the idea of re-using tags for this purpose since they are not the same. Tags are something I would use for multiple bookmarks, whereas quickmark opening keys are only for a single url. I'd rather each be completely distinct and not show up with the other's command/flag because I'd never want to have both listed at the same time.
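The difference between the two lookups can be sketched in Python. This assumes that #711 means "open as soon as the typed text matches exactly one candidate", which is a reading of the discussion rather than a confirmed design; both function names are hypothetical:

```python
def unique_prefix_match(typed, tags):
    """Return the single tag starting with `typed`, or None if zero or several match."""
    hits = [tag for tag in tags if tag.startswith(typed)]
    return hits[0] if len(hits) == 1 else None


def load_by_key(typed, keys):
    """Exact lookup in a separate quickmark-key namespace (dict: key -> url)."""
    return keys.get(typed)
```

With the tags g and git, unique_prefix_match("g", ["g", "git"]) finds two candidates and cannot open anything without confirmation, while load_by_key("g", {"g": "https://github.com/"}) resolves immediately because keys are compared exactly and live apart from the tags.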

Ambrevar commented 6 years ago

I really like where this is going. @rcorre: Thank you for this great proposal.

@The-Compiler: I like the JSON-line idea: while it is a tad harder to hand-write and it makes basically no difference to the eye, it has one very neat feature: it's trivial to parse from any external program supporting JSON. Conversely, it's a bit harder to scan with conventional UNIX commandline tools.

So overall I have no strong opinion between A, B, C and JSON.

I also dislike YAML for the aforementioned reasons.

ninewise commented 6 years ago

I think you should either go with the JSON, to avoid parsing trouble, or just a trivial format avoiding almost all parsing:

prot://url.tld/    Some title without tabs    tag1    tag 2    tag3

Which is a url, a title and any number of tags separated by tabs. The only limit is page titles and tags can't contain a tab, which I think they shouldn't anyway.


Also, should we replace quickmark names with non-unique tags, could we use bookmark-add like this:

  1. :bookmark-add adds the current url to the bookmarks. Shows a warning if the url was bookmarked before.
  2. :bookmark-tag tag adds the current url to the bookmarks (if need be) and tags it with the given tag.
  3. :bookmark-tag tag1 tag2 is the same as above, but adds all given tags.
  4. :bookmark-tag -r tag1 removes given tag from the bookmark. The bookmark is not removed.
  5. :bookmark-tag -R tag1 removes given tag from the bookmark, removing the bookmark if it's the last.
  6. :bookmark-tag -u tag1 is the same as (2), but refuses non-unique tags.

Perhaps 4 and 5 ought to be a configuration instead of a flag.
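The semantics of the list above could be modelled roughly as follows in Python. This is a sketch on an in-memory dict (URL to set of tags), not qutebrowser code; bookmark_tag and its keyword arguments are hypothetical stand-ins for the proposed -r, -R and -u flags, and "refuses non-unique tags" is read here as "refuses a tag already used by another bookmark":

```python
def bookmark_tag(marks, url, tags, remove=False, remove_last=False, unique=False):
    """Apply one hypothetical :bookmark-tag call to `marks` (dict: url -> set of tags)."""
    tags = set(tags)
    if remove or remove_last:
        if url not in marks:
            return
        marks[url] -= tags
        # -R also drops the bookmark once its last tag is gone; -r keeps it.
        if remove_last and not marks[url]:
            del marks[url]
        return
    if unique:
        taken = set()
        for other_url, other_tags in marks.items():
            if other_url != url:
                taken |= other_tags
        if tags & taken:
            raise ValueError(f"tag already in use: {sorted(tags & taken)}")
    # Adding tags creates the bookmark if need be (points 2 and 3).
    marks.setdefault(url, set()).update(tags)
```

Turning points 4 and 5 into a configuration option, as suggested, would amount to fixing remove_last globally instead of passing it per call.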

Ambrevar commented 6 years ago

I would not go TAB-separated fields: what we want here is a human-readable/editable format, and it's just too easy to mix up tabs and spaces in a way that won't produce the desired result.

mschilli87 commented 6 years ago

from @rcorre's proposal:

  • :bookmark-load is used when you want to open things by tag
    • :bookmark-load foo opens the first bookmark with the tag foo
    • :bookmark-load foo bar opens the first bookmark with the tags foo AND bar
    • :bookmark-load -t foo bar opens all bookmarks with the tags foo AND bar in tabs

To me, this is counter-intuitive: Having :bookmark-load foo and :bookmark-load foo bar open the 1st hit in the current tab, I would expect :bookmark-load -t foo bar to open the first hit in a new tab (which might actually be something I would want).

How about :bookmark-load --all foo bar instead to open all hits? Ideally, open the 1st in the current tab and all others (if any) in new (background) tabs. Then :bookmark-load -t --all foo bar could open all hits in a new tab each.

lamarpavel commented 6 years ago

I would go with version B but restrict titles to not contain any + symbol, not even escaped. It seems clearer to read and edit by hand than tab-separation and if you simply avoid escape sequences parsing should be trivial. I've been dealing with a similar problem but unfortunately have to account for spaces everywhere, which brings me into escape-hell. So from my experience I would say that since you won't have to deal with spaces in the URL, use spaces as separators and mark tags with a leading symbol like +. Tags should be single-word only, the name or description is just text without the reserved symbol + and has to be a consecutive string (eg. no tags in between).

One of the great benefits is that you do not need any parser like you do with json, you end up with a file that you can grep trivially (eg. grep "+tag1" | grep -Eo "^[^ ]+" to get all URLs for tag1).

This is also extensible for special "quickmark tags" by introducing an additional symbol like : for which the same rules apply as for +.
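For comparison with the grep pipeline, a Python sketch of parsing this restricted variant of format B, assuming a URL without spaces, then zero or more +tags, then a title that contains no +. The names parse_plus_line and urls_with_tag are hypothetical:

```python
def parse_plus_line(line):
    """Split 'URL +tag1 +tag2 Title text' into (url, tags, title)."""
    url, _, rest = line.strip().partition(" ")
    words = rest.split()
    tags = []
    # Tags come first and are consecutive; the first word without '+' starts the title.
    while words and words[0].startswith("+"):
        tags.append(words.pop(0)[1:])
    return url, tags, " ".join(words)


def urls_with_tag(lines, tag):
    """Rough equivalent of the grep pipeline: URLs of all lines carrying `tag`."""
    return [url for url, tags, _ in map(parse_plus_line, lines) if tag in tags]
```

Unlike a plain grep for "+tag1", the tag comparison here is exact, so a line tagged only +tag10 does not match tag1.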

Ambrevar commented 6 years ago

I would go for +-prefixed tags as well: easier to read, easier to parse, easier to write.

rcorre commented 6 years ago

I like where this is going, thanks for all the feedback! I'm now leaning towards the + prefix as well. Should bookmark-add strip a leading + or all + from the title? On one hand, only the former is required to correctly parse the file (if tags must come before the title), and preserves as much of the 'natural' title as possible (in case someone knows there is a + in a title and tries to use url-completion with a +). On the other hand, stripping all + would make the file easier to parse by external tools like grep.
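The two options can be compared with a short Python sketch. Both helpers are hypothetical and only illustrate the trade-off described above:

```python
def strip_leading_plus(title):
    """Option 1: remove only '+' characters at the start, keeping the rest intact."""
    return title.lstrip("+").strip()


def strip_all_plus(title):
    """Option 2: remove every '+', then collapse the leftover whitespace."""
    return " ".join(title.replace("+", " ").split())
```

For a title like "+++ NEWS +++", the first option leaves a trailing "+++" in the file that a grep for "+" would still hit, while the second yields just "NEWS" at the cost of also changing titles such as "C++ tips".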

lamarpavel commented 6 years ago

I would strip all + from the title, thus making it more prominent to the user that the symbol isn't available for search queries. In case the title is something like +++ NEWS +++, the right decision for the user is to set a title manually anyway.

olmokramer commented 6 years ago

I really dislike the "reserved characters" idea proposed by @lamarpavel, even if it's just a +. What's the gain? Assuming format B, the parser still has to look for the first word that doesn't start with a +, only behave differently when it starts with a \, and the bookmark-add only has to check if the first character is a +.

lamarpavel commented 6 years ago

My proposition was not due to performance but clarity and ease of editing. Counter question: Why do you dislike reserved characters so much?

ninewise commented 6 years ago

The gain of reserving + is grep +tag bookmarksfile will get all bookmarks with given tag. No special cases for escaped +'s, + in a title, ...