cdevroe / unmark

An open source to do app for bookmarks.
https://unmark.it

When adding a mark through the bookmarklet, a question mark in the title or the url throws the pop-up off #291

Closed hkockerbeck closed 1 month ago

hkockerbeck commented 3 months ago

I'm using a self-hosted copy of Unmark 2020.4. When I add a mark through the bookmarklet, I typically see the popup with the form to add tags, notes etc. But when the title or the url of the mark-to-be contains a question mark, the popup shows a 403 Forbidden error from the Apache server. It looks like the transition from

/mark/add?url=...&title=...&v=1&nowindow=yes&noui=1

to

/mark/info/xxxxxx?bookmarklet=true

isn't working in those cases.

The question mark seems to get successfully urlencoded to %3F, but it still seems to make a difference. When I remove the question mark from the title HTML tag through the browser's dev tools, I can add the mark successfully.
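The encoding step described above can be checked in isolation. This is a minimal sketch, not Unmark's code: the sample title is made up for illustration, and the only call used is the standard encodeURIComponent that the bookmarklet itself relies on. It shows that a trailing ? does arrive as %3F in the request and decodes back unchanged.

```javascript
// Illustrative title only; any string containing "?" behaves the same way.
const title = "Why does this fail?";

// Same call the bookmarklet applies to document.title.
const encoded = encodeURIComponent(title);

console.log(encoded); // Why%20does%20this%20fail%3F

// The encoding is lossless: decoding restores the original title.
console.log(decodeURIComponent(encoded) === title); // true
```

So the bookmarklet's encoding is behaving as intended; the %3F sequence itself is what the server reacts to.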

hkockerbeck commented 1 month ago

Some additional information I've come across since: The same problem seems to appear with the url. So if I try to add a mark for the url my.site.tld/something?var=3, the question mark seems to throw Unmark off.

For question marks in the title, I've come up with a small hot fix that replaces every ? with the corresponding HTML entity, &#63;. In the bookmarklet's code, I've extended document.title to document.title.replaceAll('?', '&#63;'). So the full modified bookmarklet would be

javascript:(function(){l="https://unmark.henning-kockerbeck.de/mark/add?url="+encodeURIComponent(window.location.href)+"&title="+encodeURIComponent(document.title.replaceAll('?', '&#63;'))+"&v=1&nowindow=yes&";var e=window.open(l+"noui=1","Unmark","location=0,links=0,scrollbars=0,toolbar=0,width=594,height=635")})()

Please note, though, that this only helps with question marks in the title, not in the url (as described above). One could theoretically modify the url by the same principle, but I would advise against that, as it breaks the url that gets saved to Unmark.
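The hot fix and its limitation can be sketched as a readable, un-minified version of the URL construction. Everything named here is an assumption for illustration: the host unmark.example is a placeholder, the helper names are hypothetical, and the numeric entity &#63; is assumed as the stand-in for the question mark. The window.open call is left out so the snippet runs outside a browser.

```javascript
// ASSUMPTION: "&#63;" is the HTML entity used to replace "?" in the title.
function escapeQuestionMarks(title) {
  return title.replaceAll("?", "&#63;");
}

// Hypothetical helper mirroring the modified bookmarklet's query string.
// "unmark.example" is a placeholder host, not a real installation.
function buildPatchedAddUrl(pageUrl, pageTitle) {
  const base = "https://unmark.example/mark/add";
  return base +
    "?url=" + encodeURIComponent(pageUrl) +
    "&title=" + encodeURIComponent(escapeQuestionMarks(pageTitle)) +
    "&v=1&nowindow=yes&noui=1";
}

const fixedTitle = buildPatchedAddUrl("https://example.com/page", "Is it fixed?");
const urlStillAffected = buildPatchedAddUrl("https://example.com/watch?v=abc", "Video");

// The escaped title no longer produces %3F in the request...
console.log(fixedTitle.includes("%3F")); // false
// ...but a "?" in the page URL is left alone and still does.
console.log(urlStillAffected.includes("%3F")); // true
```

Leaving the page URL untouched is deliberate: rewriting it the same way would store a different address than the one being bookmarked.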

cdevroe commented 1 month ago

@hkockerbeck I'm sorry that you're running into this issue. I haven't actively worked on Unmark in a very long time. Obviously, many URLs and HTML title elements contain question marks - and I've never personally run into this issue. Can you provide a URL that the bookmarklet consistently fails on? I might be able to spend a few minutes figuring out why this is happening for you.

hkockerbeck commented 1 month ago

@cdevroe Thank you for your response. It happened (until I applied the hot fix described above) with, to my knowledge, every site that has a question mark in the title or the url. Some examples I just collected:

The problem with question marks in the url is especially prevalent on YouTube, because of the /watch?v=... scheme used there.

cdevroe commented 1 month ago

Thank you for the examples. It could be a character encoding issue on your local install. I've saved both of these to the hosted version of Unmark using the default bookmarklet. One thing I would check is that your database is properly configured for UTF-8.


hkockerbeck commented 1 month ago

Thank you for your answer. I'd consider it relatively unlikely, though, that the problem is related to either Unicode or the database. The question mark was already a valid ASCII character in the 1960s, so it's hard for me to imagine that it would crop up and cause problems now. Also, why just the question mark and not other punctuation, like the comma? Additionally, characters that are clearly "newer on the block" than the question mark, like German umlauts or Japanese kanji, work just fine. If an encoding problem were the cause, I'd expect those to fail rather than the good old question mark ;)

Also, I'm not an expert in the request lifecycle in CodeIgniter, but I would expect the database to come into play significantly later than Apache HTTPd. Because the 403 Forbidden comes from Apache (and is not, for example, a CodeIgniter error page), I don't think the database is even involved at that stage.

Fortunately, I just found the cause: It's a security fix in Apache HTTPd from some time ago. According to CVE-2024-38474, rewrites containing the encoded question mark, %3F, can be used by an attacker to reach places and scripts they're not supposed to reach. Therefore,

Some RewriteRules that capture and substitute unsafely will now fail unless rewrite flag "UnsafeAllow3F" is specified.

See also the relevant part of Apache HTTPd's docs.

So I needed to modify the .htaccess to

...
#Checks to see if the user is attempting to access a valid file,
#such as an image or css document, if this isn't true it sends the
#request to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L,UnsafeAllow3F]
...

The crucial point is the additional UnsafeAllow3F flag in the RewriteRule. As described above, this may open up a security problem, so you need to consider the details of your setup before deciding whether to make this change.

cdevroe commented 1 month ago

@hkockerbeck Thank you for providing the potential workaround. I'm closing this issue, but if anyone comes along in the future and has a better solution, please let us know.