Closed OdinVex closed 6 months ago
Without the forward-slash users are told to login again (Gitea destroys the session maybe?).
If I understand correctly, that's by design. In Gitea, the URLs are always considered to equal no matter they have trailing slash or not.
So maybe you could make your reverse proxy to serve http://.../subpath
as http://.../subpath/
? Redirect to the "slashed" path?
Without the forward-slash users are told to login again (Gitea destroys the session maybe?).
I think the "login" (cookie) problem is related to the cookie path. The cookie is only set on "/subpath/" IIRC ......
I think there are some design problems for the "subpath" and "cookie", maybe it needs some improvements ....
After looking into the problem:
{{.Link}}/other-path
. It would cause double-slashes if there is a trailing slash:
//other-path
: it is even worse, because browsers would use it as https://other-path
./subpath//other-path
: not a good URL path ....To completely fix the problem, it needs more designs & improvements ..... not that easy by a simple fix.
Update: I think case 2 is the root problem.
Oops .... I found the root problem now. That's my fault (regression of #24107)
-> Fix incorrect cookie path for AppSubURL #29534
To reclarify, this issue is about pagination, in the template for links generated (First, Previous, #, Next, Last...) on the user dashboard.
Edit: Pagination is not using {{.Link}}/
, it is using {{$.Link}}
without a forward-slash, as I mentioned. I think those need updating.
More details in https://github.com/go-gitea/gitea/pull/29534#issuecomment-1974851486
If you focus on "fixing the trailing slash", I could remove the "fix" operation from that PR, and keep this issue open.
I'm unfamiliar with Go, I'm more of an Assembly/C/C++ kind of guy.
Edit: I took a peek around, I think maybe something in templates/user/dashboard/feeds.tmpl
could address it? {{template "base/paginate" .}}
That period, does that help to generate a link?
Edit: I took a peek around, I think maybe something in
templates/user/dashboard/feeds.tmpl
could address it?{{template "base/paginate" .}}
That period, does that help to generate a link?
Nope. The dot syntax
is a Golang template syntax, it means "pass all current data into the sub template"
resolved by #29534
resolved by #29534
It is not resolved by #29534, that was a different issue (original fixer misunderstood this bug because another was found at the same time.)
Thank you @lunny. :)
original fixer misunderstood this bug because another was found at the same time.
Well, TBH I do not think I misunderstand this issue (actually I won't call the "slash" problem is a "bug").
Maybe you misunderstand the problem. I will quote my explanations and summarize them again:
https://host/SUBPATH
and https://host/SUBPATH/
, the document has said so:
# Note: no trailing slash after either /git or port
href="/SUBPATH?page=2...
should work, it doesn't need an extra slash.https://host/SUBPATH
, https://host/SUBPATH/user
, https://host/SUBPATH/user/repo
https://host/SUBPATH/
, https://host/SUBPATH/user/
, https://host/SUBPATH/user/repo/
I don't see why you wouldn't follow the document and why you insist to add an extra trailing slash for some links, I don't see such a real use-case for end users either.
If you think the trailing slash is a must, please show some real bad cases for no-trailing-slash. Without a real bad case, I couldn't figure out what you mean.
The fact Gitea uses a forward-slash to begin with is enough to follow convention. Calling #29534 a fix is a hackjob following non-standard practice to assume a file is a directory to circumvent fixing the issue.
Edit: The root of the problem has nothing to do with the Cookie issue. That is a side-effect. The issue of parsing comes into play, mind. Some people build parsers and to have to create an exception for mis-generated links in pagination on one specific page just lends to the idea it should be polished and fixed, not to mention web server parsing specs.
Calling #29534 a fix is a hackjob following non-standard practice to assume a file is a directory to circumvent fixing the issue.
As I explained before, Gitea requires to make "/SUBPATH" the same as "/SUBPATH/". Since it is a "proxy path", it is not related to file or directory. There is no file nor directory in this case.
Calling #29534 a fix is a hackjob following non-standard practice to assume a file is a directory to circumvent fixing the issue.
As I explained before, Gitea requires to make "/SUBPATH" the same as "/SUBPATH/". Since it is a "proxy path", it is not related to file or directory. There is no file nor directory in this case.
Maybe not on your instance or Gitea's official instances. I happen to have a binary file named the same. Access to the instance can only be achieved with that slash, as intended by specs, not just my own. Edit: And yes, /SUBPATH/
is a directory, even if virtual. You might be thinking a piece of path rather than directory as in "folder", the term is interchangeable, but it is indeed intended for the server to provide a default ("app") in this case or conventionally index/default.html or return an error, etc.
Edit: To summarize, the issue is a lack of slashes. Your PR does not add slashes, it fixes a different issue (awesome). So when the slashes are added to the user dashboard (the only place this problem occurs) then we'll call it fixed, cause that's this Issue.
https://docs.gitea.com/administration/reverse-proxies#apache-httpd-with-a-sub-path
# Note: no trailing slash after either /git or port
The document was written there to help users configure their instances correctly. If some users don't follow, then that's another problem.
As I explained above, Gitea never adds extra slash to the full path. So if you think it is a problem, then not only "pagination" is affected, a lot of hard-coded links are also affected, then it is not a simple problem, it is a new proposal to totally refactor the Gitea's link system.
Would you like to edit the title to clarify what you mean? For example: Always add an extra slash when the AppSubURL is used without other sub paths
("pagination" is just a tip of the iceberg)
This has nothing to do with how users configure their servers. That is not the issue. The issue is not the variable, it's the links generated specifically ONLY on the user dashboard. Yes, Gitea DOES add forward-slashes, you can grep the code for it all over. Pagination ONLY on the USER DASHBOARD is affected. ONLY ON THE USER DASHBOARD. On other pages pagination works correctly (AND HAS FORWARD-SLASHES) such as the repo page.
Repo Page:
<a class="item" href="/SUBPATH/issues">Issues</a>
<a class="item" href="/SUBPATH/pulls">Pull Requests</a>
<a class="item" href="/SUBPATH/milestones">Milestones</a>
<a class="item" href="/SUBPATH/explore/repos">Explore</a>
USER BLEEPING DASHBOARD:
<div class="center page buttons">
<div class="ui borderless pagination menu">
<a class="disabled item navigation">
<span class="navigation_label">First</span>
</a>
<a class="disabled item navigation">
<span class="navigation_label">Previous</span>
</a>
<a class="active item gt-content-center">1</a>
<a class="item gt-content-center" href="/SUBPATH?page=2&date=">2</a>
<a class=" item navigation" href="/SUBPATH?page=2&date=">
<span class="navigation_label">Next</span>
</a>
<a class=" item navigation" href="/SUBPATH?page=2&date=">
<span class="navigation_label">Last</span>
</a>
</div>
</div>
First and Previous will also be affected, they using {{$.Link}} instead of {{$.Link}}/ as seen on other pages such as ./user/settings/security/webauthn.tmpl
(data-url="{{$.Link}}/webauthn/delete" data-id="{{.ID}}">
). :O A FORWARD-SLASH. If you're re-using the pagination template then it's getting incorrect link information (variable assignment/determination/whatever) from the user dashboard I'd guess.
Yes, Gitea DOES add forward-slashes, you can grep the code for it all over.
I guess I have explained many times ... Gitea never adds extra slash to a full path
:
* so you only see: `https://host/SUBPATH` , `https://host/SUBPATH/user`, `https://host/SUBPATH/user/repo`
* you won't see: `https://host/SUBPATH/` , `https://host/SUBPATH/user/`, `https://host/SUBPATH/user/repo/`
(Update: OK, https://host/SUBPATH/
is quite special in many cases, sometime it has the trailing slash, sometimes it doesn't)
There could be more cases that the SUBPATH misses the correct trailing slash .... for example, I just fixed some: #29535
Anyway, if you believe only the "dashboard" page needs to be fixed, it's fine to me. It could be fixed by this:
-> Add an trailing slash to dashboard links #29555
You could apply this patch to your instance to see whether it satisfy your requirement, I will add some tests later.
I guess I have explained many times ...
Gitea never adds extra slash to a full path
:
Ah, so the issue of understanding was a language one. It does add forward-slashes for any subdirectory but not for root directory (aka subpath), is what you meant. It should, otherwise clients are sending a parameter request to a file (or a cgi script even such as PHP). Separation of routing by paths (not virtual file pathing) is not only best-practices it can be a requirement in some environments due to software collisions. There is no reason to not use a forward-slash except cosmetic when accounting for all that.
There could be more cases that the SUBPATH misses the correct trailing slash .... for example, I just fixed some: #29535
I took a quick look in most places, only the user dashboard had the issue without going too deep into things, I also had to disable Notifications because of an Apache bug with reverse-proxies that prevents multiple threads (ends up blocking new threads because of concurrently held threads) so we wouldn't have noticed the Notifications. We also don't use comments on the instance but rather in Matrix, so we wouldn't have noticed it there either, good finding.
You could apply this patch to your instance to see whether it satisfy your requirement, I will add some tests later.
I will find the time to test it, thank you.
(Update: OK, https://host/SUBPATH/ is quite special in many cases, sometime it has the trailing slash, sometimes it doesn't)
Yes, the inconsistency could accidentally be parsed and routed as a file on some servers (if they are not canonizing /SUBPATH as /SUBPATH/). That was why I was trying to point out the whole "it could be a file" thing, because it genuinely could. If it always has forward-slashes then we know for certain we want the instance, instead of any file or accidental misrouting because the forward-slash would be.
Oh one more thing, since we have a long discussion for this simple problem, I think maybe we could try to improve the communication in the future.
What I understand:
What you proposed if I understand correctly now:
Fortunately, eventually, we could understand each other better. 🙏 In the future if there would be some real cases for a proposal/bug report, I think I could catch you up more quickly. Thank you again for the constructive discussion.
I also had to disable Notifications because of an Apache bug with reverse-proxies that prevents multiple threads (ends up blocking new threads because of concurrently held threads)
Do you mean this? #19265 I also participated in it .... 🤣
...
Some installations must follow standards which do not call for canonize paths the way you're assuming users should. This assumption should not be made. The code and documentation should not have ever tried to canonize the name, that should be a user preference to control that sort of redirection if they wish. I'd support the suggestion, but a requirement would remove preference and control (and even be impossible in some situations).
There are some mentioned RFC standards (could you share the link? I'd like to take a deep look).
I found that information while building a web server back in early 2004 to specifically build a cgi backend specific to an embedded system (an old Surfboard modem), it's been a long time. I'll try to remember the number. It centered around defining URL/URI and what is considered a file versus a directory and the ambiguity leaving it up to servers to define (but since it's an option it should be left to the user if it isn't required by the server and Gitea doesn't functionally need that because links can be written with a slash addressing it). There was something about case-sensitivity being ignored for host and scheme (port, etc) but specifically not for the rest of a URL/URI and an empty URL/URI should be processed as "/" (but that is only for "" requests, eg empty, and /something is not empty). The RFC was from the 1990s but it was still relevant in HTTP1.1 at the time. (Edit: I have not seen anything that updates or changes anything about assuming a path should end in a "/" and not allowing a file to be served for /subpath versus a proxy at /subpath/ because they can be differentiated by the server routing/processing, still is in Apache. I don't use Nginx, if this behavior is no longer the same among web servers then it's either a free-for-all decision or an RFC update, in which case I'd love to know. Still, canonicalizing the names with forward-slash to prevent issues and polishing things would be the best approach I genuinely believe.) Edit: I also remember the RFC mentioning request erroneous implementations of URI parsing resulting in badly formed queries and it gave a rundown of this stuff specifically mentioning interoperability issues because of URI parsing (such as removing /../ and such from URIs but only if they're in the path, etc). I think it also mentioned something about not making assumptions because of client implementations either intending to access a specific URI.
Edit: Although I can't find the original RFC, here's one that specifically mentions the kind of assumption demonstrated by Gitea:
[6.2.4](https://www.rfc-editor.org/rfc/rfc3986#section-6.2.4). Protocol-Based Normalization
Substantial effort to reduce the incidence of false negatives is
often cost-effective for web spiders. Therefore, they implement even
more aggressive techniques in URI comparison. For example, if they
observe that a URI such as
http://example.com/data
redirects to a URI differing only in the trailing slash
http://example.com/data/
they will likely regard the two as equivalent in the future. This
kind of technique is only appropriate when equivalence is clearly
indicated by both the result of accessing the resources and the
common conventions of their scheme's dereference algorithm (in this
case, use of redirection by HTTP origin servers to avoid problems
with relative references).
The language here is implying there is a time it can be appropriate and a time it can be inappropriate. Canonizing to assume a forward-slash instead of allowing for a file by the same name as the subpath proxied for an Instance would be an example of an inappropriate use of technique. For anyone not serving a file Gitea's current behavior would be perfectly reasonable but for others it wouldn't be. Adding a forward-slash of course fixes it for all users in the end, that was why I was trying to communicate it. That RFC isn't the one I remember reading but it does talk about that kind of thing. The section following that, [7.3](https://www.rfc-editor.org/rfc/rfc3986#section-7.3). Back-End Transcoding
also talks about taking special note of filesystem characters (forward-slash among them) but it merely lends to a summary of trying to not make assumptions about filesystems and such. '/subpath' also belongs to '/' whereas '/subpath/' belongs to the sub-router '/subpath', not '/', I can't remember where I read about that though. It was about trying to differentiate routing and how they should be implemented for user understanding.
Edit: I recall the RFC talking about fragmenting the URI and essentially trying to establish "this is a child fragment of this parent" and how to determine that, that document seems to also mention it (but that's a 2014 one, the one I read was from 1990s). "The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource." A lot of semantics. Canonicalization is fine for those that want it, but if someone wants to serve a file by the same name they should be allowed if it doesn't break Gitea. It can go both ways but supporting the forward-slash removes the ambiguity and fixes the issue for everyone involved. Those not using the forward-slash see nothing different and those using it see it working correctly for them. A bit tired, stayed up too late, bed for me. :)
Do you mean this? https://github.com/go-gitea/gitea/issues/19265 I also participated in it .... 🤣
Yep! :D
Edit: Woke up recalling this might help explain a perfectly plausible situation for needing the differentiation: /gitea.php is a script that handles authentication (users do not see .php, link canonicalization and disabling of .php direct access) and redirects to /gitea/ after logged in (but the redirect specifically is clean so that people can't try to pass parameters to the instance as soon as they login, avoids exploits around links). Other services are maintained and they all need a unified structure for doing this kind of thing across different systems, so while it is specific it also describes a scenario in which ambiguity assumptions can break something. Temporarily we added Redirector (web browser extension) redirects so that anything /subpath? gets redirected to /subpath/? but it's browser-specific until the patch is tested and rolled out. So many interconnected different softwares and backends, x_X.
Awesome, very well done. :)
Description
After logging in (at the user dashboard), users are presented with their commit history and such...but the navigation at the bottom does not use correct links (subpath issue). Everywhere else they appear to be correct, but here is the actual HTML (cleaned up for sake of posting):
There is a missing forward-slash after
SUBPATH
. All other links generated by Gitea appear to be correct (/SUBPATH/
).I believe the issue is in paginate.tmpl, the way links are generated using
{{$.Link}}
instead of{{$.Link}}/
or perhaps{{AppSubUrl}}/
? I don't do Go. All of that paginating uses broken linkage.I do use Forgejo, but the issue is present in Gitea as well.
Gitea Version
1.21.3
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Screenshots
No response
Git Version
No response
Operating System
No response
How are you running Gitea?
Docker behind a reverse-proxy with a subpath.
Database
MySQL/MariaDB