caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0

suggested hide directive #370

Closed areve closed 5 years ago

areve commented 8 years ago

This is my proposal for a directive called hide. For my needs I want to hide folders called .hg and .git, plus other files and folders starting with a . or ending with a ~. I also want to hide specific files, e.g. SciTEDirectory.properties, and the same feature would be able to hide the specific Caddyfile (a TODO in the code). The existing hide []string parameter would be replaced with a hide []HideConfig.

Clearly this would slow performance slightly, but unless vast numbers of settings were added I would not expect it to be a significant issue. I've never looked at regexp performance in Go. I'd expect a single regexp to be faster when many comparisons need to be done at once, but that's a more complicated optimization that could be implemented later.
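For illustration, the single-regexp idea could look roughly like the sketch below. It is not taken from any Caddy code: compileHidePattern and its signature are hypothetical, and whether this is actually faster than a loop of string comparisons would need benchmarking.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compileHidePattern is a hypothetical helper. It folds every prefix and
// suffix rule into one regexp that matches when any single path segment
// starts with one of the prefixes or ends with one of the suffixes.
func compileHidePattern(prefixes, suffixes []string) (*regexp.Regexp, error) {
	var alts []string
	for _, p := range prefixes {
		// QuoteMeta escapes characters such as "." so they match literally.
		alts = append(alts, regexp.QuoteMeta(p)+`[^/]*`)
	}
	for _, s := range suffixes {
		alts = append(alts, `[^/]*`+regexp.QuoteMeta(s))
	}
	if len(alts) == 0 {
		return nil, fmt.Errorf("no hide rules given")
	}
	// Anchor each alternative to a whole segment: it must follow the start
	// of the path or a "/", and be followed by a "/" or the end of the path.
	return regexp.Compile(`(?:^|/)(?:` + strings.Join(alts, "|") + `)(?:/|$)`)
}

func main() {
	re, err := compileHidePattern([]string{".", "_"}, []string{".bak", "~"})
	if err != nil {
		panic(err)
	}
	for _, p := range []string{"/.git/config", "/notes.txt~", "/index.html"} {
		fmt.Println(p, re.MatchString(p))
	}
}
```

The pattern is compiled once at configuration time, so each request costs a single MatchString call regardless of how many rules were configured.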

Here's what I'd propose the doc page look like.

hide

hide enables files or directories to be hidden. They will not be browsable if browse is enabled and will return 404 if requested.

Syntax

hide [_matchcase_] {
    prefix _prefixes..._
    suffix _suffixes..._
    name _names..._
    path _paths..._
}

This directive can be used multiple times.

Examples

Suppose you want the server to not serve .bak files, or files and folders starting with . or _. You would add:

hide {
    prefix . _
    suffix .bak
}

If you want the server to hide files and folders called Hidden or Secret, and only if the case matches exactly, you would add:

hide matchcase {
    name Hidden Secret
}

If you want to hide the specific paths /docs/private and /passwords.txt you would add the following. In this example a file or folder called /docs/private would be hidden, but a file or folder called /docs/privatefiles would not.

hide {
    path /docs/private /passwords.txt
}
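
As a rough illustration of the matching rules proposed above, here is a minimal Go sketch. HideConfig is the type name mentioned in the proposal, but its fields and the IsHidden method are assumptions made for this example, not the code that was later checked in. It also assumes that a match on any path segment hides everything beneath it.

```go
package main

import (
	"fmt"
	"strings"
)

// HideConfig is a hypothetical shape for one hide block. The field names
// are assumptions for this sketch.
type HideConfig struct {
	MatchCase bool     // compare case-sensitively when true
	Prefixes  []string // hide segments starting with any of these
	Suffixes  []string // hide segments ending with any of these
	Names     []string // hide segments equal to any of these
	Paths     []string // hide these exact paths and everything below them
}

// IsHidden reports whether reqPath, or any folder it sits in, is hidden.
func (h HideConfig) IsHidden(reqPath string) bool {
	fold := func(s string) string {
		if h.MatchCase {
			return s
		}
		return strings.ToLower(s)
	}
	p := fold(reqPath)

	// path rules: /docs/private hides /docs/private and /docs/private/x,
	// but not /docs/privatefiles.
	for _, hp := range h.Paths {
		hp = strings.TrimSuffix(fold(hp), "/")
		if p == hp || strings.HasPrefix(p, hp+"/") {
			return true
		}
	}

	// prefix, suffix and name rules apply to every segment of the path.
	for _, seg := range strings.Split(p, "/") {
		if seg == "" {
			continue
		}
		for _, pre := range h.Prefixes {
			if strings.HasPrefix(seg, fold(pre)) {
				return true
			}
		}
		for _, suf := range h.Suffixes {
			if strings.HasSuffix(seg, fold(suf)) {
				return true
			}
		}
		for _, name := range h.Names {
			if seg == fold(name) {
				return true
			}
		}
	}
	return false
}

func main() {
	h := HideConfig{
		Prefixes: []string{".", "_"},
		Suffixes: []string{".bak"},
		Paths:    []string{"/docs/private", "/passwords.txt"},
	}
	for _, p := range []string{"/.git/config", "/docs/private", "/docs/privatefiles"} {
		fmt.Println(p, h.IsHidden(p))
	}
}
```

A handler using something like this would answer 404 for any request where IsHidden returns true, and the browse listing would skip the same entries.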

mholt commented 8 years ago

This is a detailed proposal -- thank you for the careful thought and even a suggested docs page. :+1:

The initial question I have is, why would you put files inside your site folder that you don't want on your site?

areve commented 8 years ago

I currently have most of my NAS hosted on the internet using my own Node.js server (AcServer) with an extra plugin script to do the Markdown compiling. I don't maintain a NAS and a web site separately; I just name folders with a dot or no dot depending on whether I want them published on the site.

Caddy has lots of features that I'll never add to my own server, but want, plus a larger following and a nice website. A modified version of the Hugo editor would suit my needs too.

mholt commented 8 years ago

I guess that makes sense.

I have some plans to refactor some things in the core file server after 0.8 is released, but after that, I'd definitely be interested in a hide directive. I think this could add a lot of value.

jungle-boogie commented 8 years ago

If it's not obvious, there would be a need to make the hidden directory undownloadable if someone were to do a recursive download with wget or similar.

areve commented 8 years ago

I'm not quite sure what you mean. In my mind my suggestion would mean that if you added .git to the hide names list then all files or folders called .git would not be downloadable or visible via browse, nor would any of the files or sub-folders within the .git folder.

There could of course be different options to the directive, e.g. to only hide the files from browse and not prevent download. I'd suggest this be done as a subsequent feature if someone wants it. I think the name 'hide' is the correct one, but there could be a case for a different name, e.g. 'block' or 'disallow', instead.

jungle-boogie commented 8 years ago

Hi @areve,

but there could be a case for a different name, e.g. 'block' or 'disallow', instead.

I'd prefer those names!

When I say undownloadable, I mean a recursive download of the site: http://www.csoonline.com/article/2137013/network-security/snowden-accused-of-using-hacking-s-greatest-weapon-to-access-nsa-files--wget.html

http://www.wired.com/2013/03/att-hacker-gets-3-years/

Because your hidden/blocked files are on the webserver, they may be downloadable...that's what I mean.

mholt commented 8 years ago

The hide directive would be very simple, all it does:

That's it. That's all it has to do.

jungle-boogie commented 8 years ago

:+1:

areve commented 8 years ago

I've got some code working that implements this feature. I just need to add a couple more tests for completeness, possibly change a variable to a pointer, and add some comments. I plan to check in some code in the next few days for you to review.

My1 commented 8 years ago

I like the idea.

jungle-boogie commented 8 years ago

Hi @areve,

How's your testing going?

areve commented 8 years ago

I've not finished and probably won't get much time before January. I've checked in my code so far at https://github.com/areve/caddy. I think the only thing that still needs adding is case-sensitive checking; all the settings parsing is done already.

Please suggest improvements, or finish it etc.

mholt commented 7 years ago

Follow #1012 for some work possibly related to this issue.

kaihendry commented 7 years ago

ls, for example, hides dotfiles by default in a file listing. It would be nice if browse did the same unless explicitly asked to show all files. I think it's a better approach than hiding. Sane defaults.

tobya commented 7 years ago

I am looking at this.

mholt commented 7 years ago

These are hard questions Toby. :smile:

I don't know the answers to them (yet). But one thought is that perhaps the status middleware (and its package) should register the hide directive as well, because it's basically a "shortcut" directive for something you can do with the status directive.

For Caddy 1.0 or 2.0 (milestone details coming later this week) I think I will be working on a universal, powerful request matching feature for the Caddyfile to help answer some of these questions, at least as far as their implementations go.

hartwork commented 5 years ago

It would be nice if not showing files in listings and blocking direct access to them were separate.

For more context: I have semi-public files that I don't want to appear in listings but that I still want to share direct links to. With Apache and friends that seemed possible, and it aches to have to use a different web server as a second proxied backend where that is needed.

Many thanks!

My1 commented 5 years ago

Couldn't you split the public and the semi-public into different folders, so that the public folder has browse active and the semi-public one does not? Or add an index file to the semi-public folder.

hartwork commented 5 years ago

Not really. Either it puts limits on the URLs I can use or it makes the whole setup a lot more complex. With lighttpd something like https://linuximages.de/.semipublic/ was a no-brainer to support and be off listings. (It also displayed README.txt content inline but that's another topic.)

My1 commented 5 years ago

well you could make /public for public and semipublic for semipublic, and either run a website or a redir on the root. for example.

and you could for example show the readme as index with a link (or iframe) to public

hartwork commented 5 years ago

Yes, but those are workarounds. With Lighttpd I needed none of that.

My1 commented 5 years ago

but are these workarounds SO bad? if yes how about you just use lighttpd? it's not like it's dead.

hartwork commented 5 years ago

I think we're quite off-topic by now. Let's stop here or take it to e-mail. Mine is in the profile.

ghost commented 5 years ago

I am looking at this.

* Do we want a full blown `hide` directive, or simply just hide .files and be done with it?

* Do we want to allow .files to be shown or always hide them?

* Do we want .files to be hidden for all except browse where they can be shown if requested?

* Show it prevent a hidden file from being passed to fastcgi?

I think functionality mirroring or similar to fancyindex_ignore from nginx would be ideal. The original proposal more or less matches this.

mholt commented 5 years ago

Does the status directive not do what is desired here? https://caddyserver.com/docs/status

ghost commented 5 years ago

Not as far as I can tell. status sets an entire folder (or just specific files) and all of its contents to the same status and still leaves the folder/files visible in the directory listing. fancyindex_ignore hides the files/folders from being listed in the browser, but if you know the direct path, they are still 100% accessible.

mholt commented 5 years ago

I see. Then,

  • Do we want a full blown hide directive, or simply just hide .files and be done with it?

Seems like we need a "full-blown" directive then. Assuming a particular naming scheme isn't necessarily awesome in comparison.

  • Do we want to allow .files to be shown or always hide them?

I'd say don't hide anything extra by default. That's confusing, and it's not obvious how to un-hide them, or whether that's even possible.

  • Do we want .files to be hidden for all except browse where they can be shown if requested?

If the hiding is conditional, it should probably be behind authentication instead. "Hiding" is not the right tool for that.

  • Show it prevent a hidden file from being passed to fastcgi?

(Do you mean "should it"?) No, hidden files should be enforced only by directives that necessarily access files.

ghost commented 5 years ago

Sorry, the top of my comment was simply a quote from another person.

If my request is significantly different from the original question + comments here, I can open a separate issue to avoid confusion.

The functionality I'm looking for/missing is mostly a cosmetic one. I know the same thing can probably be accomplished with text/template, but I haven't looked into that just yet as I'm completely new to Go.

mholt commented 5 years ago

Caddy already supports hiding files, internally: https://sourcegraph.com/github.com/mholt/caddy@8369a1211544224b2967dd3ac43372a2ef432291/-/blob/caddyhttp/httpserver/siteconfig.go#L66

The hide directive would just need to add to that list.

ghost commented 5 years ago

Does this support not displaying certain directories with browse? If so, how is that accomplished in the Caddyfile? (I'm sorry if this is already documented somewhere and I appear to be blind.)

mholt commented 5 years ago

The list is "global" meaning any directive that integrates with it will work, no matter what populates the list. Right now I'm not sure that it's explicitly configurable in the Caddyfile. You could search the code base. I know we use it to hide the Caddyfile though.

ghost commented 5 years ago

If this should be a separate issue, just let me know. I'm not sure if we're on the "same page" when talking here, or not.

Let's say this is my Caddyfile:

open.example.com {
  root /var/www/opendirectory
  browse
}

Inside of /var/www/opendirectory let's say we have Folder1, Folder2, and Folder3. As of this moment, browse will list all 3 folders if you went to open.example.com in a browser.

What fancyindex_ignore does with nginx is allow you to set a folder (or files) to be ignored so when the webpage is generated, a certain folder is not listed.

Let's say we had the following:

fancyindex_ignore Folder2

On the open.example.com page, you would then only see Folder1 and Folder3 listed, but if someone manually went to open.example.com/Folder2 they'd still see the contents of that folder.

I was hoping that something like that could be accomplished with a theoretical Caddyfile like so:

open.example.com {
  root /var/www/opendirectory
  browse {
    ignore "Folder2"
  }
}

The idea behind this is to keep folders/files easily accessible if linked to someone, but not openly advertised. It's really just basic cosmetic filtering. I think the same thing could be accomplished using text/template, but I'm not entirely sure yet.

I believe what I'm proposing above is more or less what was requested originally (https://github.com/mholt/caddy/issues/370#issue-119246578), but perhaps not. Also, I hope I've explained it clearly.
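
The listing-only filtering described above can be sketched in a few lines of Go. The function name filterListing is hypothetical and this is not how browse or fancyindex_ignore is implemented. The point is only that the ignore list affects what gets listed, while requests for the ignored names are served as usual.

```go
package main

import "fmt"

// filterListing returns the directory entries that should appear in a
// generated listing, skipping any name on the ignore list. It is a
// hypothetical helper: nothing here blocks direct requests for the
// ignored entries.
func filterListing(entries, ignore []string) []string {
	skip := make(map[string]bool, len(ignore))
	for _, name := range ignore {
		skip[name] = true
	}
	visible := make([]string, 0, len(entries))
	for _, e := range entries {
		if !skip[e] {
			visible = append(visible, e)
		}
	}
	return visible
}

func main() {
	entries := []string{"Folder1", "Folder2", "Folder3"}
	// prints [Folder1 Folder3]
	fmt.Println(filterListing(entries, []string{"Folder2"}))
}
```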

mholt commented 5 years ago

Ohhh okay. Yes, you should be able to do that using a custom browse template. This issue is about having Caddy as a whole pretend as if files don't exist.

ghost commented 5 years ago

Ok, so I did misunderstand, my mistake! Thank you for clarifying and sorry for wasting time!

habibalamin commented 5 years ago

I want to do something similar to #314. I want to have clean URLs without allowing dirty URLs to even exist (for SEO purposes and just general tidiness).

Here's the obvious workaround I tried, which didn't work:

  ext .html .xml

  status 404 {
    /index.html
    /about.html
    /feed.xml
    /404.html
  }

  errors {
    404 404.html
  }

It successfully responds with a 404 for the .html URLs, but it also does the same for the non-.html URLs. Using a rewrite rule in place of the ext directive doesn't work either:

rewrite {
  if {path} ends_with .html
  to /404.html
}

This is my use case for the proposed hide directive. It would mean hiding files while still allowing rewrites to those files to work correctly.

habibalamin commented 5 years ago

Oh, man, I got it working:

  ext .html .xml

  rewrite {
    ext .html .xml
    to /404.html
  }

  status 404 {
    /404.html
  }

  errors {
    404 404.html
  }

Clean URLs work, dirty URLs are true 404s.
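
To make the intended behaviour concrete, here is a small Go sketch of the routing outcome the configuration above aims for. It models only the observable result, not how Caddy's ext, rewrite, and status directives are implemented. The resolve function, the exists callback, and the handling of a trailing slash as an index lookup are all assumptions for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"path"
	"strings"
)

// cleanExts lists the extensions that are tried for clean URLs and
// rejected when requested directly.
var cleanExts = []string{".html", ".xml"}

// resolve is a hypothetical helper. It returns the file to serve and the
// status code for a request path, given a function that reports whether a
// file exists.
func resolve(reqPath string, exists func(string) bool) (string, int) {
	// Dirty URLs: a direct request for .html or .xml is a 404.
	ext := path.Ext(reqPath)
	for _, e := range cleanExts {
		if ext == e {
			return "/404.html", http.StatusNotFound
		}
	}

	// Assumption: a trailing slash means "look for an index file".
	base := reqPath
	if strings.HasSuffix(base, "/") {
		base += "index"
	}

	// Clean URLs: try each extension in order.
	for _, e := range cleanExts {
		candidate := base + e
		if candidate == "/404.html" {
			break // the error page itself is never a 200
		}
		if exists(candidate) {
			return candidate, http.StatusOK
		}
	}

	// Other files (stylesheets, images) are served as they are.
	if !strings.HasSuffix(reqPath, "/") && exists(reqPath) {
		return reqPath, http.StatusOK
	}
	return "/404.html", http.StatusNotFound
}

func main() {
	files := map[string]bool{
		"/index.html": true, "/about.html": true,
		"/feed.xml": true, "/404.html": true,
	}
	exists := func(p string) bool { return files[p] }
	for _, p := range []string{"/about", "/about.html", "/feed", "/"} {
		file, code := resolve(p, exists)
		fmt.Println(p, file, code)
	}
}
```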

mholt commented 5 years ago

Nice job, @habibalamin !

This has been implemented in Caddy 2. I think you'll like it. No rewrite hacks required.