Add exec shortcode - Githubissues

bep commented 9 years ago

Idea from this post: http://discuss.gohugo.io/t/inline-ditaa-dot-or-plantuml-sources-in-hugo-posts/589/3

Similar to Highlight, but for any binary that writes to stdout.

Two arguments:

Binary
Args
plus the inner content, passed to stdin

I don't see any obvious issues why we could/should not add this? Security issues? It's up to the developer to use or abuse this?
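A minimal sketch of the idea in Go, using only the standard `os/exec` package: a binary, its arguments, and the inner content piped to stdin. The function name `runFilter` is made up for illustration and is not part of Hugo; the example assumes a `cat` binary is on the PATH.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// runFilter (hypothetical name) starts binary with args, writes input to
// its stdin, and returns whatever the process wrote to stdout.
func runFilter(binary string, args []string, input string) (string, error) {
	cmd := exec.Command(binary, args...)
	cmd.Stdin = strings.NewReader(input)

	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("exec %s: %w (stderr: %s)",
			binary, err, strings.TrimSpace(stderr.String()))
	}
	return stdout.String(), nil
}

func main() {
	// Assumes `cat` exists; with no arguments it echoes stdin to stdout.
	out, err := runFilter("cat", nil, "inner content\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Nothing here restricts which binary runs, which is exactly the security question raised above.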

bep commented 9 years ago

Come to think of it, it will give theme-authors the opening to do harmful stuff ... like "rm -r /".

But it's an interesting thought - if we could restrict it to a set of commands.

Just tested cat combined with highlight and some go files. Very useful.

{{< highlight go >}}
{{< exec "cat" "/home/bep/dev/go/src/github.com/spf13/hugo/hugolib/shortcodeparser.go" >}}
{{< / highlight >}}

Note, the above will not make any sense without the fix for #797 - but it's kind of cool.

halostatue commented 9 years ago

I don’t think it gives theme authors anything—shortcodes aren’t rendered by templates, only content. Right?

anthonyfok commented 9 years ago

Interesting, and very useful indeed! :-)

Though, for security reasons, I think it should be a configurable option that is turned off by default. It may also be helpful to print out the exec command for each website rebuild so that the end user is fully aware of what is going on. And yes, restriction to a set of commands (and other needed sanitization) would be a good thing.

bep commented 9 years ago

@halostatue no, but shortcodes can be provided by themes. So a shortcode from a theme named coolinnocentshortcode is tempting to use.

But I think this is too useful to turn down without some thinking.

EDIT IN: I may be wrong -- if we restrict this to shortcodes only, this will only be available to content files ...

bep commented 9 years ago

But it would be cool if we could make some of the obvious available as both template function and shortcode (if they exist on the computer):

cat
tail
curl
??

spf13 commented 9 years ago

Sorry I'm a bit late to this. When I saw this I immediately thought that it was a security issue as you pointed out in your follow up.

I agree that the best approach would be to define a few commonly used commands.

cat & grep alone would be super powerful. curl may be too, but it could also pose a security risk.

Another concern is that these won't work for Windows.

bep commented 9 years ago

Yea, thinking a little about this. I foresee a support nightmare. "I get 'broken pipe' with Photoshop".

I might maintain a private patch for myself, but for Hugo it's maybe better to find a Go implementation similar to cat etc.

My current use case is what Martin Fowler calls "live code" in documentation - but much simpler than his setup, i.e. text extracts from source code on disk (I know about Gists).
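A pure-Go stand-in for `cat` that also supports extracts could be quite small. The sketch below is only an illustration of that direction; `extractLines` and `readExtract` are hypothetical names, and the 1-based inclusive line range is an assumption, not anything Hugo defines.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// extractLines returns lines from..to (1-based, inclusive) of src.
// Out-of-range bounds are clamped rather than treated as errors.
func extractLines(src string, from, to int) string {
	lines := strings.Split(strings.TrimRight(src, "\n"), "\n")
	if from < 1 {
		from = 1
	}
	if to > len(lines) {
		to = len(lines)
	}
	if from > to {
		return ""
	}
	return strings.Join(lines[from-1:to], "\n")
}

// readExtract reads path from disk and returns the requested line range.
func readExtract(path string, from, to int) (string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	return extractLines(string(data), from, to), nil
}

func main() {
	src := "package demo\n\nfunc A() {}\nfunc B() {}\n"
	fmt.Println(extractLines(src, 3, 4))
}
```

Because no external process is started, this avoids both the security and the Windows concerns, at the cost of covering only the "include a file" use case.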

rodlogic commented 9 years ago

My main motivation for this is so I can plug in tools such as plantuml, ditaa or graphviz to generate diagrams while building the site (Jekyll supports this quite well). AFAIK, this is not possible in Hugo at this point and that's a big minus to me.

Having curl, cat, etc. as standard shortcodes would not address what I am looking for. Either there is a generic exec shortcode or a specific one for the tools above (here a plugin approach would be best, especially since any support questions can be deferred to the developer of the plugin).

bep commented 9 years ago

Go's static linking is mostly a blessing, but it also kicks the feet under the plugin word.

Go has a brilliant and simple exec-package that is perfect for this purpose, but I don't see Hugo pulling in plantuml, ditaa ... as specific shortcodes/functions like Pygments.

Maybe if a general Exec could be secured (whitelist?)

lalloni commented 9 years ago

I've just found this PR while searching for some way of using plantuml from hugo and I must say that this looks almost exactly like what we need. Thanks!

WRT a possible plugin infrastructure in hugo/go: how about a plugin architecture in which plugins are standalone programs that expect arguments on stdin in some standard form (json, protobuf, etc.) and write answers to stdout using the same format (with possible secondary effects in the output fs directory)? These plugins should be registered at the project level (not from themes) and invoked from content or templates.
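To make the stdin/stdout idea concrete, here is a rough sketch of what the plugin side of such a protocol might look like with JSON. The `Request` and `Response` shapes, the `serve` and `roundTrip` helpers, and the `serve` command-line argument are all invented for this example; no such protocol exists in Hugo.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strings"
)

// Request and Response are hypothetical message shapes for the protocol.
type Request struct {
	Name    string   `json:"name"`
	Args    []string `json:"args"`
	Content string   `json:"content"`
}

type Response struct {
	Output string `json:"output"`
	Error  string `json:"error,omitempty"`
}

// serve reads one JSON request from r and writes one JSON response to w.
// Upper-casing the content stands in for a real transform such as plantuml.
func serve(r io.Reader, w io.Writer) error {
	var req Request
	if err := json.NewDecoder(r).Decode(&req); err != nil {
		return json.NewEncoder(w).Encode(Response{Error: "bad request: " + err.Error()})
	}
	return json.NewEncoder(w).Encode(Response{Output: strings.ToUpper(req.Content)})
}

// roundTrip exercises serve in-process, without starting a subprocess.
func roundTrip(requestJSON string) (Response, error) {
	var buf strings.Builder
	if err := serve(strings.NewReader(requestJSON), &buf); err != nil {
		return Response{}, err
	}
	var resp Response
	err := json.Unmarshal([]byte(buf.String()), &resp)
	return resp, err
}

func main() {
	// Invoked as a plugin process (hypothetical "serve" argument): talk over stdin/stdout.
	if len(os.Args) > 1 && os.Args[1] == "serve" {
		if err := serve(os.Stdin, os.Stdout); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		return
	}
	resp, err := roundTrip(`{"name":"upper","content":"graph { a -- b }"}`)
	fmt.Println(resp.Output, err)
}
```

The host side would start the registered program with `os/exec` and write the request to its stdin; who is allowed to register the program is the open question.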

bep commented 9 years ago

@plalloni well, the "plugin architecture" you want is basically a whitelist for the "exec template function", but who approves the whitelist?

There are great powers in Brad F. something's exec package, but as stated, there may be concerns about opening it up, even to a static site gen.

lalloni commented 9 years ago

@bjornerik Why is a whitelist needed if plugins are not to be registered by theme creators but only by site authors (i.e. a theme could require a certain plugin to be installed but cannot install it by itself, so the site/project owner would be required to install/register the plugin)?

spf13 commented 9 years ago

@plalloni There aren't "plugins". These would be methods for the template system (which would call exec) and there's not currently any way of restricting them from some templates and not others. Consequently we would need to keep it clean everywhere.

lalloni commented 9 years ago

@bjornerik Any chance I can get my hands on your patch?

bep commented 9 years ago

@plalloni it's not in a shareable state.

aliafshar commented 9 years ago

Limited range of commands won't work for me. This would be a great hook for running arbitrary external commands to perform transforms on part of the content. I will write a patch for my own use.

halostatue commented 9 years ago

I think that this can still be written in such a way as to be safe, using front matter or config.*. If I do:

exec.whitelist = [
  "cat",
  "tac",
  "curl"
]

This is something that the user must configure, and any command where the command does not exactly match a string in the whitelist will fail to run.
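The exact-match rule described here is easy to sketch in Go. The function name `isWhitelisted` is hypothetical, and the whitelist would presumably be loaded from the site config rather than hard-coded as below.

```go
package main

import "fmt"

// isWhitelisted reports whether command exactly matches a whitelist entry.
// There is no globbing and no PATH resolution: "/bin/cat" does not match "cat".
func isWhitelisted(command string, whitelist []string) bool {
	for _, allowed := range whitelist {
		if command == allowed {
			return true
		}
	}
	return false
}

func main() {
	whitelist := []string{"cat", "tac", "curl"}
	for _, c := range []string{"cat", "rm", "/bin/cat"} {
		fmt.Printf("%-8s allowed=%v\n", c, isWhitelisted(c, whitelist))
	}
}
```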

aliafshar commented 9 years ago

Here is my patch #847

Nothing fancy, adds an exec function that will be usable in shortcodes I hope. Usable for my purpose, but needs work for any thoughts of inclusion into the main tree.

lalloni commented 9 years ago

@aliafshar Thank you! I'll try it ASAP.

dcorb commented 9 years ago

I would love to see this feature in Hugo. This opens the door to all kinds of external utilities, from basic 'cat' to generating charts with d3js. Something like this: http://www.scottlogic.com/blog/2014/09/15/jekyll-d3js.html

Security is here the biggest concern, but this would be so useful.

As a follow-up: would it be possible to cache "exec" outputs, to prevent slowing down the system?
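One way the caching question could be approached is to key the output on a hash of the binary, its arguments, and the stdin content, so an unchanged block never triggers a second run. This is only a sketch: `execCache`, `cacheKey`, and `get` are hypothetical names, and the cache lives in memory, so it would only help within a single build or server session.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
	"sync"
)

// execCache (hypothetical) memoizes command output by a content hash.
type execCache struct {
	mu      sync.Mutex
	entries map[string]string
}

func newExecCache() *execCache {
	return &execCache{entries: map[string]string{}}
}

// cacheKey hashes the binary, each argument, and the stdin content.
// Length prefixes keep ("ab", "c") distinct from ("a", "bc").
func cacheKey(binary string, args []string, input string) string {
	h := sha256.New()
	for _, part := range append([]string{binary}, args...) {
		fmt.Fprintf(h, "%d:%s\n", len(part), part)
	}
	fmt.Fprintf(h, "stdin:%s", input)
	return hex.EncodeToString(h.Sum(nil))
}

// get returns the cached output for the key, calling run only on a miss.
// The lock is held while run executes, so runs are serialized in this sketch.
func (c *execCache) get(binary string, args []string, input string, run func() (string, error)) (string, error) {
	key := cacheKey(binary, args, input)
	c.mu.Lock()
	defer c.mu.Unlock()
	if out, ok := c.entries[key]; ok {
		return out, nil
	}
	out, err := run()
	if err != nil {
		return "", err
	}
	c.entries[key] = out
	return out, nil
}

func main() {
	cache := newExecCache()
	input := "graph { a -- b }"
	run := func() (string, error) {
		fmt.Println("running the (pretend) external command")
		return strings.ToUpper(input), nil
	}
	for i := 0; i < 2; i++ {
		out, _ := cache.get("dot", []string{"-Tsvg"}, input, run)
		fmt.Println(out)
	}
}
```

Hashing the inputs does not detect changes in files the command reads on its own, so such a cache would be wrong for something like `cat` on a file that changes between builds.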

lalloni commented 9 years ago

WRT security: what if allowed executables must be in some specific site-level path like /bin (where they can't be installed by theme creators)?

The site owner can symlink from bin to anywhere in the system, and/or install specific programs or scripts for calling tools like plantuml or d3.

anthonyfok commented 9 years ago

Or, how about this:

Normally, when hugo is run with no special flags, the exec statements are printed but not executed; the exec statements are only run when the option --unsafe is added?

As an added safety measure, even with --unsafe, perhaps Hugo would pause and list all exec statements, asking the user whether they are sure they want to proceed (y/N)? And perhaps an added flag --yes-i-really-know-what-i-am-doing is needed to skip that question?
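The "print, but only run with --unsafe" behaviour could be modelled roughly as below. This is a sketch under assumptions: `execRequest` and `planExec` are hypothetical names, the `unsafe` parameter stands in for the proposed flag, and the interactive y/N prompt is left out.

```go
package main

import "fmt"

// execRequest (hypothetical) is one exec statement found in the site.
type execRequest struct {
	Binary string
	Args   []string
}

// planExec always reports every request, but only returns requests to run
// when unsafe is true.
func planExec(unsafe bool, requests []execRequest) (toRun []execRequest, report []string) {
	for _, r := range requests {
		if unsafe {
			toRun = append(toRun, r)
			report = append(report, fmt.Sprintf("exec: %s %v", r.Binary, r.Args))
		} else {
			report = append(report, fmt.Sprintf("skipped (no --unsafe): %s %v", r.Binary, r.Args))
		}
	}
	return toRun, report
}

func main() {
	requests := []execRequest{
		{Binary: "plantuml", Args: []string{"-pipe"}},
		{Binary: "cat", Args: []string{"README.md"}},
	}
	_, report := planExec(false, requests)
	for _, line := range report {
		fmt.Println(line)
	}
}
```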

aliafshar commented 9 years ago

And, I changed my mind here. I am running an independent preprocessor on the content. I think the main problem here is that hugo will at some point need a way of being extended with plugins that execute code, not just template shortcodes - makes long-term sense for a system like this.

Then security can be addressed by explicitly allowing plugins. I appreciate Golang is going to make this very hard.

lalloni commented 9 years ago

@aliafshar I agree. I've been thinking about "plugins" as executable files/links somewhere in the site directory structure. Hugo could run them as subprocesses and communicate with them using stdin/out. They could be one-off subprocesses or "server style" processes (for keeping latency as low as possible) attending many requests.

timesking commented 9 years ago

How about this plugin framework, https://github.com/dullgiulio/pingo

spf13 commented 9 years ago

@timesking It's brand new and will require some testing. I think it has a lot of potential.

bep commented 9 years ago

Re. plugin framework, I think @natefinch got it conceptually right, see discussion at http://discuss.gohugo.io/t/using-for-plugins-in-hugo

lalloni commented 9 years ago

@bep ... isn't that essentially what I proposed above?

Also, recently I've been exposed to Terraform's "plugin architecture" (I know that concept is not welcomed around here, sorry) which introduces a specific rpc protocol between parent & plugin processes and could be applicable too. See here.

bep commented 9 years ago

@plalloni you are right, this is exactly what you outlined above. I have seen it mentioned before on the go-nuts group, so the idea isn't sparkling new -- but @natefinch is the first to put it into a well-tested (haven't looked at the coverage, but it looks fine) library.

There is still the question of security, though. If this should be of any value to the common Hugo user, he should be able to wire up his own plugins. He should be allowed to hang himself, but not be hanged by others ... so to speak.

ghostsquad commented 9 years ago

:+1: I would love to see Hugo Plugins.

flyisland commented 9 years ago

I do believe that text-based diagram generators like plantuml, ditaa or dot are very powerful for technical content creators, and are a "must-have" feature for any static website generator.

For security concern, I think @halostatue 's idea is great:

This is something that the user must configure

Hugo will not and should not be responsible for any third-party plugin, the end user should download, install, enable and configure the third-party plugin by themselves. But Hugo would let the user know which plugins are about to run, and show the output message of these plugins to the user.

uliska commented 9 years ago

I admit I won't add anything substantial to the discussion but want to put some stress on the opinion that not having the option to call external commands in any way would really be a big minus, if not a show-stopper, for Hugo.

My use-case is similar to others described here: I am looking for a site generator that allows me to apply some custom syntax highlighting that is implemented in a Python script. And I want to include images generated from input code with an external program (namely LilyPond).

I also think that the suggestion above to make external commands available through a whitelist defined on project level would be a good compromise between security and flexibility.

anthonyfok commented 9 years ago

My use-case is similar to others described here: I am looking for a site generator that allows me to apply some custom syntax highlighting that is implemented in a Python script. And I want to include images generated from input code with an external program (namely LilyPond).

So glad to see you here, @uliska! (I am a lurker on LilyPond mailing lists.)

You are right: the ability to use Hugo with LilyPond (and perhaps Gregorio too) would be heaven! :-)

uliska commented 9 years ago

@anthonyfok Actually what I'm investigating is a solution for the documentation of (the "new") openLilyLib.

I found the plugin architecture of GitBook very nice where you can apply arbitrary JavaScript code on blocks of input - and this includes calling shell commands like the custom Python script as you can see here. Here you can see it in action (although in a very much sketchy document). However, GitBook is so much targeted at a "book" style output that I can't use it for the purpose of a general website or even a generated documentation.

abourget commented 8 years ago

Folks, I'd like something like this too, for plantuml graphs..

Where are we on this? Anything merged?

bosr commented 8 years ago

:+1:

abourget commented 8 years ago

I'd propose adding a whitelist of fully-qualified command paths that we are limited to passing on the command line or in the site config (with no possibility of override from themes).

ExecWhitelist:

/usr/bin/grep

and the exec call could simply use {{% exec grep %}} or something.. that would be resolved through the user's bin path resolution.

An "exec" shortcode could filter its content through exec.Command() - checking permissions first - using stdin/out.

I want that.. if I write it, can we merge it ? I want plantuml in there..

Please ! :)
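A sketch of that check in Go: resolve the bare command name through the PATH with `exec.LookPath`, then require the resolved path to appear in the whitelist of fully-qualified paths. `resolveWhitelisted` is a hypothetical name, and `/usr/bin/grep` in `main` is only an example location that may differ between systems.

```go
package main

import (
	"fmt"
	"os/exec"
)

// resolveWhitelisted resolves name via PATH and returns the resolved path
// only if that path is listed in whitelist.
func resolveWhitelisted(name string, whitelist []string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("resolve %q: %w", name, err)
	}
	for _, allowed := range whitelist {
		if path == allowed {
			return path, nil
		}
	}
	return "", fmt.Errorf("%q resolves to %q, which is not whitelisted", name, path)
}

func main() {
	// Example whitelist entry; the real location of grep varies by system.
	whitelist := []string{"/usr/bin/grep"}
	path, err := resolveWhitelisted("grep", whitelist)
	if err != nil {
		fmt.Println("denied:", err)
		return
	}
	fmt.Println("allowed:", path)
}
```

Comparing resolved paths means a different `grep` earlier in the PATH is rejected instead of silently executed.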

aliafshar commented 8 years ago

Sorry, I have to basically strongly disagree with not allowing full shell access. A static site generator is being run by someone that has access to the shell, I don't understand the security rationale here. If anything I would love to be able to have my content as a result of shell pipes. javadoc | htmltidy | injectstyles etc etc. Whitelist is just too limited. This should be a core feature imho.

Note: I don't use Hugo because of this limitation.
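For comparison, a pipeline of that kind does not strictly need a shell. The sketch below runs each stage in turn and feeds its output to the next stage's stdin; it buffers intermediate output rather than connecting real OS pipes. `runPipeline` is a hypothetical name, and the `tr` and `cat` stages in `main` are stand-ins assumed to exist on the system.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// runPipeline runs each stage (binary followed by its arguments) in order,
// passing the previous stage's stdout as the next stage's stdin.
// No shell is involved, so arguments are not expanded or re-parsed.
func runPipeline(input string, stages ...[]string) (string, error) {
	data := input
	for _, stage := range stages {
		if len(stage) == 0 {
			return "", fmt.Errorf("empty pipeline stage")
		}
		cmd := exec.Command(stage[0], stage[1:]...)
		cmd.Stdin = strings.NewReader(data)
		out, err := cmd.Output()
		if err != nil {
			return "", fmt.Errorf("stage %q: %w", stage[0], err)
		}
		data = string(out)
	}
	return data, nil
}

func main() {
	out, err := runPipeline("some content\n",
		[]string{"tr", "a-z", "A-Z"},
		[]string{"cat"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```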

moorereason commented 8 years ago

@abourget, I agree with @anthonyfok that we should add some kind of --unsafe command line flag. I'd hate to download somebody's repo to help them on an issue in the forums and there be malicious execs inside.

I go back and forth on the whitelist issue. SysAdmins should be able to modify environments to mitigate this in a service-oriented scenario. Can we use filepath.Glob to allow people to whitelist whole directories? RegExp would probably over-complicate this situation.

bep commented 8 years ago

I have been hesitant about this from the start. @moorereason presents some good arguments. For me this is also a little bit related to the "Hugo should delete /public before builds" topic. This seems like a simple task to do right, but it's also a simple task to get horribly wrong in a multi-platform open source program. And even if it says "use as is and no warranties" in the LICENSE -- I'm reluctant to be the one who deletes valuable data on other people's computers.

abourget commented 8 years ago

I'd be fine with an --unsafe flag.. and it is true that you execute the program on the shell.

However, having a whitelist would make things slightly more robust, and it wouldn't prevent those wanting full shell access from writing a simple "do_awesome_pipes_magic.sh" and whitelisting that.

On Wed, Feb 24, 2016 at 14:16, Bjørn Erik Pedersen notifications@github.com wrote:

I have been hesitant about this from the start. @moorereason presents some good arguments. For me this is also a little bit related to the "Hugo should delete /public before builds" topic. This seems like a simple task to do right, but it's also a simple task to get horribly wrong in a multi-platform open source program. And even if it says "use as is and no warranties" in the LICENSE -- I'm reluctant to be the one who deletes valuable data on other people's computers.


christophermancini commented 8 years ago

:+1: I would love plugin / cli execution functionality.

oasic commented 8 years ago

I would have to agree with the others who have said this is a must have feature. I'm choosing MiddleMan right now for a project because this very important feature is lacking in Hugo.

danbarbarito commented 8 years ago

+1 I would love a feature like this

abourget commented 7 years ago

Do we have agreement on --unsafe-exec for example ? If so, I'll implement it and propose it here. I need that, in the past, now and in the future.

bep commented 7 years ago

Do we have agreement on --unsafe-exec for example

No.

moorereason commented 7 years ago

I know plugins came up earlier in this discussion, but that's a whole different animal. I'm ignoring plugins for now.

My proposal for an exec feature:

Add --unsafe-exec command-line option that defaults to false. No matching site config option.
Add execWhitelist (a slice of filepath.Match patterns) to the site config.
Add exec template function and shortcode.
Add allowThemeExec bool to site config to allow exec usage in themes. Default is false.
Using exec without --unsafe-exec and a matching execWhitelist entry generates a fatal build error.
Using exec in a theme without --unsafe-exec, a matching execWhitelist entry, and allowThemeExec enabled generates a fatal build error.

Example whitelist config:

execWhitelist = [
    "/opt/mytools/bin/*",  # absolute path
    "curl",                # searches $PATH
    "*",                   # you don't care about security

   "bin/*",                # error here? relative path (contains path separator);
                           # must be bare command or absolute path?
]
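The rules in this proposal can be sketched as a single check in Go. Everything named here is hypothetical (`execConfig`, `checkExec`, and the field names); the patterns are evaluated with the standard library's `filepath.Match`, which is an assumption about the intended syntax, and the command is matched as written rather than after PATH resolution.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// execConfig (hypothetical) mirrors the proposed settings.
type execConfig struct {
	UnsafeExec     bool     // would come from the --unsafe-exec flag only
	ExecWhitelist  []string // patterns from the site config
	AllowThemeExec bool     // site config; false by default
}

// checkExec returns nil if command may run, or an error describing which
// rule blocked it. fromTheme marks exec calls that originate in a theme.
func checkExec(cfg execConfig, command string, fromTheme bool) error {
	if !cfg.UnsafeExec {
		return errors.New("exec used without --unsafe-exec")
	}
	if fromTheme && !cfg.AllowThemeExec {
		return errors.New("exec used in a theme but allowThemeExec is disabled")
	}
	for _, pattern := range cfg.ExecWhitelist {
		ok, err := filepath.Match(pattern, command)
		if err != nil {
			return fmt.Errorf("bad execWhitelist pattern %q: %w", pattern, err)
		}
		if ok {
			return nil
		}
	}
	return fmt.Errorf("%q matches no execWhitelist entry", command)
}

func main() {
	cfg := execConfig{
		UnsafeExec:    true,
		ExecWhitelist: []string{"/opt/mytools/bin/*", "curl"},
	}
	for _, cmd := range []string{"/opt/mytools/bin/plantuml", "curl", "rm"} {
		fmt.Printf("%-28s %v\n", cmd, checkExec(cfg, cmd, false))
	}
}
```

One detail worth noting: with `filepath.Match`, `*` does not cross path separators, so a bare `"*"` entry would match `curl` but not `/usr/bin/curl`.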

My main concerns with exec:

Cloning someone's site to troubleshoot. Answer: don't use --unsafe-exec.
Theme authors can hurt me. Answer: don't use --unsafe-exec. If you want to use exec in your layouts, don't enable allowThemeExec.
Hosting providers could be affected. Providers, don't allow your customers to pass --unsafe-exec in the Hugo build command and/or don't allow them to upload binaries in their site. Secure your environment, otherwise.

abourget commented 7 years ago

Follow up on discussion in https://gitter.im/spf13/hugo today:

use cases (GraphViz/LilyPond/pygments/PlantUML)
security (explicit flag --unsafe-slow-exec, on a tool the admin runs explicitly on the CLI)
forum management (with slow in the flag)
simple implementation, implementing a simple yet flexible protocol (proposed by this Issue, implemented by https://github.com/spf13/hugo/issues/847, plus some refinements about flags).

Flags in the form of:

--unsafe-slow-exec /usr/bin/exec1,/usr/bin/exec2 or
--unsafe-slow-exec /usr/bin/exec1 --unsafe-slow-exec /bin/exec2
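Both flag forms can be handled by one custom flag value that splits on commas and accumulates across repeats. The sketch uses Go's standard `flag` package purely for illustration; Hugo's actual command-line handling may differ, and `pathList` and `parseExecFlags` are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"strings"
)

// pathList collects executables from repeated and comma-separated flag values.
type pathList []string

func (p *pathList) String() string { return strings.Join(*p, ",") }

// Set is called once per occurrence of the flag on the command line.
func (p *pathList) Set(value string) error {
	for _, item := range strings.Split(value, ",") {
		if item = strings.TrimSpace(item); item != "" {
			*p = append(*p, item)
		}
	}
	return nil
}

// parseExecFlags extracts the whitelisted executables from args.
func parseExecFlags(args []string) ([]string, error) {
	var allowed pathList
	fs := flag.NewFlagSet("hugo", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	fs.Var(&allowed, "unsafe-slow-exec", "comma-separated executables allowed to run")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return allowed, nil
}

func main() {
	a, _ := parseExecFlags([]string{"--unsafe-slow-exec", "/usr/bin/exec1,/usr/bin/exec2"})
	b, _ := parseExecFlags([]string{"--unsafe-slow-exec", "/usr/bin/exec1", "--unsafe-slow-exec", "/bin/exec2"})
	fmt.Println(a)
	fmt.Println(b)
}
```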

Quite a few people are interested in hooks or plugins in a more Go-like fashion. Perhaps we can have another Issue for that, as this one is related to an exec feature.

There have been concerns about shareability of pieces by content authors, but this Issue doesn't address this. It is more for featurefulness.

abourget commented 7 years ago

Perhaps we can think of different ways (config.ini ?) to pass those whitelisted binaries ?

lamvak commented 7 years ago

I applaud the level of concern given to security. I'm sorry for the length of the below - I really tried to boil it down. I hope it's still worth your time.

Is there any documentation regarding the security model for Hugo? Say, an overview of modules, actors, and the deployment model, such as target platforms (software meant for a local development desktop rather than a production server), as well as a description of security responsibilities or assumptions for those (i.e. security defaults, or an explanation of the concept of keeping the 'serve' module secure by listening only on the local interface, etc.). With some general guidelines, the definition of such a document could be driven by Q&A from devs/contributors/users.

The themes are understood as separate software components, which is evidenced in liberties given to themes developers and users, as well as separate themes licensing. I would propose that in essence the Hugo themes user (person installing them) is ultimately responsible for their secure use. User must either understand the software being installed - or trust the theme development process (i.e. community with peer review) - no innovation here.

The theme author together with theme users are the final arbiters of what code/use cases are sensible for extended, complex additional theme programming, so long as no one feels entitled to have their code pulled into Hugo. This means I can ship a theme today with a shell script providing some preprocessing of the content files, and users may choose to use the theme. While it makes sense that you may decline support for such a theme and may wish not to use it yourself, would you go so far as to forbid the practice? If you wouldn't, the issue boils down not to having arbitrary code in themes, but to having Hugo responsible for triggering it. If you accept, as I propose above, that it's ultimately the users' responsibility to understand what software is in use, the question shifts to one of balance between securing the use of Hugo on behalf of the users and flexibility and ease of use for the custom code.

Additionally, a point that relates not only to plugins but also to the mentioned Hugo features like removing data from the public directory: would it be more productive to solve security issues in the actual security domain? I.e. do not run as a privileged user, do not run on production servers; if feasible, run as a dedicated user with limited resource access, or even in a chroot-like environment. Of course, this is not necessarily something to be built into Hugo.

gohugoio / hugo

Add exec shortcode #796