RFC - Githubissues

alphapapa / makem.sh

Makefile-like script for linting and testing Emacs Lisp packages

GNU General Public License v3.0

163 stars 13 forks source link

RFC #3

Open alphapapa opened 4 years ago

alphapapa commented 4 years ago

@DamienCassou

Hi Damien,

As you're the author of makel and recently mentioned interest in eldev, may I ask for your feedback on makem.sh as well? It would be valuable to me.

alphapapa commented 4 years ago

BTW, I've written some comparisons between makem.sh and the similar projects, including makel. Please let me know what you think. https://github.com/alphapapa/makem.sh#comparisons

@doublep @rejeep @vermiculus If you are willing, please review the comparisons I wrote and let me know what improvements may need to be made. I aimed to be fair, but I'm not an expert on these other tools. I would also appreciate any other feedback you may have.

doublep commented 4 years ago

Eldev expects (though it does not strictly require) some initialization and configuration for each project before use.

I guess you can drop the "though it does not strictly require" part. While technically correct, as soon as you have at least one dependency, you need to at least tell Eldev which package archive(s) to use.

Eldev appears to use the developer’s local Emacs configuration for installing package dependencies

No, Eldev runs in Emacs completely separated from ~/.emacs and ~/.emacs.d. Dependencies are installed in directory .eldev/EMACS-VERSION/packages inside the project being developed. Eldev has concept of "local dependencies", but those are again unrelated to your "normal Emacs". Those are rather dependencies that are not downloaded from Melpa (or something similar), but loaded from your local working directory, so that you can test changes in project B from project A without committing, pushing, waiting for Melpa to package it, etc.

By the way, when writing Eldev I was only aware of Cask. Now build tools for Elisp suddenly start appearing one after another ;)

DamienCassou commented 4 years ago

You did an awesome job @alphapapa. It seems you have improved the concepts and implementation of makel to the extent of making my project ridiculous :-). I don't think there is any reason to use makel anymore but I haven't used makem.sh so I can't be sure. Really good job. I've just added a paragraph for makem.sh at the end of makel's README.

Something that could be missing is a suite of tests. You could take inspiration from makel test suite.

The next step is for all of us to merge those tools into just one and be happy about it.

alphapapa commented 4 years ago

@doublep Thanks, I made corrections according to your comment. Do those appear to be correct?

@DamienCassou Thanks! I'll look at your test suite and see what I can do.

alphapapa commented 4 years ago

@DamienCassou I've started a branch with a test suite (a test suite for a test suite gets a little confusing at times, haha): https://github.com/alphapapa/makem.sh/blob/feature/tests/tests/tests.sh I'm not sure how comprehensive I'm going to make it at first, but it's designed to make it easy to test makem.sh functions and rules with a minimum of boilerplate. Please let me know what you think, and feel free to contribute some test tests. :)

rejeep commented 4 years ago

Looks fair enough. I'd say that a downside with this approach is that it's bash, which in my experience becomes very complex when you want to do anything a little bit more advanced. Also, since there are no tests, I would not like to contribute because breaking it is too easy.

Cask is intended to be installed by using curl to download a script which is piped to Python...

This is just the easiest way to do it, cloning the repo works just as well and is how I do it.

vermiculus commented 4 years ago

First off, I want to say this is really cool. I'm glad to see more realizations that tools like cask are overpowered – and perhaps a little heavy-handed – for simple testing of elisp (for varying definitions of 'simple').

emake is intended to be installed by using curl to download a script which is piped to a shell, and it appears to make further use of downloading remote shell scripts at runtime, at least for initialization. This is a dangerous, insecure anti-pattern. makem.sh is intended to be copied into place by the package developer, and its code is easy to inspect. No remote code is downloaded, other than installing Emacs package dependencies when requested.

I wouldn't say it's the intended usage :-) I included the local setup script at request; piping it to a file is following suit with tools like Homebrew et al. Of course you are welcome and encouraged to inspect the script before running :-)

The comparison of copying/inspecting/running vs. downloading/[inspecting]/running I think is a matter of choice, not design. Your script could be run without inspection the exact same way – it's up to sane developers to inspect code before running it (esp. to avoid, if nothing else, MITM attacks.)

emake provides some tools for building Emacs versions locally and on CI systems.

This is incorrect, but it's also weird. The Emake repo includes a script that can install emacs for convenience usage on CI, but it definitely should not be used locally. What I consider to be Emake (i.e., the elisp itself, which I know this line is really fuzzy with this project) can't install emacs for you for obvious reasons :-)

It appears that emake may be run locally rather than only on remote systems like Travis CI or GitHub Actions, but that extensive configuration and initialization is required.

You are correct that Emake can be run locally, though I wouldn't say that extensive configuration is required :-) It's probably best to leave comparative sections objective as to avoid connotations altogether. A good measure might be the minimum number of variables required to run (with recommended setup, so that would include things like using Emake's complementary Makefile). I think Emake requires just one/three depending on how you count: EMACS_VERSION, which everyone should already have set IMHO and is used to verify you're testing the correct version of emacs, PACKAGE_BASENAME, which cannot be avoided, and EMAKE_SHA1, which isn't really necessary if you also track emake.el as you would with makem.sh.

With recommended setup (i.e. using the new script and Emake's included Makefile), EMACS_VERSION and PACKAGE_BASENAME are all you need.

emake is a 700-line Emacs Lisp file, with an optional 100-line Makefile that provides some default configuration. makem.sh is about 600 lines of Bash code in one file.

I just want to call out that this is really cool you were able to do this completely with Bash. I wanted to keep things as emacs-y as possible (leaving open the possibility of one day running tests in an emacs session, somehow) but it's cool to see alternate approaches :-) (And it's equally impressive? insane-in-a-fun-way? that you shove so much elisp to emacs via command-line arguments – all without missing parens :-).)

I wanted to point out that a key design decision of Emake is that it was intended to be extended – hence the documentation and the declare forms I included. It allows for creating entirely new types of tests with minimal hacking. (This was necessary for me personally since I need to run what would essentially be considered proprietary tests.)

With all things, there's going to be a balance between convention and configuration. It looks like makem.sh is heavy on the convention side (but then again, so is the new script) and this is not a bad thing – it's simply something to consider when comparing tools. You always want the best tool for your job and I'm looking forward to seeing where makem.sh fits in :-)

vermiculus commented 4 years ago

Oh boy I do write a lot don't I… 😉

noctuid commented 4 years ago

I'm eager to try this out. I'd love to just add this as a submodule to my projects instead of trying to keep my own Makefile updated in every one.

It would be nice if everything elisp-lint does was supported (I don't see support for elisp-lint's check-declare, fill-column, indent-character, etc.). I can make a separate issue if you'd like.

Bash suggestions that may or may not be helpful:

Use #!/usr/bin/env bash for improved portability (otherwise it will not work on NixOS, for example)
Fix shellcheck warnings/errors (or disable where invalid)
the BashFAQ suggestion is to never use getopt; I personally use getopts with some boilerplate to handle longopts (args in regex check are those that do not take an argument)
function was added to bash/zsh for compatibility with ksh's function definition syntax; the bash/posix name() ... syntax is more common; stylistic issue that doesn't really matter

alphapapa commented 4 years ago

@noctuid Thanks for the feedback!

It would be nice if everything elisp-lint does was supported (I don't see support for elisp-lint's check-declare, fill-column, indent-character, etc.). I can make a separate issue if you'd like.

I had not seen that package before! I've added it to https://github.com/alphapapa/emacs-package-dev-handbook

It has a lot of checkers, some of which overlap with others already supported here. I'm not sure about the best way to handle the overlap. What do you think?

Bash suggestions that may or may not be helpful:

Use #!/usr/bin/env bash for improved portability (otherwise it will not work on NixOS, for example)

Done.

Fix shellcheck warnings/errors (or disable where invalid)

I do use shellcheck with Flycheck, and it can be very helpful. Some of its warnings are spurious or a matter of style, so I don't want to fix all of them. I don't know if it's worth disabling specific instances. If you can show me some examples of doing that, I might be willing to.

the BashFAQ suggestion is to never use getopt; I personally use getopts with some boilerplate to handle longopts (args in regex check are those that do not take an argument)

I looked at the FAQ, and AIUI, using getopts means giving up the ability to use long-form arguments, which I'm not willing to do. Of course, it would be possible to write another level of manual argument handling for that, as you showed, but I would prefer not to do that when getopt handles it well.

GNU getopt is part of util-linux, which is an "essential" package on Ubuntu and Debian (as it includes tools like fdisk), so AFAIK it should be present on all standard GNU/Linux systems. Users of other systems should be able to install it easily (e.g. Mac users can install it with Homebrew, as I've documented on some of my other repos, like https://github.com/alphapapa/restic-runner#macos). So unless this proves to be a serious hurdle for users, I'd prefer to use getopt.

function was added to bash/zsh for compatibility with ksh's function definition syntax; the bash/posix name() ... syntax is more common; stylistic issue that doesn't really matter

Yeah, I prefer this style (useless parens are useless). I'm purposely embracing "Bashisms" and avoiding canonical POSIX sh, because they make for more robust and pleasant scripting, and I'm not very concerned about non-free platforms.

alphapapa commented 4 years ago

@rejeep

Looks fair enough. I'd say that a downside with this approach is that it's bash, which in my experience becomes very complex when you want to do anything a little bit more advanced.

It can be, yes. I think I've developed a style (dare I say it, even a somewhat Lispy one) that makes the code fairly simple and pleasant. Python is certainly a cleaner language, but Bash is more stable (i.e. unchanging) and ubiquitous, and we need no additional dependencies, so I think it's a good choice for this purpose. I like being able to drop the Bash script into a repo and run it directly, without having to install tooling external to the package repo. Of course, it doesn't necessarily replace Cask, since it doesn't do everything Cask can. :)

Also, since there are no tests, I would not like to contribute because breaking it is too easy.

Understood. I have a branch that adds some testing, which I'll probably merge soon. I don't plan to write tests for everything at first, since there are so many combinations to account for (@doublep's tests for Eldev are impressive!), but I'll add some to start, and more can be added as needed.

Cask is intended to be installed by using curl to download a script which is piped to Python...

This is just the easiest way to do it, cloning the repo works just as well and is how I do it.

Do you mean cloning the Cask repo into an Elisp package's repo, or into the user's home directory, or...? I went by the installation instructions in the documentation, which I guess most users will use.

Thanks for your review and your work on your other Emacs packages.

alphapapa commented 4 years ago

@vermiculus

The comparison of copying/inspecting/running vs. downloading/[inspecting]/running I think is a matter of choice, not design. Your script could be run without inspection the exact same way – it's up to sane developers to inspect code before running it (esp. to avoid, if nothing else, MITM attacks.)

Of course. I went by the instructions in the documentation, expecting that users will follow them rather than spending the time to figure out if other installation methods might work.

Forgive me if I seem hostile toward the curl | sh pattern, but it really is a dangerous anti-pattern that should never be recommended. For example, it's been shown that hostile servers or MITM attacks can present different content depending on whether a browser or a shell is receiving it, so users who view the contents of a URL in a browser and think it's safe may be deceived. It's always necessary to actually download and save the script and then inspect it, so the curl | sh pattern should never be recommended.

This is incorrect, but it's also weird. The Emake repo includes a script that can install emacs for convenience usage on CI, but it definitely should not be used locally. What I consider to be Emake (i.e., the elisp itself, which I know this line is really fuzzy with this project) can't install emacs for you for obvious reasons :-)

Thanks for the clarification. I'll adjust that comparison shortly...

You are correct that Emake can be run locally, though I wouldn't say that extensive configuration is required :-) It's probably best to leave comparative sections objective as to avoid connotations altogether.

I'm aiming to provide users lightly opinionated guidance toward the best general-purpose solution. They can overlook any adjectives that seem too judgmental. ;)

A good measure might be the minimum number of variables required to run (with recommended setup, so that would include things like using Emake's complementary Makefile).

Right. I tried to get the gist of how many such variables are required from skimming the docs and the code.

With recommended setup (i.e. using the new script and Emake's included Makefile), EMACS_VERSION and PACKAGE_BASENAME are all you need.

IMO I don't think it should be necessary to set boilerplate like that. I've seen no need to set either of those manually so far. At least, not for what makem.sh currently does, which is to run common linters and test suites. Now if you start doing more complex stuff, like testing locally on multiple Emacs versions automatically with persistent "sandboxes" to avoid installing dependencies on every test run, then it might be necessary to do more with the version. However, the GitHub Actions test.yml file in the makem.sh repo doesn't require that, and neither would testing locally in temporary sandboxes.

I just want to call out that this is really cool you were able to do this completely with Bash. I wanted to keep things as emacs-y as possible (leaving open the possibility of one day running tests in an emacs session, somehow) but it's cool to see alternate approaches :-) (And it's equally impressive? insane-in-a-fun-way? that you shove so much elisp to emacs via command-line arguments – all without missing parens :-).)

I guess, technically, it's not all in Bash, since there is some embedded Elisp. :) But, yes, I wanted to keep it all in one file to simplify usage, i.e. I want to be able to copy a single file into my package repo (optionally the Makefile as well, for that calling convention) and run it. And as much as I enjoy Elisp, it would be much more tedious to write all of this in it. I think Bash is a better tool for the job as-is. (Now, if someone were to write some kind of library to make working with shell commands and pipes in Elisp easier...)

I wanted to point out that a key design decision of Emake is that it was intended to be extended – hence the documentation and the declare forms I included. It allows for creating entirely new types of tests with minimal hacking. (This was necessary for me personally since I need to run what would essentially be considered proprietary tests.)

That's a really interesting feature. I looked at it briefly, but I don't understand exactly why you use declare forms. If you're inclined to write more about that, I'd like to learn more.

With all things, there's going to be a balance between convention and configuration. It looks like makem.sh is heavy on the convention side (but then again, so is the new script) and this is not a bad thing – it's simply something to consider when comparing tools. You always want the best tool for your job and I'm looking forward to seeing where makem.sh fits in :-)

Yes, you seem to understand me. I have so many little packages that I'd like to be able to easily lint and test automatically, e.g. in a git pre-push hook and with CI, and having to manually configure lists of files and versions and package names for each of them would be very tedious. For the common cases, that information can be discovered and parsed automatically, so makem.sh is plug-and-play, out-of-the-box, turnkey-ready, and buzzword-compliant. :)

Thanks very much for your feedback!

alphapapa commented 4 years ago

@vermiculus Please let me know if this clarification seems right: https://github.com/alphapapa/makem.sh/commit/2b3018692312c694097541eb67bf75f9534be89c

noctuid commented 4 years ago

It has a lot of checkers, some of which overlap with others already supported here. I'm not sure about the best way to handle the overlap. What do you think?

Not sure, but I think it would be fine to use it for the checkers makem doesn't already have. I guess it depends on how simple the checker is.

Some of its warnings are spurious or a matter of style, so I don't want to fix all of them. I don't know if it's worth disabling specific instances. If you can show me some examples of doing that, I might be willing to.

I think it's best to silence any spurious warnings so you know that any potential issues have been fixed or explicitly marked as invalid. I saw at least one warning that looked like it could be a bug depending on the intended behavior (&& followed by ||, which is different from if then else). As for quoting, I think it's worth quoting variables even if you know that word splitting won't affect them. Shell check is pretty smart about this and won't ask you to quote variables in assigment and [[]] statements where word splitting doesn't happen. It also won't complain if you don't quote a variable that is just a constant string with no whitespace.

You can disable a warning by putting # shellcheck disable=<rule> on the line before it. If you don't ever want to see some warning, you can put that right under the shebang, but I think all or most all of the warnings are useful. See here for more information.

So unless this proves to be a serious hurdle for users, I'd prefer to use getopt.

I don't like the idea of having getopt point to gnu-getopt instead of the default bsd getopt since it could break scripts that expect posix/bsd getopt, but it may not actually cause any problems. The user could still easily create a workaround for that case if it became an issue. Some other things require changes to work outside linux (e.g. bsd sed has no -r), but that could be handled in a similar way (e.g. wrapper script that uses gsed as sed and gnu's getopt as getopt). I think it's fine to leave as-is. A wrapper script to support bsd/osx could be added later if necessary.

I noticed that the current dependency check does not handle something like ((dependency "version") dependency-no-version) (which I've seen some packages use). Maybe use something like this:

gawk 'match($0, /^ *;+ *Package-Requires:/, package_line) {
    gsub(package_line[0], "");
    gsub(/[\(\)]/,"");
    for (i = 1; i <= NF; i++) {
        if ($i !~ /".*"/ && $i != "emacs") {
            print $i
        }
    }
}' files...

Yeah, I prefer this style (useless parens are useless). I'm purposely embracing "Bashisms" and avoiding canonical POSIX sh, because they make for more robust and pleasant scripting, and I'm not very concerned about non-free platforms.

Removing the parens is fine, but function is also unnecessary and not really bashish. Doesn't matter either way though.

I wasn't sure about why if then is sometimes used and && is sometimes used, but that doesn't really matter either.

I only looked over it briefly, but it seems clean, simple, and readable. No massive functions or complex logic. I agree that bash seems like a reasonable choice.

alphapapa commented 4 years ago

It has a lot of checkers, some of which overlap with others already supported here. I'm not sure about the best way to handle the overlap. What do you think?

Not sure, but I think it would be fine to use it for the checkers makem doesn't already have. I guess it depends on how simple the checker is.

I made a WIP branch to add elisp-elint support, but most of the checkers it has are already covered here, so I'm not going to add support for it now.

Some of its warnings are spurious or a matter of style, so I don't want to fix all of them. I don't know if it's worth disabling specific instances. If you can show me some examples of doing that, I might be willing to.

I think it's best to silence any spurious warnings so you know that any potential issues have been fixed or explicitly marked as invalid.

Shellcheck in particular is very strict, so I take its input under advisement. ;)

I saw at least one warning that looked like it could be a bug depending on the intended behavior (&& followed by ||, which is different from if then else).

That's an example of Shellcheck not being able to understand the purpose of the code. Those instances are used correctly, and I don't want to litter the code with shellcheck disable comments everywhere they're used.

As for quoting, I think it's worth quoting variables even if you know that word splitting won't affect them. Shell check is pretty smart about this and won't ask you to quote variables in assigment and [[]] statements where word splitting doesn't happen. It also won't complain if you don't quote a variable that is just a constant string with no whitespace.

I generally agree. If I know that a variable won't contain whitespace, sometimes I leave it unquoted.

So unless this proves to be a serious hurdle for users, I'd prefer to use getopt.

I don't like the idea of having getopt point to gnu-getopt instead of the default bsd getopt since it could break scripts that expect posix/bsd getopt, but it may not actually cause any problems. The user could still easily create a workaround for that case if it became an issue. Some other things require changes to work outside linux (e.g. bsd sed has no -r), but that could be handled in a similar way (e.g. wrapper script that uses gsed as sed and gnu's getopt as getopt). I think it's fine to leave as-is. A wrapper script to support bsd/osx could be added later if necessary.

Yes, a wrapper sounds like a good idea to me.

I noticed that the current dependency check does not handle something like ((dependency "version") dependency-no-version) (which I've seen some packages use). Maybe use something like this:

As far as I know, that syntax is not accepted by package.el; every dependency must be wrapped in parens, even if it doesn't have a declared version. So those packages should be fixed. :)

I wasn't sure about why if then is sometimes used and && is sometimes used, but that doesn't really matter either.

For simple uses, && is simpler, which makes the code more concise. In Lisp terms, it's sort of like (when x y) vs. (if x y nil).

I only looked over it briefly, but it seems clean, simple, and readable. No massive functions or complex logic. I agree that bash seems like a reasonable choice.

Thanks for your feedback!

alphapapa commented 4 years ago

@akirak I just discovered your new https://github.com/akirak/emacs-package-checker while watching Damien's EmacsConf 2019 presentation! I didn't know that you are also working on a similar tool. I would love to have your feedback on makem.sh!

akirak commented 4 years ago

@alphapapa Thank you for adding my software to your list. I am now obsessed with Nix, and the test runner is an experimental project. For now, I am eating my own dog food. @purcell's nix-emacs-ci is also based on Nix, so ideally my package could provide a better integration with CI.

I am watching discussions on makem.sh. It's been actively developed, has a polished interface, and supports extensive backends. Apparently, it's the most practical solution around. I don't have any suggestions for you at present, since it's just superb and I can't compete with you in the same fields. But I hope to have some influence in the future.

noctuid commented 4 years ago

I made a WIP branch to add elisp-elint support, but most of the checkers it has are already covered here, so I'm not going to add support for it now.

indent-lint could be updated to do elisp-lint's indent-character if it doesn't already. fill-column and trailing-whitespace would be nice to have at some point.

That's an example of Shellcheck not being able to understand the purpose of the code. Those instances are used correctly, and I don't want to litter the code with shellcheck disable comments everywhere they're used.

I think from a reader perspective using {} (which I haven't tried but am guessing works and silences the shellcheck warning) would make that more clear.

Personally I'd rather quote variables, make other minor changes, and add some shellcheck comments when necessary than get a hundred flycheck warnings and not be able to tell which ones I've already looked at. When there are a ton of errors/warnings, they are mostly useless noise for me. This also makes automatic checking on CI possible.

As far as I know, that syntax is not accepted by package.el; every dependency must be wrapped in parens, even if it doesn't have a declared version. So those packages should be fixed. :)

Package-lint rejects it, and maybe it is discouraged in documentation eleswhere, but package-read-from-string / package-buffer-info at least handle this case fine. package-read-from-string reads ((foo "0.1") bar) as ((foo "0.1") (bar "0")). Not a big deal for me personally since I don't use this syntax.

Thanks for your feedback!

I've switched general.el over to using makem.sh in a branch. I also switched to using github actions. It works very well, and the output looks great. Thanks for the script! I'm going to eventually switch all my packages over to using it.

purcell commented 4 years ago

package-read-from-string reads ((foo "0.1") bar) as ((foo "0.1") (bar "0")). Not a big deal for me personally since I don't use this syntax.

The issue here is that the package.el in older 24.x Emacsen can't parse this string, which makes the files uninstallable there, even if they appear to satisfy package-read-from-string in newer versions and are otherwise compatible.

That's the reason for the check in package-lint.