ksh93 / ksh

ksh 93u+m: KornShell lives! | Latest release: https://github.com/ksh93/ksh/releases
Eclipse Public License 2.0
184 stars 31 forks source link

The 'r' builtin gets in the way of 'r'. What to do about it? #125

Closed McDutchie closed 4 years ago

McDutchie commented 4 years ago

ksh93 has always had an r default alias and now 93u+m has an r built-in command. They are the equivalent of hist -s, POSIXly also known as fc -s. This repeats either the last command in the history or the one specified with a number.

The problem is, r is the name of a legitimate external utility, part of the "littler" package. It is a scripting and command-line front-end for the GNU R language for statistical computing and graphics.

So we really shouldn't have an r builtin active by default – particularly as it is also active for scripts, where it is effectively useless.

I can see two ways of dealing with this:

  1. Make it a path-bound builtin, like those shown as /opt/ast/bin/* paths in the output of the builtin command. By default, those are only run if /opt/ast/bin is in $PATH, but they can be turned into proper builtins using the builtin command. Doing this is actually trivial – all that is needed is changing "r" to "/opt/ast/bin/r" on this line: https://github.com/ksh93/ksh/blob/77beec1e0dc397420538bffa2262a761cd0351d9/src/cmd/ksh93/data/builtins.c#L98
  2. Remove it. Anyone who wants it can easily set alias r='fc -s' in their .kshrc or .profile script.

Thoughts? Opinions?

posguy99 commented 4 years ago

Removing it is changing classic behavior. And you could have a conflict with almost anything out there in the world, anyway. macOS ships /usr/bin/printf, sounds like another conflict.

According to 17f81ebedb440ed2297097446499a7016d5428e2 'r' and 'history' were only to be available interactively, then 03224ae3af95e90a99b7c6e08af2fa26407c8d1b immediately changed that.

What was the rationale for abandoning the aliases in favor of making everything built-ins, anyway? I read the TODO entry, I didn't see the point to it. Now it sounds like having the aliases is desirable. And I don't see the problem with 'unalias' removing them.

melbit-dannyw commented 4 years ago

I have never used them but I think having them as an alias rather than as a builtin is easier for users to work with, because they can simply add unalias r to their .kshrc, or just type \r. Of course they could also use builtin -d r, but most people are familiar with aliases. As far as I know, they became obsolete once we had fully interactive command line editing with history but some people would probably still use them. I do not use the fc command either as Ctrl-x, Ctrl-e (or just v in vi mode) is easier to type.

I personally wouldn't miss any of them but it's very easy for users to unalias them if they need to.

melbit-dannyw commented 4 years ago

On second thoughts, they are still useful if one chooses not to, or is unable to use interactive editing. Eg. due to a broken terminal.

posguy99 commented 4 years ago

As far as I know, they became obsolete once we had fully interactive command line editing with history but some people would probably still use them. I do not use the fc command either as Ctrl-x, Ctrl-e (or just v in vi mode) is easier to type.

I'm not sure what you mean by "fully interactive command line editing"... ksh has had emacs and vi modes for years. How to retrieve a previous command line varies, and is what's under discussion.

posguy99 commented 4 years ago

Myself, rather than dancing the issue, I'll just say I think it should be put back the way it was. You can remove the aliases if you don't like them, you can override them with established methods. But right now you can't do either.

jghub commented 4 years ago

Myself, rather than dancing the issue, I'll just say I think it should be put back the way it was. You can remove the aliases if you don't like them, you can override them with established methods. But right now you can't do either.

I definitely support this opinion. especially, converting a single-letter alias like r to a builtin is definitely not desirable since it can easily mask some executable of the same name. if a builtin intentionally masks an executable that is basically doing the same thing (only slower) like with printf this is a good thing but quite obviously not so if it is prone to randomly mask some arbitrary executable/command.

McDutchie commented 4 years ago

macOS ships /usr/bin/printf, sounds like another conflict.

It is not, though. Every POSIX compliant OS ships an external printf, it is mandatory. This does not represent a conflict, as the commands are supposed to be compatible. Any standard operation you can do with the external printf you can also do with the ksh builtin; any deviation from that would be a bug (in either the external command or in ksh).

McDutchie commented 4 years ago

Thanks for all your comments so far, everyone.

Just to make one thing clear: The old set of default aliases is not coming back. This is non-negotiable; please don't bother arguing. There are solid technical reasons for using regular built-in commands instead and that opinion doesn't actually come directly from me (though I underwrite it) but is one I picked up from the Austin group (the POSIX technical standards committee). I could spend hours writing a technical treatise detailing all the corner case bugs and other problems with default aliases, but frankly I don't have either the time or the motivation, and besides, it's off-topic for this issue.

This is about whether we want to keep a default r command, and for the purposes of this discussion it really doesn't matter very much whether it's an alias or not; either form would get in the way of an external r that does something completely different (edit: although, granted, an alias has a couple of ways of circumventing it that a builtin command doesn't).

However, if using r as an alias is preferred by the community, then the correct course of action is simply to remove the builtin; everyone who wants r can then trivially add alias r='fc -s' or alias r='hist -s' to their ~/.kshrc and live happily ever after.

"Do nothing and keep the built-in" is of course also an option.

A fourth option is to enable the r built-in for interactive shells only (except if ksh is invoked as sh, which now triggers the POSIX standard mode).

jghub commented 4 years ago

Thanks for all your comments so far, everyone.

Just to make one thing clear: The old default aliases are not coming back. This is non-negotiable; please don't bother arguing. There are solid technical reasons for using regular built-in commands instead and that opinion doesn't actually come directly from me (though I underwrite it) but is one I picked up from the Austin group (the POSIX technical standards committee). I could spend hours writing a technical treatise detailing all the corner case bugs and other problems with default aliases, but frankly I don't have either the time or the motivation, and besides, it's off-topic for this issue.

so you want to remove all default aliases, notably alias command='command '? this simply will cause problems for people relying on established/canonical ksh93 I'd say. sure, one can redefine them but still...

This is about whether we want to keep a default r command, and for the purposes of this discussion it really doesn't matter whether it's an alias or not; either form would get in the way of an external r that does something completely different.

with the important difference that an alias can simply be unaliased. which is a difference that does matter in my view. with a builtin you are "condemned" to rename the external command (silly or not possible in most circumstances) or use full pathnames to call it (totally undesirable).

However, if using r as an alias is preferred by the community, then the correct course of action is simply to remove the builtin; everyone who wants r can then trivially add alias r='fc -s' or alias r='hist -s' to their ~/.kshrc and live happily ever after.

in case the default alias will not be kept (I've got the message that there are "deep" reasons against default aliases but I am not sure whether they apply categorically or only to certain scenarios and whether the r alias would fall in the former or latter category) I opt for removing the new r builtin and notifiy the user of the need to redefine the r alias if he wants it. but I simply would prefer to keep the alias (there! I bothered to argue for it...).

"Do nothing and keep the built-in" is of course also an option.

I would not like this.

A fourth option is to enable the r built-in for interactive shells only (except if ksh is invoked as sh, which now triggers the POSIX standard mode).

I believe r should not be a builtin. way to high probability for a names clash (you already encountered one yourself).

posguy99 commented 4 years ago

On Tue, Sep 1, 2020, at 1:41 PM, jghub wrote:

u want to remove all default aliases, notably alias command='command '? this simply will cause problems for people relying on established/canonical ksh93 I'd say. sure, one can redefine them but still...

That one is already gone.

-- Marc Wilson posguy99@gmail.com

jghub commented 4 years ago

oh really? thanks for the heads up, then. not for me to decide but I find this change unnecessary/problematic from a user perspective.

posguy99 commented 4 years ago

I always wondered about that one, and why dgk thought it was a good idea. Let's have 'command', but let's make it expand aliases? No. I never agreed with that one.

Removing it and saying "oh, users can just put the alias in themselves", just means that there's canonical documentation that is now incorrect, because that's not what the canonical documentation says (Bolsky p32 & O'Reilly appendix B.3). Before you could remove it if the conflict existed and bothered you, now you can't. So it's a problem both ways.

@McDutchie, I don't think what some other application might do or contain is at all the issue. You will always have the potential for a conflict. Who gets to use the 26 letters of the English alphabet for single-letter commands? Why is what R wants to do, of more importance than what ksh wants to do, and thus ksh should get out of the way?

posguy99 commented 4 years ago

Also, from the littler homepage

littler is known to build and run on Linux and OS X. On OS X, you may want to link or copy it to lr as that particular OS thinks R and r are the same. (Hint: They’re not.)

So they're not too concerned about naming themselves (yes, I know they're specifically referring to case-insensitive filesystems).

jghub commented 4 years ago

I always wondered about that one, and why dgk thought it was a good idea. Let's have 'command', but let's make it expand aliases? No. I never agreed with that one.

well, ultimately I do agree with that decision to make the alias expansion work on the "real" command to be executed (second word) rather than on the first word ("command" itself). command is not any builtin command, obviously. like with r simply removing the alias is bad in my view. it will require many people to make sure to include that alias themselves in assorted start up scripts in order not to break their programms.

Removing it and saying "oh, users can just put the alias in themselves", just means that there's canonical documentation that is now incorrect, because that's not what the canonical documentation says (Bolsky p32 & O'Reilly appendix B.3). Before

and other places in that book, notably p.156 were it explicitly advices against modifying/removing the preset aliases for good reasons in my view. although it is not a sacrosanct text of course but it is for the developers of this repo to decide what they do here and clearly @McDutchie seems to have (or share) strong opinions regarding preset aliases. I doubt that those are justified from a purely empirical perspective: ks93 has (and has had) many issues and problems but the preset aliases never caused any that I am aware of or could remember having seen pop up on the old mailing list or elsewhere. so that seems a dogmatic academic position ("default aliases are evil, don't bother to argue with me") to me. but it is what it is..

you could remove it if the conflict existed and bothered you, now you can't. So it's a problem both ways.

+1

@McDutchie, I don't think what some other application might do or contain is at all the issue. You will always have the potential for a conflict. Who gets to use the 26 letters of the English alphabet for single-letter commands? Why is what R wants to do, of more importance than what ksh wants to do, and thus ksh should get out of the way?

I think the relevant point has been made by you and reiterated by me: with aliases the naming collision is a non-problem after it has been recognised since the alias can be modified/renamed. with builtins the naming collision is a problem since it is inconvenient to circumvent (either rename the external program or use full path to it all the time and everywhere). so in my view this strongly indicates that builtins should just not proliferate and "pollute" the name space.

McDutchie commented 4 years ago

I always wondered about that one, and why dgk thought it was a good idea. Let's have 'command', but let's make it expand aliases? No. I never agreed with that one.

[…] like with r simply removing the alias is bad in my view. it will require many people to make sure to include that alias themselves in assorted start up scripts in order not to break their programms.

There was one and only one reason for the command='command ' alias to exist: command must be usable as a prefix to any standard command, and in ksh before 93u+m, a few commands specified by the POSIX standard (fc, hash, times, type) are preset aliases.

Except, this attempt at standards compliance is utterly broken. To see just how broken, try this in non-93u+m ksh:

$ command times
/bin/ksh: syntax error: `{' unexpected

The cause is revealed if we look at how the times alias is defined:

$ alias times
times='{ { time;} 2>&1;}'

Which means command times actually expands to command { { time;} 2>&1;}. Hence the syntax error.

This is also why the whole notion of expanding aliases following command is inherently misguided: aliases may contain arbitrary shell grammar. The purpose of the command command is to run a simple builtin or external command, removing any "special builtin" properties it may have, while bypassing shell functions and aliases. This is fundamentally incompatible with expanding aliases following it. As in, this causes actual real-world breakage that needs fixing.

To fix this, I've now reimplemented all the standard commands as builtins instead, as well as very common ksh commands such as integer and float that no one would ever want to get rid of – and they are so commonly used that they're very unlikely to conflict with external commands as well. This also fixes other things – for example, type=integer; ...; $type foo=bar now works as expected.

So, please note that none of the commands that used to be default aliases have been removed in 93u+m; they are just implemented differently. The reason the KornShell book recommends against removing the default aliases (e.g. by using unalias -a) is that doing this kills a number of essential commands. This is no longer the case on 93u+m. Which fixes another thing: unalias -a now allows users to start over with a clean alias slate, as intended.

And this means that the whole reason for the command='command ' alias to exist is now gone as well.

@jghub, you said/implied that many user scripts would break as a result of removing the command='command ' alias. I would challenge you to find even one pre-existing ksh script that still requires that alias on current ksh 93u+m. (Regression tests and purpose-made hacks do not count, I'm talking actual bona fide user scripts.) I do not believe you will find any.

However, you do make a good argument here:

with aliases the naming collision is a non-problem after it has been recognised since the alias can be modified/renamed. with builtins the naming collision is a problem since it is inconvenient to circumvent (either rename the external program or use full path to it all the time and everywhere). so in my view this strongly indicates that builtins should just not proliferate and "pollute" the name space.

builtin -d does allow you to deactivate builtins, but this argument still holds water. And this is primarily a problem for the r builtin. That's the whole reason I opened this issue in the first place.

So, after giving it time and thought, I've decided to go back to my original idea and reinstate r, as well as history, as default aliases and remove the new builtin versions. These are the two old preset aliases that are only useful on interactive shells, so now they will be preset only on interactive ksh.