Advice on how best to use Elvish for development and infrastructure automation

rdw20170120 commented 2 years ago

I have a project that I have built over the last couple of decades, and that I have used to support many consulting clients. It is a framework to support and empower the automation of software development and infrastructure. The target audiences are site reliability engineers (SREs), DevOps (and DevSecOps) engineers, and developers in general. My hope is to address a massive opportunity that I perceive regarding the techniques, the tools, and the examples that currently exist.

I had previously built my project using Bash. In so doing, I made the same "default" scripting choice that limits and cripples most projects. My project's scripting shell is a particularly crucial tool choice that will make-or-break my project's potential success. So now I am redesigning and rewriting my project, starting with the choice of a scripting shell that can fulfill my project's ultimate purpose.

I have considered and rejected a myriad of other shell candidates, based on essential principles that I have gleaned. I have spent a few months trying Fish, and found that it is not suitable for several reasons. I spent a few weeks trying Xonsh, and found that its slow performance is unacceptable. I am now trying Elvish, and it seems very promising.

I have asked some questions here and there in the Elvish project's issues, and the community has been kind enough to indulge me and to engage with me. However, it was suggested that I should open an issue that better collects and focuses my questions into a single discussion--one that is less likely to be perceived as polluting or distracting from other issues. I accept that suggestion, I offer my apologies for any harm I may have caused, and I ask for the community's advice on my project.

I have a little over four decades of experience and exposure as a software and system engineer, architect, database administrator, technical consultant, etc.--I have pretty much worn all the hats. Lately, I focus on DevSecOps. I run my own consulting business that provides crucial development, security, and operational support to enterprise-class clients around the world.

I strongly prefer an approach centered on the Unix philosophy. In that realm there are numerous tools offered and MANY examples. The tools are (rightly) focused on a particular portion of the overall solution to automating the software development lifecycle (SDLC) all the way through to deployment and operations. As a necessary consequence, someone must write the "glue" that connects all those tools together. Ideally, that "glue" logic should bind those tools into a seamless pipeline for transforming the ideas in our heads into the technology that runs our world. That pipeline should do so using the best practices that we are able to design, with all of the functional and nonfunctional requirements that matter.

The massive opportunity that I perceive is to provide a very solid example that covers the full scope of choosing good tools, designing the best practices, and implementing it all objectively WELL. In other words, I observe that the existing examples are too specialized towards a particular part of the overall challenge, while I believe that we desperately need a generalized example that constitutes a full solution.

As we all know, most examples of tool and technique use are purposefully abbreviated to show just the point of discussion. In many cases, the author even says that explicitly. That is all good. As an industry, though, it is obvious that individuals and organizations often fail (sometimes catastrophically) at imitating, collecting, and assembling those trivialized examples into a coherent whole. My heart breaks for the people caught in such situations. A few examples go much further, often to the point of building an entire application. Sometimes those examples even go so far as to deploy and operate an example application or system, especially in our modern age of easy virtualization using the cloud and/or containers.

Still, in essentially all cases, the examples are held back. The authors typically explain that their code is purposefully not production-grade or enterprise-class because making it so would be too distracting or overwhelming. I know of a couple of examples that go much further, but they are commercial offerings with significant price tags. Again, that is all good. Nonetheless, I feel that our industry will greatly benefit from an example that goes all-the-way and is readily available to all.

Hence, my project is not about writing another "tool" per se. My target is an open-source project that IS that example. It is a single source-controlled repository (monorepo using trunk-based development) that WELL demonstrates Continuous Deployment of a production-grade, enterprise-class technology solution based on a realistic set of best practices, good tool choices, and solid engineering.

People would "use" my project by forking (or copying) it as the template for their own projects, then customizing it with their own tool choices, their own application source code, and their own glue logic. So I have no worries about my choices colliding with theirs, because the projects will already be theirs. They can and will change anything and everything to suit themselves, and I intend to fully encourage and empower their choices. My challenge is to give them enough of a start to be worthy of them forking my project, and then enabling their own choices enough that they do not feel forced to abandon the effort in favor of something else.

I am NOT advertising. I am NOT asking for contributors. I do NOT (yet) even have code worthy of public disclosure--it is too much in flux (i.e., broken as I figure it all out). I am seeking advice on whether Elvish is the proper foundation for such an endeavor. Can Elvish be the language of that essential "glue" that turns a bunch of individual tools into a seamless Continuous Deployment pipeline? Will the Elvish community be willing to embrace an audience that is attempting to build and support such pipelines across workstations, VMs, containers, and clouds around the world?

If not Elvish, then which shell?

rdw20170120 commented 2 years ago

In the discussion of https://github.com/elves/elvish/issues/974, @krader1961 responded to some comments that I had made. I have moved the discussion here as requested.

I missed that one link about the Elvish package manager. If EPM is released and is useful, why is it not featured in the tutorial? Even a Google search turns up almost nothing about it. I did not even get a hit for that page. That seems quite unfortunate for users.

I mentioned Python in response to @zzamboni's comments. He used Python as an example of module search behavior. I clarified Python's behavior and how it contrasts with Elvish. Of course Elvish can define its own module search behavior. How that module search behavior is defined, and how well it is communicated, will both have significant impacts on the user experience. I am simply advocating for a better user experience.

I must disagree. Only Elvish will translate the command "use ./activate" into a reference to the relative path ./activate.elv. No other tool or language will do that, because ./activate is not the relative path of that file. Dropping the last few characters of the file name is simply not in the definition of a relative path. Nothing I encountered in the documentation and presentation of Elvish told me that I should expect to write "use ./activate" in order to get Elvish to use my module script at ./activate.elv. Likewise, I encountered nothing that told me that Elvish officially uses .elv as a common file suffix. I certainly never encountered anything that told me that Elvish requires the .elv file suffix.

I literally spent days troubleshooting this issue before I got it to work, and I discovered how to make it work by digging through @zzamboni's personal dot files project (which he was kind enough to publish publicly). This is not a good user experience. If I were to make Elvish an essential part of my own project--telling the user that they must start out with "use ./activate"--I would expect to describe this Elvish convention to my users so that they do not likewise suffer from this bad user experience.
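The convention described above can be modeled in a few lines. This is only an illustrative sketch in Python, not Elvish's actual implementation: the function name `module_file_for` is hypothetical, and treating a leading `../` the same way as `./` is an assumption that goes beyond what the thread states.

```python
ELV_SUFFIX = ".elv"  # the file suffix the thread says Elvish expects on module files


def module_file_for(use_spec: str) -> str:
    """Map a relative `use` spec to the file it refers to (illustrative model only)."""
    # Assumption: only specs starting with ./ or ../ are treated as relative.
    if not use_spec.startswith(("./", "../")):
        raise ValueError(f"not a relative module spec: {use_spec!r}")
    # The spec omits the suffix; the file on disk carries it.
    return use_spec + ELV_SUFFIX


print(module_file_for("./activate"))  # ./activate.elv
```

The point of the sketch is that the spec and the file name differ by the suffix, which is why the spec is not itself a valid path to the module file.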

@krader1961 makes a crucial point about Elvish modules. Elvish loads a module upon encountering the first use statement that references it. It is inappropriate for me to refer to this as "caching". While caching is an important design factor, a crucial distinction is that caching MUST be transparent. The system must still function correctly whether an item has been cached or not, whether it expires out of the cache, and whether it never gets loaded into the cache. Those characteristics do not fit an Elvish module. It is essential to Elvish that it load a module and execute its code once-and-only-once, regardless of the number of references. This is similar to the behavior of importing Python modules, and to extensions of various kinds in other platforms as well. I retract my request for caching Elvish modules; it simply does not fit.
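Since the comparison is to Python imports, the once-and-only-once behavior can be observed on the Python side with a small experiment. This is a minimal sketch: the module name `demo_once`, the log file `loads.log`, and the helper `count_module_executions` are all made up for illustration, and the sketch shows Python's behavior only, not Elvish's.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_SOURCE = '''\
from pathlib import Path

# Top-level code: appends one line each time the module body is executed.
with Path(__file__).with_name("loads.log").open("a") as log:
    log.write("loaded\\n")
'''


def count_module_executions(references: int) -> int:
    """Import a throwaway module `references` times; return how often its body ran."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_once.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()  # make sure the new file is visible to the import system
        try:
            for _ in range(references):
                importlib.import_module("demo_once")
            return len(Path(tmp, "loads.log").read_text().splitlines())
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_once", None)  # leave the interpreter clean


print(count_module_executions(3))  # 1
```

Three references produce a single execution of the module body, because later imports are served from `sys.modules`. Unlike a cache, that first execution cannot be skipped or repeated without changing what the program observes.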

We contrasted this with Fish, which is effectively pretending to "cache" its functions when it is actually reloading its functions. That puts the burden on the user to be extremely careful with their function definitions to avoid issues from reloading. Maybe that is an acceptable burden, but I do not think that it is presented clearly enough to Fish's (potential) users. I have been wary when writing my own Fish functions, so I can readily believe @krader1961 when he describes the potential horror of that "caching".

I am a little concerned about the discussion getting argumentative and heated. That is not my desire, nor does it serve the community. Of course, the tech industry is nonetheless famous for it. I am trying--for my part--to limit it. My primary concern is the user experience, mine as well as others'.

As a general rule-of-thumb from widespread experience, each issue actually raised directly by a user is known to be representative of many more silent users that encountered the same issue but will never report it. How many more silent users exist is debated, but the various measures and educated guesses seem to range from about ten to as many as a thousand silent users for EACH user that actually complains directly.

Here are some references based on a Google search of "how many customers have the same complaint", in no particular order:

I acknowledge that much of what I present and say is likely unique to me in many--if not most--ways. However, please consider that I am also representative of numerous other users in some--perhaps many--of the issues I try to address.

xiaq commented 2 years ago

Hi, thanks for your interest in Elvish.

I'll expand the documentation for the use command a bit to hopefully make its behavior clearer. The relative import mechanism is loosely modelled after TypeScript's. It's mainly intended to make it easier for a script or a module to import another module.

> Can Elvish be the language of that essential "glue" that turns a bunch of individual tools into a seamless Continuous Deployment pipeline?

In general, being "glue" is the appeal of shell languages, so I'd say yes. I'm not 100% sure how well this will fit into CD pipelines in particular, though.

> Will the Elvish community be willing to embrace an audience that is attempting to build and support such pipelines across workstations, VMs, containers, and clouds around the world?

Whether Elvish will adopt certain features depends more on someone devoting the time and energy to come up with a concrete design and implementation (and having it reviewed by me) than on community consensus. I think most people wouldn't mind Elvish getting cross-machine functionality as long as it doesn't degrade their current workflows of working on one machine at a time.

Personally I've been playing with the idea of allowing Elvish to transparently invoke commands on a remote machine, kind of like Emacs's "tramp mode". Processes are a bit more complex than files but the difficulty is probably surmountable. It's not my priority though.

xiaq commented 2 years ago

And regarding this point in particular:

> The target audiences are site reliability engineers (SREs), DevOps (and DevSecOps) engineers, and developers in general. My hope is to address a massive opportunity that I perceive regarding the techniques, the tools, and the examples that currently exist.

Ilya Sher and Andy Chu have both written a lot on the potential of shell in the cloud age. One may feel that at this point we should already have a really mature "cloud-native" shell - AWS's EC2 and S3 were launched in 2006 - but we don't really have that.

I think it's just because the intersection of people with advanced, custom cloud workflows and people capable of designing and implementing a shell language is too small. I'm certainly not in the former set. (Andy Chu would probably say it's because other people are doing it wrong by not providing an upgrade path from Bash though ;-)

rdw20170120 commented 2 years ago

Thanks @xiaq for your comments and encouragement. I am very impressed with Elvish, and I thank you for your efforts!

I did not mean to imply distributed shell processing in my discussion. I feel that Immutable Servers are an excellent and essential pattern, so I do not intend to do much of anything with remote shells (perhaps just troubleshooting on rare occasion).

Thank you so much for the references. Those are some excellent blogs and discussions about this whole realm and related patterns. I will be reading for days... :)

Advice on how best to use Elvish for development and infrastructure automation #1401