grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Proposal: Change our makefile process #826

Open mattdurham opened 4 months ago

mattdurham commented 4 months ago

Goal

Make our build system easier to use and change.

Issues

Alloy developers are not experts in Make

Make requires a lot of knowledge of both bash and Make itself. Whitespace is significant, and the syntax is often specific to Make. For example, setting a variable requires knowing the difference between the assignment operators :=, =, and ?=.

Complexity

The builds and releases are getting more complex and Make syntax is not well suited to heavy use of conditionals and logic flow.

Debuggability

Make is hard to debug: knowing the state of each var and how it changes requires careful reading. Passing along -v can help but also generates a lot of data to comb through. Because we call makefiles recursively, it can be hard to figure out the current state and how to use it. See PROPAGATE_VARS for an example, where some vars need to be added and some don't.
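To make the contrast concrete, here is a minimal Go sketch of passing variables to a child build step explicitly. It is not taken from the Alloy Makefile: the function name propagate and the variable names in main are hypothetical, and only the standard library is used. The point is that the propagated state is an ordinary slice that can be printed or inspected in a debugger.

```go
package main

import (
	"fmt"
	"os"
)

// propagate returns "NAME=value" entries for every name that lookup reports
// as set. Names that are not set are skipped rather than passed along empty.
func propagate(names []string, lookup func(string) (string, bool)) []string {
	var env []string
	for _, name := range names {
		if value, ok := lookup(name); ok {
			env = append(env, name+"="+value)
		}
	}
	return env
}

func main() {
	// Hypothetical variable names, chosen only for illustration.
	names := []string{"GOOS", "GOARCH", "RELEASE_BUILD"}

	// Printing the result shows exactly what a child build step would receive.
	for _, entry := range propagate(names, os.LookupEnv) {
		fmt.Println(entry)
	}
}
```

The returned slice has the same NAME=value form that the Env field of exec.Cmd accepts, so it could be handed to a child process as-is.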

Calling from parent to child to parent makefile operations

The parent makefile calls into a child makefile (packaging) that then calls back into the parent makefile.

Proposals

Proposal to move from Make to MageFile

MageFile is a build tool similar to Make, but its build targets are written in Go.

Alloy developers are primarily Go developers and Magefile is written in Go

This allows us to leverage the Go knowledge and libraries that we use every day.
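As a rough illustration, a build target in this style might look like the sketch below. It uses only the standard library. The environment variable names GO_TAGS and OUTPUT and the path build/alloy are assumptions made up for the example, not the project's actual settings. Build has the shape Mage expects of a target (an exported function returning error); in a real magefile the main function would be omitted, because Mage generates it.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// envOr returns the value of the environment variable key when it is set,
// and def otherwise. This is roughly what `VAR ?= default` does in Make.
func envOr(key, def string) string {
	if value, ok := os.LookupEnv(key); ok {
		return value
	}
	return def
}

// buildArgs assembles the arguments for `go build`. Keeping it a pure
// function means the build logic can be unit tested without running a build.
func buildArgs(output string, tags []string) []string {
	args := []string{"build", "-o", output}
	if len(tags) > 0 {
		args = append(args, "-tags", strings.Join(tags, ","))
	}
	return append(args, ".")
}

// Build compiles the package in the current directory.
// GO_TAGS and OUTPUT are hypothetical variable names.
func Build() error {
	tags := strings.Fields(envOr("GO_TAGS", ""))
	cmd := exec.Command("go", buildArgs(envOr("OUTPUT", "build/alloy"), tags)...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	// Dry run: print the command instead of executing it.
	args := buildArgs(envOr("OUTPUT", "build/alloy"), strings.Fields(envOr("GO_TAGS", "")))
	fmt.Println("go " + strings.Join(args, " "))
}
```

Conditionals and defaults are ordinary Go here, so there is no need to remember which assignment operator applies.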

Reasonable to Debug

By building the command yourself, you can debug what is happening inside the Magefile and step through it like any other Go program.

Intellisense

Because a Magefile is Go code, you get IDE and editor support.

Can be compiled into executables reducing dependencies

A compiled Magefile requires neither a Magefile installation nor a Make installation. Targets can be built with just the compiled Magefile and Docker, which reduces the number of required dependencies.

Compile time checking

More syntax errors can be caught at compile time instead of at runtime.

Already used within Grafana

Grafana uses magefiles for building plugins and the Windows build. k6 uses a magefile to build its binaries.

Can allow us to leverage more tooling

Though doable in a makefile, things like asking the user whether they want to install docker, golangci, or other necessary tools are easier in a Magefile.
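A minimal sketch of such a tool check, using only the standard library, could look like this. The helper names missingTools, onPath, and confirm are hypothetical, and the sketch only prints a placeholder rather than installing anything. It assumes the golangci binary is named golangci-lint.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// missingTools returns the tools for which found reports false.
// The lookup is injected so the check can be tested without touching PATH.
func missingTools(tools []string, found func(name string) bool) []string {
	var missing []string
	for _, tool := range tools {
		if !found(tool) {
			missing = append(missing, tool)
		}
	}
	return missing
}

// onPath reports whether an executable with the given name is on PATH.
func onPath(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

// confirm prints a yes/no question and reads one line of input.
// Anything other than "y" or "yes" (including end of input) counts as no.
func confirm(in *bufio.Reader, question string) bool {
	fmt.Printf("%s [y/N]: ", question)
	line, _ := in.ReadString('\n')
	answer := strings.ToLower(strings.TrimSpace(line))
	return answer == "y" || answer == "yes"
}

func main() {
	stdin := bufio.NewReader(os.Stdin)
	for _, tool := range missingTools([]string{"docker", "golangci-lint"}, onPath) {
		if confirm(stdin, tool+" was not found on PATH. Show install hints?") {
			// Placeholder only: real install steps are out of scope for this sketch.
			fmt.Printf("install steps for %s would go here\n", tool)
		}
	}
}
```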

Disadvantage: More verbose

Make is very terse, and Go may require more lines of code to do the same action.

Keep it as it is

Don't change anything and allow it to increase in scope and size.

Autogenerate the makefiles like we do for drone.yml but remove the recursive calling and be verbose

Use go:generate or jsonnet to generate a makefile for us. This would be one very large but also very explicit makefile. This would reduce the number of vars to almost zero and remove the recursive calling. Instead of using USE_CONTAINER, each target that can be built in a container today would have an explicit target for building in a container.
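One way this generation could work is sketched below with Go's text/template package. The target names, commands, image name, and the -in-container suffix are all made up for illustration and do not reflect the current Makefile. Each logical target is expanded into one plain rule and one explicit container rule, with no variables other than Make's built-in CURDIR.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// target describes one logical build step (hypothetical structure).
type target struct {
	Name    string
	Command string
}

// makefileData is the input to the template.
type makefileData struct {
	Image   string
	Targets []target
}

// makefileText emits two explicit rules per target. Recipe lines start with
// a tab ("\t"), as Make requires.
const makefileText = "# Code generated by a sketch generator. DO NOT EDIT.\n\n" +
	"{{range .Targets}}" +
	".PHONY: {{.Name}} {{.Name}}-in-container\n" +
	"{{.Name}}:\n" +
	"\t{{.Command}}\n" +
	"{{.Name}}-in-container:\n" +
	"\tdocker run --rm -v \"$(CURDIR):/src\" -w /src {{$.Image}} {{.Command}}\n\n" +
	"{{end}}"

var makefileTmpl = template.Must(template.New("makefile").Parse(makefileText))

// render returns the generated makefile text for the given data.
func render(data makefileData) (string, error) {
	var out strings.Builder
	if err := makefileTmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	text, err := render(makefileData{
		Image: "example.invalid/build-image:v0", // placeholder image name
		Targets: []target{
			{Name: "alloy", Command: "go build -o build/alloy ."},
			{Name: "test", Command: "go test ./..."},
		},
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(text)
}
```

The generator prints to standard output; a go:generate directive could redirect that into the checked-in makefile.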

Alternatives

I looked at what was available that was written in Go or used it natively. There is not much that I could find; both zim and task use a yaml based format. There are many Make like projects, but I stayed away from anything that was Make like. Replacing Make with another similar tool doesn't solve the problem.

Notes

There is a rough Proof of Concept available that implements 70% of the Makefile in MageFile. The existing Makefile can remain until we are sure the MageFile meets all functionality of the current Makefile.

rfratto commented 4 months ago

I'm interested in seeing alternative solutions to the problems presented here with the current build system. I don't think this is a binary choice of Make vs Mage, and other options could be:

We barely touch the build system except for bumping the version of the build image we use, so I'm worried about whether this is the most efficient use of our limited development time. Rewriting anything is going to lead to bugs and have a spike of maintenance burden, so we should be pretty confident that's where we need to be spending our time.

mattdurham commented 4 months ago

Sure, I can add some additional context and options. The rewrite here should be fairly easy to test: the output should ideally be the exact same files/images if we pin the build tags. The non-building things like tests should be easy to check too.
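A check along those lines could be as small as the following sketch, which compares the SHA-256 digests of two files. The function names and the usage string are hypothetical. It also assumes both builds are otherwise reproducible; byte-identical Go binaries likely need more than pinned build tags (for example the same toolchain version and -trimpath), and this only covers single files, not images.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// digest returns the hex-encoded SHA-256 digest of data.
func digest(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// sameArtifact reports whether the two files have identical contents,
// printing both digests so a mismatch is easy to see.
func sameArtifact(pathA, pathB string) (bool, error) {
	a, err := os.ReadFile(pathA)
	if err != nil {
		return false, err
	}
	b, err := os.ReadFile(pathB)
	if err != nil {
		return false, err
	}
	da, db := digest(a), digest(b)
	fmt.Printf("%s  %s\n%s  %s\n", da, pathA, db, pathB)
	return da == db, nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: compare <artifact-from-make> <artifact-from-mage>")
		return
	}
	same, err := sameArtifact(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if !same {
		fmt.Println("artifacts differ")
		os.Exit(1)
	}
	fmt.Println("artifacts are identical")
}
```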

hairyhenderson commented 4 months ago

My 2 cents as a semi-outsider... I too considered moving to Mage some time ago both for an OSS project and for an internal work project.

I ultimately decided to stick with Make, and this is some of what I learnt:

I get the attractiveness of Mage, and I get the trepidation that comes from Make. Make is a victim of its own simplicity sometimes - it can seem very much like it's just a kind of shell script, but once I unlearned that and went back to the docs, I found I was much more productive with Make.

jkroepke commented 4 months ago

  • there's a good reason Make has endured for almost 50 years, and it's used almost everywhere in some form ...
  • many potential contributors are comfortable with the idea of just typing make to get up and running

I disagree here. Younger developers have serious trouble with shell scripting, and I also have serious trouble understanding the Makefile here.

Looking at other languages, the JavaScript ecosystem primarily uses npm scripts. Java projects mostly use Maven/Gradle targets, which are written in a language developers understand. Even Go itself is not using a Makefile anymore.

Additionally, sometimes the Makefile is not sufficient and some logic has to move to a shell script, e.g.: https://github.com/grafana/alloy/blob/main/tools/image-tag

This is the point where shell scripting hell begins. During development you are mostly restricted to Bash 3.2 features, because that is the oldest Bash version available on macOS. grep and sed are limited as well and behave differently between macOS and GNU. The correct way would be POSIX compliance, but that makes things even worse.

And you can't write any tests for the build system.

After 50 years, I recommend switching to a modern build system that also works on any OS by design.

mattdurham commented 4 months ago

Updated the proposal with a more neutral approach and some other options.

rfratto commented 4 months ago

There's a few high-level questions I think are prerequisites to properly evaluating our options here:

  1. What functionality does our build system cover today?
  2. What subset of that functionality is causing us toil?
  3. How frequent is that toil?
  4. How much of that subset needs to be done inside the build system?

These questions presume exploring more middle-ground options which are currently missing from the proposal, such as moving the functionality that causes toil out of the Makefile.

csh0101 commented 4 months ago

There's a few high-level questions I think are prerequisites to properly evaluating our options here:

  1. What functionality does our build system cover today?
  2. What subset of that functionality is causing us toil?
  3. How frequent is that toil?
  4. How much of that subset needs to be done inside the build system?

These questions presume exploring more middle-ground options which are currently missing from the proposal, such as moving the functionality that causes toil out of the Makefile.

good idea!