apache / pekko

Build highly concurrent, distributed, and resilient message-driven applications using Java/Scala
https://pekko.apache.org/
Apache License 2.0
1.17k stars 139 forks source link

Create sbt plugin for header check? #210

Open mdedetrich opened 1 year ago

mdedetrich commented 1 year ago

So I just realized that we probably have to abstract all of the sbt-header-check logic that is done for checking the Apache copyright headers into its own sbt plugin because in the current setup, while it does correctly check for the sources that are part of the main project, it doesn’t check for the sources that are extending the sbt build (i.e. *.scala files in project/).

Theoretically to solve this you would have to add do addSbtPlugin("de.heikoseeberger" % "sbt-header" % "5.7.0") in project/project/plugins.sbt (take note of the additional nested project folder) so that the plugin applies to the sbt source files however we would also have to redefine all of the logic in CopyrightHeader.scala (hence the need to put it into a sbt plugin).

@pjfanning What do you think?

pjfanning commented 1 year ago

Possibly, a new sbt plugin might help - sbt-header works pretty well but it does seem to have its intricacies. @He-Pin created a CopyrightHeaderForBuild.scala in incubator-pekko and it seems to update some project scala files but not all. Not really sure why it seems to update some but certainly seeing many that don't get updated. In the end, we can hand modify some files if it is easier.

mdedetrich commented 1 year ago

So to clarify, I am not talking about re-implementing sbt-header. Rather the proposed sbt plugin that I speak of would just be an AutoPlugin that extends sbt-header with the logic in CopyrightHeader, so you can think of it more as just abstracting away our current header checking into a library.

it seems to update some project scala files but not all. Not really sure why it seems to update some but certainly seeing many that don't get updated. In the end, we can hand modify some files if it is easier.

This is due to what I mentioned, sbt plugins/settings generally only apply to the parent project, not itself. It might be hard to conceptualize, but basically any plugins in project only apply to the root source, so if you want a sbt plugin to apply to the sources of an sbt project then you need to add the plugin to project/project/plugins.sbt (and you can do this endlessly but in our case we only need to do it once).

Even aside of this limitation, there is merit at last at some point in having this as an sbt plugin because just like the other sbt plugins we created, this is going to be common logic that will have to apply to every single Pekko project.

mdedetrich commented 1 year ago

@pjfanning Due to the recent work that is being done on header checks, I have thought about this again and I think the case for making a sbt plugin to specifically check for pekko headers is a good one. There is a large amount of code being duplicated and its good to have a centralized place for the header check logic so it can be audited better.

If you have no qualms would it be possible to create an incubator-pekko-sbt-header github repo in the apache org just like with https://github.com/apache/incubator-pekko-sbt-paradox?

pjfanning commented 1 year ago

I'd prefer to create sbt plugins outside of Apache. We end up with all the Apache release rules and other things like that.

mdedetrich commented 1 year ago

I was personally contemplating about whether to make an sbt-apache-header vs a incubator-pekko-apache-header. The reason why I was suggesting the latter is because we do have some bespoke pekko stuff, i.e. our own Apache License header is not the standard one e.g.

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * license agreements; and to You under the Apache License, version 2.0:
 *
 *   https://www.apache.org/licenses/LICENSE-2.0
 *
 * This file is part of the Apache Pekko project, derived from Akka.
 */

Now this isn't the worst problem in the world, i.e. I can make this configurable via an sbt setting but it would meant that at least for the standard license header we would have to configure this in every project.

Specifically regarding having to deal with the ceremony of dealing with ASF release, as far as I understand this is a non issue until we decide to make the first release of Pekko (i.e. we would treat it the exact same way as incubator-pekko-apache-header and just publish snapshots).

As a compromise and I guess a technically more well designed solution, I can create sbt-apache-header under my own repo, configure the apache header manually in each Pekko project and when the time is more fitting we can create a incubator-pekko-apache-header in the future which would just extend sbt-apache-header with the custom Pekko stuff. Is this preferable for you?

pjfanning commented 1 year ago

seems fine

mdedetrich commented 1 year ago

We would also have to duplicate the license check, i.e .making an sbt setting in sbt-apache-header for custom license checks (i.e. additionalLicenseChecks) where we would then add a check for the Lightbend license. The lightbend license check would have to be duplicated in every Pekko repo, as well as additional license header checks for the other cases.

pjfanning commented 1 year ago

I'm not sure we need to check for whether the files have Lightbend headers. We should just stick our header on top anyway - regardless of whether it has a Lightbend header or not. Just to be clear, we need to keep the existing headers - all we need to do is check if the file has our header and if not then add it on top without changing any existing text.

mdedetrich commented 1 year ago

I'm not sure we need to check for whether the files have Lightbend headers

We are currently checking for this, and iirc the way the check is codes is to make sure that someone doesn't remove an existing Lightbend header (which is possible since sbt header gives you the previous and current state).

Should confirm this.

mdedetrich commented 1 year ago

We are currently checking for this, and iirc the way the check is codes is to make sure that someone doesn't remove an existing Lightbend header (which is possible since sbt header gives you the previous and current state).

So I just checked pekko core and my understanding of this check was incorrect, i.e. it doesn't check if you remove a Lightbend header from an existing source file. I think this is technically possible and should be looked into, and would also be useful in sbt-apache-header i.e. having an existingHeaderFileCheck key which ensures that for existing source files a specific header isn't removed (in our case Lightbend copyright), but will not add this header to new source files.

He-Pin commented 1 year ago

I think this should be good, and there are many common settings for pekko and satellite projects.

mdedetrich commented 1 year ago

As a compromise and I guess a technically more well designed solution, I can create sbt-apache-header under my own repo, configure the apache header manually in each Pekko project and when the time is more fitting we can create a incubator-pekko-apache-header in the future which would just extend sbt-apache-header with the custom Pekko stuff. Is this preferable for you?

So I just thought about this more comprehensively and I at least came to the conclusion that due to how bespoke the situation with headers is for Pekko (as mentioned before even our standard Apache header is unique!) this is not really worth it because you would end up spending the same amount (if not more) effort in doing all of these bespoke changes in each Pekko repo which kind of defeats the purpose.

For this reason at least in my mind we either implement an incubator-pekko-sbt-header (which would handle everything, i.e. our custom Apache header/Lightbend copyright etc etc) or not at all, however as @pjfanning stated when taking into account what happened historically along with possible future implications this ends up taking a lot of our bandwidth, something which is a precious resource. I could create an incubator-pekko-sbt-header under my own org just to bypass this but I don't know how this would fly (and whether it would create even more problems in the future once we do end up moving it to Apache Pekko project, which is where it should sit).

One counter argument I can think of is that by virtue of having everything centralized in incubator-pekko-sbt-header it might make the incubation process faster because its easier to audit so it could go a long way in cancelling out the expected extra process effort, but that is one big assumption.