apache / incubator-xtable

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
https://xtable.apache.org/
Apache License 2.0
919 stars 147 forks source link

Add license header #359

Closed wuchunfu closed 8 months ago

wuchunfu commented 8 months ago

Add license header

wuchunfu commented 8 months ago

This is great! Will review deeply. QQ: can we also introduce RAT plugin to complain about lack of license ?

@vinothchandar With this in mind, I would like to add a CI to check the license header in the next PR

wuchunfu commented 8 months ago

Overall, it looks good to me 👍 I presume you used some tool to add or fix the header. As Vinoth mentioned, was it the RAT plugin? Previously, we used the spotless plugin to format the files and add headers. I hope its policies do not conflict with the lines that were added or removed by this plugin.

@ashvina I will use Apache Skywalking eyes to check for more comprehensive GitHub actions

wuchunfu commented 8 months ago

Hi @wuchunfu. I checked out your branch and ran spotless on it (mvn spotless:apply). It removed the lines that were added by the plugin you used. This would cause the build to fail because of formatting. Could you please make this PR only fix the header and add missing headers, without reformatting the files? If you agree, could you apply spotless and update this PR?

@ashvina Okay, I'll update the PR now. Thank you for your reminder

wuchunfu commented 8 months ago

@ashvina @vinothchandar PTAL, thanks.

the-other-tim-brown commented 8 months ago

@wuchunfu is there any way to update the existing plugin configuration to enforce this? https://github.com/apache/incubator-xtable/blob/main/pom.xml#L539

wuchunfu commented 8 months ago

@wuchunfu is there any way to update the existing plugin configuration to enforce this? https://github.com/apache/incubator-xtable/blob/main/pom.xml#L539

@the-other-tim-brown https://github.com/apache/incubator-xtable/pull/364 This PR is specifically designed to check the license header

the-other-tim-brown commented 8 months ago

@wuchunfu is there any way to update the existing plugin configuration to enforce this? https://github.com/apache/incubator-xtable/blob/main/pom.xml#L539

@the-other-tim-brown #364 This PR is specifically designed to check the license header

@wuchunfu I saw that but I'm wondering if we can use the existing plugin which performs this check on the java classes. Perhaps it can be expanded to include more file types and we can still get the benefits of the spotless plugin's ability to automatically add this header to files for us.

wuchunfu commented 8 months ago

@wuchunfu is there any way to update the existing plugin configuration to enforce this? https://github.com/apache/incubator-xtable/blob/main/pom.xml#L539

@the-other-tim-brown #364 This PR is specifically designed to check the license header

@wuchunfu I saw that but I'm wondering if we can use the existing plugin which performs this check on the java classes. Perhaps it can be expanded to include more file types and we can still get the benefits of the spotless plugin's ability to automatically add this header to files for us.

@the-other-tim-brown By executing the mvn spotless: apply command, it is possible to automatically add license headers to Java files without adding license headers

wuchunfu commented 8 months ago

@wuchunfu is there any way to update the existing plugin configuration to enforce this? https://github.com/apache/incubator-xtable/blob/main/pom.xml#L539

@the-other-tim-brown #364 This PR is specifically designed to check the license header

@wuchunfu I saw that but I'm wondering if we can use the existing plugin which performs this check on the java classes. Perhaps it can be expanded to include more file types and we can still get the benefits of the spotless plugin's ability to automatically add this header to files for us.

@the-other-tim-brown Perhaps we want to achieve automatic addition of license headers for more files, which can be achieved through https://github.com/apache/skywalking-eyes?tab=readme-ov-file#fix-license-headers or https://github.com/apache/skywalking-eyes?tab=readme-ov-file#docker-image

zabetak commented 8 months ago

I tend to agree with the general feeling that spotless plugin should be able to check and add missing headers for most file types. If there is something that it can't be done (or it is very difficult to achieve) then we can discuss about skywalking-eyes and other options.

wuchunfu commented 8 months ago

I tend to agree with the general feeling that spotless plugin should be able to check and add missing headers for most file types. If there is something that it can't be done (or it is very difficult to achieve) then we can discuss about skywalking-eyes and other options.

@zabetak I agree with your suggestion, thank you for your guidance. I will use spotless-maven-plugin later to implement the automatic addition of license header function

wuchunfu commented 8 months ago

@the-other-tim-brown @zabetak @vinothchandar PTAL, thanks

wuchunfu commented 8 months ago

@the-other-tim-brown @zabetak @vinothchandar PTAL, thanks

wuchunfu commented 8 months ago

@jcamachor PTAL, thanks

jcamachor commented 8 months ago

@xtable-bot run azure

the-other-tim-brown commented 8 months ago

@wuchunfu can you look into the CI failures? Looks like we may need some configs set.

wuchunfu commented 8 months ago

azure

@the-other-tim-brown I compile and package normally in my development environment, and I am not very familiar with Azure CI. However, the execution time of this CI is too long now. Can we consider using GitHub Actions? If possible, I am willing to do this

wuchunfu commented 8 months ago

@the-other-tim-brown @zabetak @vinothchandar @jcamachor PTAL, thanks

wuchunfu commented 8 months ago

Globally the changes fall into two buckets:

  • License header checks & addition
  • Code-style & formatting

The bigger the diff the harder it is to review, iterate and merge. I would suggest to focus exclusively on license header in this PR and then we discuss/apply potential formatting changes in a separate patch.

@zabetak Thank you for your suggestion. Initially, I only added the license header, but others thought it was necessary to use spotless-maven-plugin for automation, so I don't know how to better handle this matter

zabetak commented 8 months ago

@wuchunfu I think my previous comment was a bit unclear. I agree with others that spotless-maven-plugin should be used. My suggestion was to modify the spotless configuration so that it only touches the license header and not the style in general.

wuchunfu commented 8 months ago

@wuchunfu I think my previous comment was a bit unclear. I agree with others that spotless-maven-plugin should be used. My suggestion was to modify the spotless configuration so that it only touches the license header and not the style in general.

Okay, I understand what you mean. I will split this PR

wuchunfu commented 8 months ago

@the-other-tim-brown @zabetak @vinothchandar @jcamachor PTAL, thanks

wuchunfu commented 8 months ago

@xtable-bot run azure

wuchunfu commented 8 months ago

@xtable-bot run azure

ghost commented 8 months ago

CI report:

Bot commands @xtable-bot supports the following commands: - `@xtable-bot run azure` re-run the last Azure build
wuchunfu commented 8 months ago

@the-other-tim-brown @zabetak @vinothchandar @jcamachor PTAL, thanks

wuchunfu commented 8 months ago

Changes LGTM, and CI is passing.

I have just a final question before approving. Is the demo still working after adding the header in the various property/config files? I am not sure if it is covered by CI can you please check.

@zabetak It seems that no CI coverage was found, and the demo is working properly. However, if we add CI to run the demo, it will take a long time. I tested it in the development environment and found that the image size is about 9g, so it is not recommended to run the demo in CI

wuchunfu commented 8 months ago

Hi @wuchunfu, thank you for your contributions and for addressing the review comments. Before we proceed with the merge, I have one final request. In accordance with Apache guidelines, squash and merge is disabled for PRs [1]. Consequently, all commits within a PR must be squashed into the PR branch prior to merging. Could you please squash the commits so we can complete the PR? Thank you once again.

[1] https://lists.apache.org/thread/nxj41bjwmbxd6x1vrjt1p4gl3bgw5mct

@ashvina I think this can be set in GitHub -->Settings -->General, which can better complete the PR merge work. Of course, I can also do this

image
ashvina commented 8 months ago

The squash-merge button was disabled on purpose, see this: https://github.com/apache/incubator-xtable/blob/179d19738b3c6b2b76db54e04dd7122ea41e538c/.asf.yaml#L29

This action was taken to preserve the author's email address in the commit history, which is overwritten if GitHub's merge option is used. Further details can be found in the email link I had shared in my last comment or in the comments in the .asf.yaml file.

zabetak commented 8 months ago

Since we are all good with reviews I will try to squash the commits myself and merge via command line to see if this can be used as an option. I will update here with the results.

zabetak commented 8 months ago

Merged via https://github.com/apache/incubator-xtable/commit/7c7b54b3870791578b4689b29cf8794866eefa52

Apart from the squash, I did a small fixup to properly exclude the target directory since when building locally I encountered some failures for files present under the target directories.

FYI: Although it is not possible to push via the CLI to GitHub repo it turns out that it is possible to bypass the branch protection hook by pushing to the GitBox repo. I don't know if it is good practice to start doing that but I wanted to test if it's possible. I will start a thread to dev@ for letting others know and discuss further.

zabetak commented 8 months ago

Many thanks for the PR @wuchunfu !