checkstyle / contribution

some useful sources that should not stay in main repo but it is good to host them
GNU Lesser General Public License v2.1
49 stars 130 forks source link

checkstyle-tester: Yaml format for project list #866

Open romani opened 3 months ago

romani commented 3 months ago

We have now very weird format of project list. https://github.com/checkstyle/contribution/blob/master/checkstyle-tester/projects-to-test-on.properties It was quick and old decision. It was format for bash language, we deprecated this version a long time ago. Let's do yaml format for it. All long lists, should become elements of array in yaml.

Conceptually, we can use any other type that will allow us to make single line in existing format to be multiple lines that easily to manage.

Look at jdk lines to feel a pain of existing format.

existing: #checkstyle|git|https://github.com/checkstyle/checkstyle.git|master|**/.ci-temp/**/*,**/resources-noncompilable/**/asttreestringprinter/**/*,**/resources-noncompilable/**/filefilters/**/*,**/resources-noncompilable/**/main/**/*,**/resources-noncompilable/**/suppressionsstringprinter/**/*,**/resources-noncompilable/**/gui/**/*,**/resources-noncompilable/**/javadocpropertiesgenerator/**/*,src/test/resources-noncompilable/com/puppycrawl/tools/checkstyle/javaparser/InputJavaParser.java,**/InputAllEscapedUnicodeCharacters.java,**/resources-noncompilable/**/javaparser/InputJavaParser.java,**/resources-noncompilable/**/checks/imports/unusedimports/InputUnusedImportsSingleWordPackage.java,**/resources-noncompilable/**/grammar/java19/*,**/resources-noncompilable/**/treewalker/**/*

proposed:

projects:
  - sevntu-checkstyle:
    - scm: git
    - url: https://github.com/sevntu-checkstyle/sevntu.checkstyle
    - reference: master
  - spotbugs:
    - scm: git
    - url: https://github.com/spotbugs/spotbugs
    - reference: 3.1.2
  # Those projects are quite old and have lot of legacy code
  - apache-jsecurity:
    - scm: git
    - url: https://github.com/apache/jsecurity
    - reference: c2ac5b90a467aedb04b52ae50a99e83207d847b3
  - apache-struts
    - scm: git
    - url: https://github.com/apache/struts.git
    - branch: master
    - excludes:
      - **/apache-struts/**/resources/**/*
  - checkstyle:
    - scm: git
    - url: https://github.com/checkstyle/checkstyle.git
    - reference: master
    - excludes:
      - **/.ci-temp/**/*
      - **/resources-noncompilable/**/asttreestringprinter/**/*
      - **/resources-noncompilable/**/filefilters/**/*
      - **/resources-noncompilable/**/main/**/*
      - **/resources-noncompilable/**/suppressionsstringprinter/**/*
      - **/resources-noncompilable/**/gui/**/*
      - **/resources-noncompilable/**/javadocpropertiesgenerator/**/*
      - src/test/resources-noncompilable/com/puppycrawl/tools/checkstyle/javaparser/InputJavaParser.java
      # 'InputAllEscapedUnicodeCharacters' must be skipped because it is too big and slows down JXR
      - **/InputAllEscapedUnicodeCharacters.java
      - **/resources-noncompilable/**/javaparser/InputJavaParser.java
      - **/resources-noncompilable/**/checks/imports/unusedimports/InputUnusedImportsSingleWordPackage.java
      - **/resources-noncompilable/**/grammar/java19/*
      - **/resources-noncompilable/**/treewalker/**/*

main reason of updaate is exclusion list, that is long, especially i jdk.

usage now:

  sed -i'' 's/^guava/#guava/' projects-to-test-on.properties
  sed -i'' 's/#apache-struts/apache-struts/' projects-to-test-on.properties
  groovy ./diff.groovy --listOfProjects projects-to-test-on.properties \
      --patchConfig checks-nonjavadoc-error.xml  -p "$BRANCH" -r ../../..  \
      --useShallowClone \
      --allowExcludes --mode single -xm "-Dcheckstyle.failsOnError=false"

will be:

  groovy ./diff.groovy --listOfProjects projects-to-test-on.yml  -projects=apache-struts,guava \
      --patchConfig checks-nonjavadoc-error.xml  -p "$BRANCH" -r ../../..  \
      --useShallowClone \
      --allowExcludes --mode single -xm "-Dcheckstyle.failsOnError=false"

we do not need Enable/Disable of projects, we just need list projects as parameter something like -projects=apache-struts,guava. No -projects will mean use all.

nrmancuso commented 3 months ago

@romani please share an example of the format of this yaml

relentless-pursuit commented 3 months ago

Also, will the extenion change?

rnveach commented 3 months ago

I detailed the impact of this kind of change at https://github.com/checkstyle/contribution/pull/867#issuecomment-2133719970 and https://github.com/checkstyle/contribution/pull/867#issuecomment-2133724713. This type of change will be slightly mildly involved.

nrmancuso commented 3 months ago

Yeah, the issue description really doesn’t capture the large amount of work that will need to be done to support this.

romani commented 3 months ago

@nrmancuso , @rnveach , I updated description with more details.

nrmancuso commented 3 months ago

@romani please show the full document format in valid yaml. Also, we should use yaml/yml file extension

romani commented 3 months ago

I placed missed ":" now yaml is valid.

nrmancuso commented 3 months ago

IMO, project properties should not be array elements.

rnveach commented 3 months ago

Implementation specifics is still being discussed.

romani commented 3 months ago

project properties should not be array elements

Please share snippet of config that you have in your mind

nrmancuso commented 3 months ago

yaml structure should be:

projects:
  - checkstyle:
    scm: git
    url: https://github.com/checkstyle/checkstyle.git
    branch: master
    enabled: true
    excludes:
      - **/.ci-temp/**/*
      - **/resources-noncompilable/**/asttreestringprinter/**/*
      - **/resources-noncompilable/**/filefilters/**/*
  ...

Override can be provided by:

  groovy ./diff.groovy --listOfProjects projects-to-test-on.yaml  \
      --patchConfig checks-nonjavadoc-error.xml  -p "$BRANCH" -r ../../..  \
      --useShallowClone \
      --allowExcludes --mode single -xm "-Dcheckstyle.failsOnError=false" \
      --projects.checkstyle.enabled=false
romani commented 3 months ago

@nrmancuso , I probably see right now point of confusion. I updated description to show that config will be list of projects, not a single project.

nrmancuso commented 3 months ago

@nrmancuso , I probably see right now point of confusion. I updated description to show that config will be list of projects, not a single project.

I am not confused at all, properties for individual projects should not be an array.

A “project” is an object and should be treated that way. Yaml is essentially json, if that helps you to conceptualize this idea.

romani commented 3 months ago

in this case I do not undersatnd what --projects.checkstyle.enabled=false is going to do.

nrmancuso commented 3 months ago

in this case I do not undersatnd what --projects.checkstyle.enabled=false is going to do.

It would override the value of this property.

romani commented 3 months ago

and we need --projects.sevntu-checkstyle.enable=false ??? Am I reading you mind properly ?

nrmancuso commented 3 months ago

and we need --projects.sevntu-checkstyle.enable=false ??? Am I reading you mind properly ?

Sure. We shouldn’t have to have more than one project file anywhere. Default can be disabled, and we enable by override. This way, we can override other properties too, like the reference.

romani commented 3 months ago

We shouldn’t have to have more than one project file anywhere.

This is not a way I planned to use this file. I see your idea as separate engagements that we can do later on, it will not be conflicting with my proposal. Right now I don't see where your approach will be useful, but in future I might see.

If you prolong your idea, you will see that you don't need file, you can provide all fields as arguments. You don't need a file at all. It is good strategy maybe, but it is very separate feature, let's not mix it to this issue proposal.

This issue is about to make config multi line.

nrmancuso commented 3 months ago

Ok, we can use circular dependency on project list in arguments, but in either case, we shouldn’t have a bunch of these files in the config repo; this sort of repetition is indicative of poor design. If anything, we should have exactly one file with all projects, then a simple yaml file with a list of project names for some checks that require special projects, which can be used to compose your list of projects command.

We should not have to maintain 50+copies of the same url, etc

romani commented 3 months ago

sort of repetition is indicative of poor design.

Dependency cost more than copy paste. In script/application we will have single source, in final result numerous copies to never dependent on anything, just completely independent and fully specified files as we already accustomed to use by gists.

We just providing an option to not craft in gist something, and reuse ready to use "compiled" configs.


This issue is about to make config multi line, let's not put here any future GitHub actions.


You might look at it from perspective of our cI jobs, that does disable and enablement of projects.

I look at at it from perspective "run on all".

So it might be point of misunderstanding.

CI execution just need to stop using config, and take config from command line. Sharing same config for two different reasons is not good. We should split .