Shouldn't we be checking all files in the repo?

2bndy5 commented 3 years ago

This occurred to me when I was adding the diff-only option. If diff-only is false, then shouldn't we be examining all source files in the repo that use one of the designated file extensions? Currently, we are only examining files that have been changed despite what diff-only option is set to.

However, we can only do this if the repo is checked out before this action is run, but there's the concern of clever people also checking out their repo's submodules (which shouldn't be examined by this action). In case the repo is setup to use submodules, we can use the .gitmodules file to ignore any existing submodules. This wouldn't be hard at all since the .gitmodules file uses an ini file syntax which python's std configparser module can handle nicely.

I suppose we can use the REST API to list all files in the repo, but the REST API's responses are limited to 30 entries per page. If the repo has more than 30 files, then we'd have to subsequently keep making REST requests until we have a list of all the files in the repo. It would definitely be easier if the repo was checked out beforehand, and we use python to "walk" through the repo's files and examine any applicable source files.

shenxianpeng commented 3 years ago

It’s best to help users check or output prompts to tell them that they need to have uses: actions/checkout@v2 to checkout repo before running cpp-linter-action. Otherwise, they may also encounter the same problem as me in #24.

If their code has a submodule or subtree, they should also check out if they need it. cpp-linter-action only focuses on the work of clang-format and clang-tidy.

So if the repo was checked out beforehand, there is no need to download the source code through the REST API, right?

2bndy5 commented 3 years ago

It’s best to help users check or output prompts to tell them that they need to have uses: actions/checkout@v2 to checkout repo before running cpp-linter-action.

I did alter the example yml in the readme with a comment about this. I do like the idea of having something in the log about non-existent files. Currently, all we're logging about this is

INFO:CPP Linter:Downloading file from url: https://github.com/shenxianpeng/test-cpp-linter-action/raw/fd6ec9c55cf93df9e01dad042f80e7835fc2fa5f/demo.cpp

but it should say more to make the user aware that the action is trying to recover from no git checkout.

So if the repo was checked out beforehand, there is no need to download the source code through the REST API, right?

True, but it doesn't hurt too much to leave it in as a recourse.

If their code has a submodule or subtree, they should also check out if they need it. cpp-linter-action only focuses on the work of clang-format and clang-tidy.

Checking out submodules is often done for complex projects' build CI. Furthermore, the compiler database file that clang-tidy looks for is generated at build time (ie using cmake -bbuild .). If we do look in the user's repo for all source files using one of the specified extensions, then this action will inadvertantly crawl any submodule(s) that were also checked out. If this issue becomes a desired feature, then I'd use the .gitmodules file (something generated and preferably maintained by git bash commands) to ignore any source files in the checked out submodules. Did I understand your comment correctly?

2bndy5 commented 3 years ago

example of a .gitmodules file

```ini [submodule "RF24"] path = RF24 url = https://github.com/nRF24/RF24.git [submodule "RF24Network"] path = RF24Network url = https://github.com/nRF24/RF24Network.git [submodule "RF24Mesh"] path = RF24Mesh url = https://github.com/nRF24/RF24Mesh.git [submodule "pybind11"] path = pybind11 url = https://github.com/pybind/pybind11.git ```

The paths are relative to the repo's root (which can be specified using the input option repo-root).

2bndy5 commented 3 years ago

I added a conditional warning in the log about missing files when no git checkout was used.

shenxianpeng commented 3 years ago

If their code has a submodule or subtree, they should also check out if they need it. cpp-linter-action only focuses on the work of clang-format and clang-tidy.

I mean we don't need to consider submodules, users should consider it by themselves. The working process of cpp-linter-action is to have a complete code, then to analyze the specified code changes and post results comments.

I don't know if I misunderstand your comment about submodules.

shenxianpeng commented 3 years ago

I added a conditional warning in the log about missing files when no git checkout was used.

Tested and saw the warning looks good in log

2bndy5 commented 3 years ago

Yeah I think there's a misunderstanding. I'm trying to say the same thing you are: Don't pay attention to submodules. Although I keep going into detail about how to identify if a path in the repo is a submodule (using .gitmodules).

The important thing I'm questioning is:

Should we examine all source files or only the files that changed?

Currently, we are only examining the files that changed.

set the diff-only option to "true" if examining only changed files
set the diff-only option to "false" if examining all applicable source files

I think I'll start another branch on my fork to see what this looks like.

shenxianpeng commented 3 years ago

I think I see. most CI checks are only checking the changed files, such as SonarQube Quality Gates. I would like to set diff-only option to "true" by default. In my case examining only changed files is good, if the codebase needs format I'd like to do a bulk change commits before, then introduce an action for future changed files examination.

2bndy5 commented 3 years ago

The way I have it setup now (in master-py branch): If we set the diff-only option to "true", it will only check the lines (of the changed files) shown in the diff. Do you still want to check the entire file if diff-only is true?

shenxianpeng commented 3 years ago

Oh... diff-only refers to lines, not files. I tested diff-only to "true" in the test-cpp-linter-action repo, compared with diff-only is "false", I don't see any different output, maybe the test code is not enough. so I'm not sure if need to changed diff-only option to "true", I follow your opinion.

2bndy5 commented 3 years ago

The diff-only option was meant as a entrypoint to posting comments directly on the line in the file (#11), but it may not be as useful as I thought. The difference between lines and files does seem confusing. Maybe I should rename that to changed-lines-only (or diff-lines-only) instead. Then, I could add an option called changed-files-only. Can you think of better names? (I seem terrible with naming things).

shenxianpeng commented 3 years ago

In GitHub, the diff files button called "Files changed", maybe we name them files-changed-only and lines-changed-only? Your naming has always been very professional, and my pleasure to accept any changes.

2bndy5 commented 3 years ago

I like your rationality. I'm going with your name ideas. TBC...

2bndy5 commented 3 years ago

Quick Update (not committed yet)

I was easily able to implement the files-changed-only & lines-changed-only options while using a .gitmodules file (if it exists) to ignore any present submodules.

The exclusion of submodules when files-changed-only is set to true led me to think about any other directories (or specific files) that the user might want to ignore. So, I also added a ignore option that takes a semicolon-separated list of directories and/or files that should also be ignored. One caveat is that if a specific file is given to the ignore option, then the file's relative path must be included in the file's name.

      - uses: actions/checkout@v2
      - uses: shenxianpeng/cpp-linter-action@master
        id: linter
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          style: file
          # search entire repo for source files
          file-changed-only: false
          # ignore just the demo.cpp file in the demo folder (demo.hpp is still checked)
          ignore: "demo/demo.cpp"

Using log output commands

I came across a special command in the github action docs that basically interprets this

print("::group::Section Name")
print("some descriptive output")
print("::endgroup::")

and renders it in the action's logs like so

Section Name
some descrptive output

Amazing discovery

After looking into the other available action log commands, I found a way to directly annotate the checked source files with this action's output. This should look like the notification that the code-inspector CI outputs.

Setting an error message
Setting a warning message
Setting a notice message
Setting a debug message This one can be used just for our testing PRs!

With these discoveries, posting comments in the file's diff (or even in files that aren't changed) should be much less problematic than going through the REST API! 🥇

shenxianpeng commented 3 years ago

Great discovery, log output command looks really good 💯

2bndy5 commented 3 years ago

I'm close to submitting a PR, but I have to refine handling the new ignore option. There may be a bug concerning the root folder of the repo. And, I'd like to add a possible syntax that allows excluding an entire folder except for a specific file/folder within the excluded folder.

My current idea:

      - uses: actions/checkout@v2
      - uses: shenxianpeng/cpp-linter-action@master
        id: linter
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          style: file
          # search entire repo for source files
          file-changed-only: false
          # ignore everything in demo folder except the demo.cpp file. Notice the bang (`!`) prefix
          ignore: |
            # yml comments are not recognized here
            demo 
            !demo/demo.cpp

You can find my progress in this latest run

shenxianpeng commented 3 years ago

Your idea and latest run look good to me 👍

2bndy5 commented 3 years ago

I'm trying to reference

"--files-changed-only=false" "--ignore=python_action
*docs   <-- might be malforming the log (there's a '\n' in the `ignore` option's string value)
"       <-- end of ignore parameter's value
processing push event
...

~I found a way to use JSON data within action.yml. I'm hoping this will solve both the specific inclusion of a path/file and prevent \n from creeping into the ignore parameter's string value. More updates to come...~ All input options are treated as strings when they're passed to the python script. I had to change my initial implementation because of this fact.

2bndy5 commented 3 years ago

I tried out using the new log commands to implement in-file comments, but I found some differences compared to using the REST API to post comments.

The message's body can't be multiple lines.
The message looks different in the file's diff compared to the message in the workflow's summary.

However, despite these differences, it is much easier (and even faster) to use log commands.

What you think?

Do you want to keep the use of log commands to make comments?
Do you want to keep the use of REST API to make comments?
Do you want to use both log commands and REST API to make comments?

shenxianpeng commented 3 years ago

Making decisions is difficult. I prefer the last one to have both.

I have seen a few Lint and other Actions with many starts. Most of them support log commands, and few use REST API to add comments.

It seems super-linter only support log commands

https://github.com/github/super-linter/actions/runs/325756967

codecov-action has both

For log commands: the retention period of the log files is between 1 day or 90 days for public repositories,

If there are both, it might be better to have a switch to control log command and REST API comment on or off. From a maintenance point of view, only the command will be easy and fast, so I respect your choice if you only choose the log commands.

2bndy5 commented 3 years ago

oh I forgot about the retention period! We shall have both then. PR coming soon...

2bndy5 commented 3 years ago

The link to super-linter example is a year old, but the annotations still exist 👍🏼

If there are both, it might be better to have a switch to control log command and REST API comment on or off.

This thought had occured to me also. How about a new option called thread-comment? If it set true, comment will appear in the thread (PR or commit). Otherwise no comment in the thread.

I also think we should switch away from using a table in the readme to document the input options. They are getting numerous, and the ignore option needs its own list to fully explain the capabilities.

shenxianpeng commented 3 years ago

Do you mean to call https://github.com/2bndy5/cpp-linter-action/commit/7ce1dd61bf0c5c33a2e18e2c0ee8df75695d9eb1 as thread-comment? not sure if it is a commit comment, my understand below is a thread comment.

If there are other better ways to show input options, please go ahead. Btw, I still like using readme to display the basic usage.

2bndy5 commented 3 years ago

I mean the comments that the action currently uses thread comments. I thought to call them thread comments instead of having "commit comments" and "PR comments" options. You are correct; the picture you posted are thread comments.

The output created from the log commands are called (by github) "annotations". Its a pretty word for saying "notes with various priorities".

The readme can stay as is, but instead of using a table to show all input options, I want to use nested & unordered lists. Trust me the information for the ignore option will be hard to read in the cell of a table.

shenxianpeng commented 3 years ago

Close it as it has been resolved in #26 👍

cpp-linter / cpp-linter-action