DaveLiddament / sarb

Static Analysis Results Baseliner
MIT License
162 stars 17 forks source link

Should remove auto update the baseline ? #131

Closed d-olim closed 9 months ago

d-olim commented 1 year ago

Thanks for this amazing project.

I’m testing it with phpStan on a large codebase and I don't fully get if using it with remove command should also automatically update the baseline file (that's what I expected, am I wrong?)

So I'd have:

Latest analysis issue count: 99 Baseline issue count: 100 Issue count with baseline removed: 0

But the phpstan.baseline file is not updated automatically.

So I am left with having to recreate the baseline after the fix?

I wonder if a way to detect what was fixed and update the baseline accordingly is needed to streamline dev experience.

One example is when I fix multiple things on the same file and by moving lines I have now lost the exact position of the next error in the same file. Unless I don’t update the baseline by using the create command I can’t see the exact line position of the next error I want to fix belonging to the same file I’m working on.

The workflow I thought was implemented by reading the docs is:

DaveLiddament commented 1 year ago

Thanks for the question.

I think the problem is my terminology. In the context of running SARB, remove means remove the issues in the baseline from the list of issues reported by the static analysis tool (in your base phpstan). The remove option does not actually alter the generated baseline.

I think I need to update the docs to make this clearer.

The intended workflow is:

  1. Create the baseline.
  2. Check in the generated baseline
  3. Edit code
  4. Run static analysis tool and SARB (in the remove mode). E.g. vendor/bin/phpstan analyse --error-format=json | vendor/bin/sarb remove phpstan.baseline
  5. Fix any issues introduced since the baseline (but leave the baseline file untouched)
  6. Repeat steps 4 to 6.

Does that make sense? Let me know if you have any more questions.

The other thing to note is that PHPStan has their own, slightly different baseline tool, see: https://phpstan.org/user-guide/baseline

SARB was written before PHPStan had a baseline function. There are differences and why you might want to use one tool over another are explained here: https://github.com/DaveLiddament/sarb/blob/master/docs/SarbVsOtherBaseliningTechniques.md

Let me know if the above makes sense, or if you have any questions.

d-olim commented 1 year ago

Thanks Dave. Everything you said makes sense indeed and I did read the phpstan docs plus the differences with SARB. The ability of tracking renamed and moving files in fact is exactly what I was looking for.

The slightly unexpected behaviour, but from what you are saying it wasn’t done by design, is that the remove option doesn’t automatically updates the baseline file by removing the fixed issue or updating the information about previously tracked ones. Let me explain better what I mean

1) remove case I think in some occasions as described in my comment above it will be quite helpful if the tool could actually remove fixed issues from the baseline. Although some could argue that it is the same as re-creating the baseline I think there’s a performance benefit (especially on large codebases) to remove the exact item that was fixed from the baseline file while moving remove. Do you see any problem with that? I’m curious why that wasn’t the default choice for this project.

2) update case The other good occasion I could think as the right moment to programmatically update the baseline file while running remove is in the exact case of renaming files or moving lines (or both). At that moment if we haven’t fixed the problem yet we still want our baseline evolve with our code (eg. Issues belonging to Person.php at line 10 and 11 have now to be reported in the baseline file as belonging to Employee.php 12 and 13 - just as an example).

I think that would make this a joy to work with, unless I’m unable to see some fundamental problems which made you already discard all the above 😀

I guess the only options I have right now is update the baseline file either manually or by recreating it when is needed.

If I understand well it seems that this isn’t in the original design so I think you can close this as an issue and maybe this is a feature to be considered for future releases?

Thanks

d-olim commented 10 months ago

Hello me again :) Regarding this remove functionality.

I have 2 issues:

1) changes between runs

I kept using it normally but I noticed that sometimes the output changes from run to run (even without modifying files between runs)

Eg.

First run:

Latest analysis issue count: 22574 Baseline issue count: 21901 Issue count with baseline removed: 1408

Second run:

Latest analysis issue count: 22314 Baseline issue count: 21901 Issue count with baseline removed: 1034

If I keep trying it keeps being different - only initial baseline count is the same

Can you think at anything that causes the above? Maybe some glitch during git analysis? - as you can see is quite large codebase and it takes minutes to run.

2) Issue count with baseline removed

I also still struggle to understand the relation between “ Issue count with baseline removed” and others. Shouldn’t one expect that to be “New issues since baseline” and so at worst the difference between the latest analysis issues minus the baseline issues? How one should interpret a number major of that difference ?

DaveLiddament commented 9 months ago

Looks like something is wrong with SARB.

Please can you copy the commands and paste the commands you are running.

Also can you let me know what version of git is on your system.

Thanks

d-olim commented 9 months ago

Thank you:

Cli: bin/phpstan analyse -l 9 src/ --error-format=json --memory-limit=-1 | bin/sarb remove phpstan.baseline

git version: git version 2.36.6

But by adding --memory-limit=-1 in the command I was able to have consistency between runs

First run:

Latest analysis issue count: 22705
Baseline issue count: 21901
Issue count with baseline removed: 1404

Second run:

Latest analysis issue count: 22705
Baseline issue count: 21901
Issue count with baseline removed: 1404

Third run:

Latest analysis issue count: 22705
Baseline issue count: 21901
Issue count with baseline removed: 1404

So the problem might be that phpstan was exiting before finishing the analysis. Without SARB reporting back the memory error from phpstan. Probably adding in the Readme or making sarb detect the error would help.

That said I still struggle to interpret the report summary.

Shouldn't Issue count with baseline removed be: 22705 - 21901 = 804 instead of 1404 ? Could you help clarifying how to interpret that ?

Thanks

DaveLiddament commented 9 months ago

Good point, I should pass through errors from PHPStan and make SARB exit with an error code if PHPStan finds an error. I've created an issue for this.

As for the numbers...

The number not reported (because it is not calculated) is the number of issues fixed since the baseline was created. Looking at the numbers, it looks like 600 issues have been fixed since the baseline was created.

Does this make sense?

d-olim commented 9 months ago

Ok I see so fixed: 22705-1404 - baseline (21901) = -600 so 600 fixed but 1404 new issues

Probably reporting the Fixed since baseline in the output can help

Thanks for opening https://github.com/DaveLiddament/sarb/issues/135 will close this now