Open fernandobrito opened 3 years ago
Ahaha, as a side note, I just spent like 30 minutes trying to figure out how to run (on my fork):
diff-quality --violations sqlfluff --diff-options "--relative"
Python argparse
doesn't like this syntax. It will try to parse my last argument as a command and return:
usage: diff-quality [-h] --violations TOOL [--html-report FILENAME]
[--external-css-file FILENAME] [--compare-branch BRANCH]
[--options [OPTIONS]] [--diff-options [DIFF_OPTIONS]]
[--fail-under SCORE] [--ignore-staged] [--ignore-unstaged]
[--exclude EXCLUDE [EXCLUDE ...]]
[--diff-range-notation RANGE_NOTATION] [--version]
[--ignore-whitespace]
[input_reports [input_reports ...]]
diff-quality: error: unrecognized arguments: --relative
The solution is to run as:
diff-quality --violations sqlfluff --diff-options="--relative"
As found in a comment on this question: https://stackoverflow.com/questions/45541581/using-argparse-with-argument-values-that-begin-with-a-dash
@fernandobrito if I understand this fully (and ill admit I can be a but slow sometimes :-P) just passing the absolute paths to quality tools will fix it (solution 1)
I don't believe we need an extra arg here because absolute paths and relative paths should be backwards compatible.
can you do me a favor and experiment with this branch and let me know if it helps you out?
Hi, @Bachmann1234, thanks a lot for your quick response and proposed solution!
I'm just testing your branch now and I realize two things:
1) The change you did doesn't get executed in my particular case because the tool that I use (sqlfluff
) registers itself as a plugin inheriting from BaseViolationReporter
, and your changes are in QualityReporter
. Here is the code for their plugin: https://github.com/sqlfluff/sqlfluff/blob/master/src/sqlfluff/diff_quality_plugin.py. I guess that to make it work for all plugins, the path should be modified before calling BaseViolationReporter
implementations.
2) But anyway, I could just take your changes and apply on their plugin locally to test it, and here are the results:
Assume my git root is in: /Users/Fernando/my_repository/
I want/have to run diff-quality
from: /Users/Fernando/my_repository/project_1/
As a side note, if I run git diff
from /Users/Fernando/my_repository/project_1/
, I get the path from my changes relative to the git root:
project_1/file_1.sql
project_1/file_2.sql
project_2/file_3.sql
(I will get this even if I'm calling git diff
from project_1
)So, before your changes, src_path
is for example project_1/file_1.sql
, which is bad, because I'm already inside project_1/
, so the tool will look for project_1/project_1/file_1.sql
.
With your changes, src_path
will be /Users/Fernando/my_repository/project_1/project_1/file_1.sql
, which is also incorrect. From what I understand, os.path.abspath(path)
will just take the current directory path (/Users/Fernando/my_repository/project_1/
) and join with the argument (project_1/file_1.sql
), so that doesn't help. The correct way of getting the absolute path for the file would be to join src_path
with the git root folder, and not to the current directory. Do you have already an easy way to get the git root folder?
As a side note, yesterday I also experimented with my other approach (--diff-options
argument), which worked for me. My changes (without tests yet) can be seen here: https://github.com/Bachmann1234/diff_cover/compare/master...fernandobrito:add-diff-options-argument. And then I can call:
diff-quality --violations=sqlfluff --diff-options="--relative"
If we can make it work without any extra arguments, I don't mind either 😄 . But for my case, if we just convert relative paths to absolute paths, I would also suggest a --include
argument to select which files should be passed to the tool (unless there's a way of using the existing --exclude
for this in a way that I don't understand).
This is because If I'm calling diff-quality
from /Users/Fernando/my_repository/project_1/
, I don't want to pass changed files from /Users/Fernando/my_repository/project_2/
to the linting tool. The nice thing about my solution above is that I get this behavior for free (only listing changes in my current folder) when using git diff --relative
.
I always end up writing too much, so let me know if it doesn't make sense and I can try to be more concise 😅
I think if abspath worked like you said all my tests would have exploded.
I could be wrong here but I don't think abspath joins at all so much as converts the relative path to an absolute one by tracing backwards to the root.
Ultimately I'm not completely against adding the arg (assuming tests and maybe a bit to the readme gets added)
Just if we could get this working without adding more arguments that seems cleaner as the diff parsing can be a bit fragile and I worry that a lot of attempts to do anything non trivial with this arg will have problems.
I agree that it will be much simpler if there's no new argument (and users can also do a --include
like I mentioned or if --exclude
can support negations), but I think we need to get more creative with the paths.
abspath
documentation states:
[...]. On most platforms, this is equivalent to calling the function normpath() as follows:
normpath(join(os.getcwd(), path))
.
https://docs.python.org/3/library/os.path.html#os.path.abspath
Also testing locally:
$ pwd  ✔  02:18 
> /tmp/git_root/project_1
$ python
>>> import os
>>> os.path.abspath('project_1/file_1.sql')
'/tmp/git_root/project_1/project_1/file_1.sql'
(note the double project_1
)
If we could make git diff
always return absolute path, then that could be added as a default internal argument on GitDiff
. Or there are git commands such as git rev-parse --show-toplevel
that will return the path to the git root, and then that could be joined with the existing path that we already get from git diff
.
🤔 gotcha.
Yeah, I think I like using rev-parse
to get the git root. I dont ... think .... you can get the absolute path from git diff
.
but ive been wrong once in this convo already!
I can try a PR if you are busy.
Update: ah wait, I was looking for a good place to put such call to rev-parse
but it's already there in the codebase 😮 . https://github.com/Bachmann1234/diff_cover/blob/master/diff_cover/git_path.py#L66. I guess I just have to use it before we pass the path to the tool.
Hi @Bachmann1234, sorry for the absence, but now I'm back 👋 .
For our use case in my company (running diff-quality
on a subfolder on a mono-repository) we need both (1) to have absolute paths provided to the linting tool but also (2) that only the modified files within that subfolder are passed to the linting tool.
I just saw that one month after my last comment in this issue, @kasium contributed with a --include
feature (thanks! 🙌 ). Later this month I will try to implement the absolute path missing part and give an update here.
I gave it a quick try this weekend and it looks more complicated than I thought 😅.
My goal was to have either the absolute path or the relative path (relative to the current folder, and not relative to the git root) passed to the drivers, but then I realized that I might need to convert the paths back to being git root relative after parsing the command results and before passing it back to the report generator.
I don't think I will really have time to go deeper into this, and since no one else upvoted this issue or commented on it, maybe my needs are too specific and I should just close the issue?
Something I did notice is that when calling diff-quality
from subfolders, the results are a bit misleading in my opinion since files are not processed but reported as 100%.
Example:
violation = f""
in diff_cover/violationsreporters/base.py
diff-quality --violations=flake8
, I get 1 violation, as expectedcd diff_cover
and run the same, I get diff_cover/violationsreporters/base.py (100%)
.The file was not processed/passed to flake8
because this line (https://github.com/Bachmann1234/diff_cover/blob/main/diff_cover/violationsreporters/base.py#L151) is using the git root relative path (diff_cover/violationsreporters/base.py
in this case) and checking if it exists based on the current directory, but since I'm not in the git root, but in a subfolder, this if
fails, but on the final report the file is shown as 100%. Should it maybe log a warning or somehow give a sign to the user that this file was never processed? Or maybe it's already obvious enough that we should always run diff-quality
from the git root?
@fernandobrito maybe the work in https://github.com/Bachmann1234/diff_cover/commit/dfd593605a8990d6f930bac1a65f24b42d7d3eca can also help?
Hello! This is my first attempt to use
diff_cover
, so please let me know if I'm missing something obvious 😄 .To give more context, I want to use
diff_cover
together with the sqlfluff linter. They register themselves as a plugin and it seems to work very well on usual setups: https://docs.sqlfluff.com/en/stable/production.html#adding-diff-quality-to-your-builds. Amazing to see this kind of integration between different open source projects! Great feature being provided bydiff_cover
, thanks!In my case, however, I use a "monorepo" repository architecture and I have my project that
sqlfluff
should run against on a subfolder. Something like:git root:
/Users/Fernando/my_repository/
project where I want to rundiff-quality
:/Users/Fernando/my_repository/project_1/
If I run
diff-quality
from insideproject_1/
, I will get file paths back fromgit diff
relative tomy_repository/
(git repository root), for example:project_1/file_1.sql
, but I'm already inside theproject_1/
folder. Passing such relative paths to the linting tools won't work. For this particular case, withsqlfluff
, I also can't run it from outsideproject_1
due to a limitation on their side, but I thinkdiff_cover
should be generic enough (see some alternatives below) to handle it being called from sub-folders.Alternative 1: absolute paths
My problem would be solved if when
diff-quality
passes file paths to the linting tools, it would pass absolute paths.sqlfluff
would be able to find the target files correctly in this case. Maybe have an extra argument ondiff-quality
for this?Alternative 2: relative path to where
diff-quality
is being calledAnother possible solution would be if
diff-quality
would send file paths relative to where it is being called, and not to the repository root folder. It turns out thatgit diff
accepts a--relative
argument which does exactly that.One very convenient benefit of this approach is that using
--relative
will also limit the scope ofgit diff
to only show modified files in the folder where it is being run. My alternative 1 below would pass modified files to the linting tools even if they don't belong toproject_1
, which is not what I want when I call it from insideproject_1
.Some options of implementing alternative 2:
--relative
a default argument ongit diff
. That would have no visible effects ifdiff-quality
is called from the root, but people with custom workflows calling it from subfolders and doing things themselves might be affected;--git-diff-options
indiff-quality
, so I can calldiff-quality --diff-options "--relative"
;--relative
CLI argument.Does it make sense? Is there another way to solve my issue?