Open nedbat opened 7 years ago
Original comment by Suriya Subramanian (Bitbucket: suriya, GitHub: suriya)
Here's a public git repository demonstrating this bug: https://bitbucket.org/suriya/coverage-xml-bug
You can view the result of a run here: https://bitbucket.org/suriya/coverage-xml-bug/addon/pipelines/home#!/results/1
@suriya Can you provide me with a reproducible case?
Original comment by Suriya Subramanian (Bitbucket: suriya, GitHub: suriya)
I am using coverage.py version 4.3.4. I see that only one file with a particular base name at a particular depth is reported. For example, with three files a/__init__.py
, b/__init__.py
, and a/c/__init__.py
, we get only two files in the output a/__init__.py
and a/c/__init__.py
. This is especially problematic with a Django project and we get a report for only one models.py
file.
Original comment by Alexandre Conrad (Bitbucket: aconrad, GitHub: aconrad)
Also, pycobertura is affected by this change too.
There should be a way to have a relative path to the source files when starting from the root of the package and we lost this information. Currently the <source>
tag only provides absolute paths to the source on the machine it was generated on which isn't very useful if the coverage report gets interpreted on a different machine where the absolute path is unlikely to exist.
For example, if a machine receives a coverage report that was generated for package "foo" at commit "abc123" with a filename __init__.py
(which was originally foo/__init__.py
) then we should be able to git-clone the source of package foo@abc123 and reconcile the paths somehow. But with the change there is no way to do this anymore.
filename="__init__.py"
name="."
We're missing the information about the directory foo
.
Original comment by Alex Gaynor (Bitbucket: alex_gaynor, GitHub: Unknown)
I'm not particularly convinced this change is correct, but looking at: https://bitbucket.org/ned/coveragepy/src/63dfe482a849e9cb0b82214242315a806a6a3d53/coverage/xmlreport.py?at=default&fileviewer=file-view-default#xmlreport.py-143:151
Our challenge is that we have two paths in self.source_paths
: /path/to/cryptography
and /path/to/cryptographt/tests
. Since it's a set, depending on your version of Python, ordering may not be stable, if /path/to/cryptography/tests
comes first, rel_name
ends up as test_x509.py
, if /path/to/cryptography
comes first you get rel_name = "tests/test_x509.py"
.
The fix, which I'm not particularly confident in, is to change line 146 to be: for source_path in sorted(self.source_paths, key=len):
.
This will ensure we always get /path/to/cryptography/
first here. Tested locally and it works, but I have no sense of whether this is conceptually correct :-)
Original comment by Alex Gaynor (Bitbucket: alex_gaynor, GitHub: Unknown)
This is hitting us on pyca/cryptography
. Can be reproduced with:
$ git clone https://github.com/pyca/cryptography
$ cd cryptography
$ tox -e py34 -- tests/test_x509.py
$ coverage xml -i
As you can see in the tox
output's coverage report
, there paths in our test directory have a test/
prefix, but after running coverage xml
, in coverage.xml
, files from the test/
directory do not. An easy file to see this with is test_x509.py
.
Original comment by Dirk Thomas (Bitbucket: dirk-thomas, GitHub: dirk-thomas)
In order to avoid the ambiguity (while still keep the information of the sources basepath and the filename within that basepath separate) could the class
tag get a new attribute source
to clearly identify the base path the referenced file was being found? Obviously that would require downstream tools to be updated too but I don't see a good solution which works in all cases based on the information present in the current xml.
Original comment by Bence Nagy (Bitbucket: underyx, GitHub: underyx)
FWIW the SonarQube Python Plugin's coverage importer is also utterly confused by these changes.
Original comment by David Multer (Bitbucket: dmulter, GitHub: dmulter)
I've confirmed that files in different modules with the same name (i.e. init.py) never did show in my coverage output and I just failed to notice that I had no data one those files. Getting that fixed is obviously important, but less so than making sure my XML processing tools are able to parse the output properly.
Original comment by David Multer (Bitbucket: dmulter, GitHub: dmulter)
Yes, I see that though I never noticed the problem before. I'll have to think a bit to understand why. So fixing that problem would probably be important, but if it still existed and the code went back I would think all was well ;-)
It's the second problem that I felt was the real issue. I'm waiting to see what the coverage2clover person says...
Just so we're clear: my point is about the old behavior in coverage.py 4.3.4: it didn't include all of the files in the XML report if the files had same names in different source trees. So just putting back the old behavior won't be enough. Maybe we agree on this, I'm not sure?
Original comment by David Multer (Bitbucket: dmulter, GitHub: dmulter)
Yes, coverage1.xml shows the single init.py file problem, and the coverage2.xml files shows just the filespec problem tools will have knowing where foo.py or bar.py are. I have the coverage2clover maintainer looking at these too.
Yes, thanks. When I run this code with coverage 4.3.4, two/__init__.py
is in the coverage1.xml file, but one/__init__.py
is not. Even if I add an executable line to each of those files, only one of them appears in the results. There's more to be fixed here.
Also, the __init__.py
files are still empty?
Thanks, but it looks like .coveragerc_noinitfiles is missing.
Original comment by David Multer (Bitbucket: dmulter, GitHub: dmulter)
No, I think that's a side effect of not having any actual code to test in my example. You could easily just add a do-nothing function in each to get file output for both. You'll see the same problem. I think I'll go add it to the repo now...
@dmulter thanks for that. I'm trying to understand the old behavior you want back. When I run your code under coverage==4.3.4, only two/__init__.py
is in the XML output. one
is missing completely. So the old code had some problems with this multiple-source scenario also. Is that true of your real code also?
Original comment by David Multer (Bitbucket: dmulter, GitHub: dmulter)
I also created an issue for this in the coverage2clover repo - https://github.com/tumb1er/coverage2clover/issues/10
I want to make tools work, but I want all of them to work. I'm not going to simply revert the change. I'll make an improvement that will help everyone. It would be a big help to me if someone could provide a reproducible case of a multi-source result. Even better would be to advocate with them to make use of all of the information in the coverage.xml file.
Original comment by Benjamin Morgan (Bitbucket: bj_morgan, GitHub: Unknown)
This change from 4.3.4 → 4.4 has also broken compatibility with the codeclimate test reporter. See https://github.com/codeclimate/python-test-reporter/issues/40
@dmulter @postrational Can you provide me with a reproducible example, even if it's a link to an online repo with instructions of how to run the tests? I want to get this to work again for you. It's clear to me that combining the source path and the filename should be a real file path, so I think the fix was the correct one, but clearly I am not done since your use cases are broken.
Original comment by David Multer (Bitbucket: dmulter, GitHub: dmulter)
No idea on how source paths is used, but the coverage2clover tool seems to just do an open() on the filename= value, so it must go back to the way it was previously for it to work again. Note that my coverage run produces multiple source paths as I use a shared repo for common code.
#!xml
<sources>
<source>/Users/xxx/Documents/xxx/app-common/apis</source>
<source>/Users/xxx/Documents/xxx/app-foo</source>
<source>/Users/xxx/Documents/xxx/app-foo/apis</source>
</sources>
and
#!xml
<class branch-rate="0" complexity="0" filename="apis/__init__.py" line-rate="1" name="__init__.py">
Yes, I see the problem, sorry. What is the correct behavior here? This was to fix https://bitbucket.org/ned/coveragepy/issues/526/generated-xml-invalid-paths-for-cobertura. How are the source paths and the filenames supposed to be combined?
Original comment by Michal Karzynski (Bitbucket: postrational, GitHub: postrational)
If we have 2 files like this:
/path/to/directory_one/__init__.py
/path/to/directory_two/__init__.py
Then they would both get a like like this:
<class branch-rate="0" complexity="0" filename="__init__.py" line-rate="1" name="__init__.py">
How can I tell which file this refers to?
This was a bug that was fixed. The filename attribute should be combinable with a source entry to find the file. Can you tell me more about why you need the directory name in both places?
@nedbat I ran across this issue and believe I've found a useful workaround.
Changing from specifying the modules in [sources]
to directory paths in [include]
ensures that the path to each file is output in coverage.xml
:
using [source]:
<class branch-rate="0" complexity="0" filename="hello.py" line-rate="1" name="hello.py">
<class branch-rate="0" complexity="0" filename="baz/hello.py" line-rate="1" name="hello.py">
using [include]:
<class branch-rate="0" complexity="0" filename="bar/hello.py" line-rate="1" name="hello.py">
<class branch-rate="0" complexity="0" filename="bar/baz/hello.py" line-rate="1" name="hello.py">
<class branch-rate="0" complexity="0" filename="foo/hello.py" line-rate="1" name="hello.py">
I'm not sure how this might need to be worded in the project's documentation, but will try and take a stab at it if you think that'd be useful.
@jacobwegner Thank you for the workaround.
Just for reference, this was changed in 80cf759 (to fix https://github.com/nedbat/coveragepy/issues/526).
What does the spec / XML format say in this regard? I could not find this quickly, but came across an example, where it contains the command being used even?! https://raw.githubusercontent.com/jenkinsci/cobertura-plugin/master/src/test/resources/hudson/plugins/cobertura/coverage-with-data.xml
After all it is impossible to find the correct file if there are multiple sources, and it causes problems all around. From #526 it seems like there was an issue with Jenkins Cobertura plugin, maybe only in combination with nose initially?
(I know that the workaround is to use a single source only, but it is rather unfortunate running into this still)
I spent the weekend going crazy with pytest-cov and Codecov.
I can confirm that Codecov walks the filename=
tags and without a path in them there is not way for it to match the coverage to a file.
FYI, the generated coverage.xml
file says it uses https://raw.githubusercontent.com/cobertura/web/master/htdocs/xml/coverage-04.dtd, if that helps any, @blueyed (It's very bare bones)
I'd like to bump this thread again since I'm still seeing the issue with coverage v6.
I posted about it in this other project https://github.com/aconrad/pycobertura/issues/133#issuecomment-1036466783 but I will copy-paste here for convenience:
What I am noticing is that in coverage v4 the
class
element has a "filename" attribute as such:<class name="cli.py" filename="pycobertura/cli.py" ... >
In coverage v6, we now have the directory that is missing from the filename:
<class name="cli.py" filename="cli.py" ... >
And causes Pycobertura to fail with a
FileNotFound
:File "pycobertura/filesystem.py", line 73, in open raise self.FileNotFound(filename) FileNotFound: aconrad-pycobertura-193cfa7/__init__.py
The file should be
aconrad-pycobertura-193cfa7/pycobertura/__init__.py
(don't worry about the prefix directory, that's when we use pycobertura with a zip file and the --prefix flag).I'm surprised this has never been fixed. I wonder how code coverage parsers do it. Do they compose the filename by taking the last segment of the path inside the
<sources>
element? In the report generated by coverage v6 we have source<source>/Users/alexandre/Development/pycobertura-gro1m/pycobertura</source>
– but given that there could be multiple sources, it could be ambiguous.Thoughts?
Originally reported by Anonymous
Hi there, we are relying on the XML coverage report generated in
coverage.xml
for our analysis.In previous versions of coverage, when we run the command such as:
$ py.test --cov directory_one --cov directory_two --cov-report xml tests
We would get a
coverage.xml
file withsources
listed as:And each file would be reported in a line such as this:
Now in coverage 4.4 the first level of the directory path is missing and we get only:
Is there any way to get the previous behaviour?