Closed gro1m closed 1 year ago
Can you update from master? I enabled Github Actions although we don't have the coverage pipeline working for now, will have to do that subsequently. See #139.
Can you update from master? I enabled Github Actions although we don't have the coverage pipeline working for now, will have to do that subsequently. See #139.
Yes, done right now and at least the checks seem to be back again and I think they should fail :)
@aconrad The tests fail, because the Main class is shown twice. Indeed, the Main class occurs twice in cobertura.xml, so is the test actually correct and should we modify the assertions. Could you please check?
Hm, this cobertura.xml
file was taken from the internet somewhere, I didn't generate it – I was looking for a sample cobertura file to use for testing when I started pycobertura. But I did a git blame on it and noticed that this file was tampered with in 1fb4d0b1 (PR https://github.com/aconrad/pycobertura/pull/82), was a bug introduced there?
Hi @aconrad Thanks for checking! I think for now it would be best to remove these lines (maybe you could open a separate PR so that we have this well tracked). And in future it might be good if we could generate these files ourselves, but I think this would be an enhancement that we might not want to do right now... What do you think?
I think for now it would be best to remove these lines (maybe you could open a separate PR so that we have this well tracked).
I'll have to take a closer look at why we have a filename that doesn't look like a path, I'm not familiar with this syntax. Is it a Java thing? We can follow up in the original PR and ask the author.
And in future it might be good if we could generate these files ourselves, but I think this would be an enhancement that we might not want to do right now...
I didn't want to generate my own to ensure that pycobertura wasn't biased parsing a non standard file format (e.g. as you noticed, even the coveragepy-generated files are broken by omitting the directory in the filename attributes). The original coverage schema standard is not very precise in describing what the value of each attribute should be.
What do you think?
I would encourage introducing files generated by other languages. But if we were to generate our own, how would we go about it? Do you mean the ones generated by the pycobertura project? What value would it bring vs the ones we already have? Some of the sample coverage files in the tests directory were generated ourselves.
I think for now it would be best to remove these lines (maybe you could open a separate PR so that we have this well tracked).
I'll have to take a closer look at why we have a filename that doesn't look like a path, I'm not familiar with this syntax. Is it a Java thing? We can follow up in the original PR and ask the author.
In the end, I need your decision on this: either we count double and leave the file as is or we exclude these lines. As it stands now, the tests will have to fail... For me there would be less confusion leaving them out, but in the end it does not matter much...
As a side-note: Also this file contains a method section, so I am a bit surprised about this issue: https://github.com/aconrad/pycobertura/issues/118.
What do you think?
I would encourage introducing files generated by other languages. But if we were to generate our own, how would we go about it? Do you mean the ones generated by the pycobertura project? What value would it bring vs the ones we already have? Some of the sample coverage files in the tests directory were generated ourselves.
Actually we would need to test some popular coverage report generators always on the same code logic but in different languages. Yes, I agree this would be challenging and I currently have no idea on how to go about this, but maybe someone else from the community could support here. Also the code logic would have to be small, but still relevant. The only reason I brought it up is that if the truth lies in the project's code, it is much easier to understand the code and why certain changes have evolved over time.
@aconrad I think I still need your decision here about the cobertura.xml
, i.e. correcting the tests to account for the duplicate Main.java
or removing the corresponding line from the cobertura.xml
...
The method Cobertura.files()
was designed to deduplicate filenames if they appeared multiple times and we should retain that behavior since that seems to be a valid use case (the motivation for #82).
There can be multiple classes in the same filename, and since the <class>
element is meant to represent a class object, it's possible to find a different class within the same file. It makes sense to me.
The method
Cobertura.files()
was designed to deduplicate filenames if they appeared multiple times and we should retain that behavior since that seems to be a valid use case (the motivation for #82).There can be multiple classes in the same filename, and since the
<class>
element is meant to represent a class object, it's possible to find a different class within the same file. It makes sense to me.
Ok, but should we then not include all of them, i.e. does it not work against https://github.com/aconrad/pycobertura/pull/132.
For now, I reverted the logic so that this PR should be good to review, but I am not sure I quite understood the above...
but I am not sure I quite understood the above...
My understanding is that if we have a file foods.py
, this file could have class Apple: ...
and class Banana: ...
. The coverage tool could generate the following report:
<coverage>
<packages>
<package ...>
<classes>
<class name="Apple" filename="foods.py" ...>
<methods>...</methods>
</class>
<class name="Banana" filename="foods.py" ...>
<methods>...</methods>
</class>
</classes>
</package>
</packages>
</coverage>
When we call Cobertura.files()
, the expectation is that ["foods.py"]
would be returned, not ["foods.py", "foods.py"]
. This is why #82 was implemented.
Does it add clarity?
Hm. Let me take a closer look at #132, I haven't looked into it yet.
Hi @aconrad
After thinking more about it now, I understood that the current implementation is indeed correct, because we measure the stats on a file-based level. The only question one might ask is why are we parsing on \\class
and not on something like filename
or file path
(because there could be 2 files with same name in different directories), as we have file-based stats and not class-based stats: is this not possible within lxml
? And also one could discuss if we should offer different views, e.g. directory level, file level, class level, method level, maybe even a tree. But I think the discussion now diverges from what the PR additionally wanted to address and should probably take place in a different issue, if you think it could be worth it to discuss this further.
The only question one might ask is why are we parsing on
\\class
and not on something likefilename
orfile path
(because there could be 2 files with same name in different directories), as we have file-based stats and not class-based stats: is this not possible withinlxml
?
Great question. When I intially implemented pycobertura, I used coverage reports generated by coveragepy as my sample coverage files. And we can tell from the report that 1 class == 1 file. Here's a snippet coverage from tests/dummy.original.xml
:
<class branch-rate="0" complexity="0" filename="dummy/__init__.py" line-rate="0" name="dummy/__init__">
<methods/>
<lines/>
</class>
<class branch-rate="0" complexity="0" filename="dummy/dummy.py" line-rate="0.5" name="dummy/dummy">
<methods/>
<lines>
<line hits="1" number="1"/>
<line hits="0" number="2"/>
<line hits="1" number="4"/>
<line hits="0" number="5"/>
</lines>
</class>
So that's why the implementation is like that.
But over time, people have reported issues because their coverage files (generated by other tools/languages) aren't properly parsed by pycobertura (e.g. #82). Pycobertura is meant to be a Cobertura file parser and be language agnostic.
And also one could discuss if we should offer different views, e.g. directory level, file level, class level, method level, maybe even a tree. But I think the discussion now diverges from what the PR additionally wanted to address and should probably take place in a different issue, if you think it could be worth it to discuss this further.
I agree.
This looks good, thanks! Can we update CHANGES file? Also, should we add a note somewhere in the README about this new option?
This looks good, thanks! Can we update CHANGES file? Also, should we add a note somewhere in the README about this new option?
done, can you please check again :)
Adds a regex option.
Examples:
Did not test diff yet, there are also no tests yet...
Just wanted to get your opinion on it, @aconrad. I tried to keep the logic simple and here it is for including the files you want but you could also write the regex as such as to exclude the files you do not want. It is a bit complicated, but it works:
The single quotes are absolutely necessary, otherwise one gets: