Unstable order of HTML attributes inside the LCOV report

jdehaan commented 1 year ago

Inside the generated html reports, the order of the html attributes is not guaranteed to be always the same:

             <td width="5%"></td>
-            <td class="headerCovTableHead" width="5%">Coverage</td>
-            <td title="Covered + Uncovered code" width="5%" class="headerCovTableHead">Total</td>
-            <td width="5%" class="headerCovTableHead" title="Exercised code only">Hit</td>
+            <td width="5%" class="headerCovTableHead">Coverage</td>
+            <td width="5%" title="Covered + Uncovered code" class="headerCovTableHead">Total</td>
+            <td width="5%" title="Exercised code only" class="headerCovTableHead">Hit</td>
           </tr>

I guess this has to do with the internals of storage of hashes somewhere deep in the framework. This is of course not a very big issue, but in our use case we maintain snapshots of the reports in source code repositories and this generates lots of delta for nothing...

I could not find a specific place in the code to fix that. I believe sorting the attributes has to be added explicitly somewhere to solve this.

henry2cox commented 1 year ago

Near line genhtml:7483, in function 'write_header_line'. We walk an attribute hash via 'each' - which is fast, but takes advantage of internal hash layout so is not guaranteed to be any particular order.

It isn't hard to change - but I don't think I understand why it would be necessary. I guess you are screen-scraping the HTML pages?

I think that a far easier way to get what you want is via the genhtml --criteria-script ... callback. genhtml will call you with the summary data - and you can do with it, whatever you like. (Some internal groups use this to collate management reports which (for example) slice and dice data from different directories into summaries for the project lead of the corresponding subsystem. Their callback knows how to determine responsibility, what the component of each specific module are, etc - but genhtml does not need to know.)

Similarly, if you are using this to decide whether you have met your coverage target, then the return code of your criterial callback tells genhtml whether it has passed or failed - so genhtml can report to your jenkins - say, to fail the job if your criteria is not met. Think 'sleeping policeman.

Not sure if this helped. Henry

jdehaan commented 1 year ago

Thanks for the hint. We are generating pages to be read by users with full information (line and branch coverage) for review purposes. I believe this is maybe a little bit "weird" but that's how we do it. We also have the criteria and limits checked but to avoid people to loose time on generating and reproducing the reports we prepare these and put them under version control. That is the background, not sure if this is a recommended way :-P

I can understand that the change won't make it basically in the mainstream version. Maybe there is a chance with an additional flag to activate the behavior like --ordered-html-attributes? I suppose it'll depend on how many other people would enjoy having this.

That said I am fine with locally patching my version of lcov, thanks for the pointer to the code location!

henry2cox commented 1 year ago

I suggest to look at the --annotate-script callback and the --show-navigation feature (if you are using vanilla coverage reports). Combined, those will give your users a very fast way to get to their code. Similarly, the date bin feature (also enabled when you enable annotations) will let you find recently written or modified code, quite quickly.

You might also want to look into differential categorization - which gives some additional utility - especially around automation.

But yeah...the above may not be supported by your local tool.

(In our environment: jenkins produces a report which anyone can go look at (well..."authorized users" can) - but it also watches the coverage task just like any other test - so the job goes yellow or red if our coverage criteria fails. One can write a quite sophisticated coverage criteria, if one wants. Jenkins can also send automated hate mail to the most likely culprit(s)...which also reduces the load on the project lead. We try to avoid manual effort to the maximum extent possible.)

jdehaan commented 1 year ago

I had a look in the code:

        my %tags = ("width"   => $width,
                    "class"   => $class,
                    "colspan" => $colspan,
                    "title"   => $title);
        my $str = "            <td";
        while (my ($t, $v) = each(%tags)) {
            $str .= " $t=\"$v\""
                if defined($v);
        }

As the hash is always build with all keys and iterated over wouldn't an array do the job as well in a maybe even faster way? The variable looks like used locally only (but I am not a perl expert)

  my @tags = (
      ["width",   $width],
      ["class",   $class],
      ["colspan", $colspan],
      ["title",   $title]
  );

  my $str = "            <td";
  foreach my $tag (@tags) {
      my ($t, $v) = @$tag;
      $str .= " $t=\"$v\"" if defined($v);
  }

Beside my maybe somewhat exotic use case, I might argue it is better to yield reproducible results if that does not cost any additional effort / resource.

henry2cox commented 1 year ago

yes - local, used only for this purpose, and also tied to the parameter order of write_html_line. This code has changed a bit over time - so I wrote it this way for future proofing. You will notice that just about everything is a hash in this implementation :-) Easy enough to change - but won't make it into the lcov 2.0 release

jdehaan commented 1 year ago

Then I can propose to make the change for ourselves and test it, I can then propose a PR if it proves to work for us. Hashes are indeed nicer for accessing and merging in general you're right. This might be an exceptional case.

henry2cox commented 1 year ago

Unrelated...but I'm curious about your application and development environment and methodology. There is some discussion at #195, if you care to share.

henry2cox commented 1 year ago

Fixed this while waiting for my regression tests to finish. Arguable whether this version is harder to read/understand than the original - but should be faster (though again, likely not measureable). Does enable regressions to merely diff HTML entries - which might be useful at some point.

Also made it in time for the 2.0 release :-)

henry2cox commented 1 year ago

Closing as this seems to be addressed. (Normally, I prefer the person who opened the issue to close it...only that person really knows what was intended...) Cleaning up issue list for 2.0 release.

Henry

jdehaan commented 1 year ago

Perfectly fine. I just didn't want to mess around here. Thanks a bunch.

linux-test-project / lcov

Unstable order of HTML attributes inside the LCOV report #216