igraph / rigraph

igraph R package
https://r.igraph.org
553 stars 200 forks source link

Test Kamada-Kawai layout generator does not work on i386 #653

Closed tillea closed 1 year ago

tillea commented 1 year ago

Describe the bug As you can see in the CI test of the Debian package for i386 architecture it fails with

== Failed tests ============================================================
-- Failure ('test_layout.kk.R:52'): Kamada-Kawai layout generator works --------
`x` not equal to `expected`.
1/1 mismatches
[1] -1.83 - 0.0631 == -1.89
Backtrace:
    x
 1. \-igraph:::expect_that(sum(l), equals(0.0631144692360025)) at test_layout.kk.R:52:4
 2.   +-base::suppressWarnings(condition(object)) at autopkgtest_tmp/testthat/helper.R:43:2
 3.   | \-base::withCallingHandlers(...)
 4.   \-testthat (local) condition(object)
 5.     \-testthat::expect_equal(x, expected, ..., expected.label = label)

[ FAIL 1 | WARN 0 | SKIP 7 | PASS 3663 ]

To reproduce Just check the full CI log

Please note that this package is build with -ffloat-store in i386 since otherwise some error in lexRankr was occuring.

Version information

Kind regards, Andreas.

szhorvat commented 1 year ago

First, some general comments (not a direct response to Andreas).

The KK layout appears to be inherently unstable, i.e. extremely sensitive to numerical roundoff errors. This is not necessarily a problem in practical usage, as it's just a layout method. The output is not meant for analysis, only for visualization.

Here's the failing test line for 1.3.5:

https://github.com/igraph/rigraph/blob/1.3.5/tests/testthat/test_layout.kk.R#L52

Notice that the test is much too greedy. It will only pass if the layout produces precisely the expected coordinates. But since the output differs between platforms, the test has special cases for all of them. -ffloat-store changes the rounding behaviour, so of course the output changes and the test fails. It doesn't mean that anything's wrong.

IMO the test should be less stringent. It should only verify properties that are consistent between platforms, something like the look_circular() approach that's also used. The current test is much too fragile.


@tillea, the relevant info for you is:

EDIT: I think I found your report, @tillea https://github.com/AdamSpannbauer/lexRankr/issues/22

tillea commented 1 year ago

Notice that the test is much too greedy. It will only pass if the layout produces precisely the expected coordinates. But since the output differs between platforms, the test has special cases for all of them. -ffloat-store changes the rounding behaviour, so of course the output changes and the test fails. It doesn't mean that anything's wrong.

I confirm I understood that there is nothing wrong here in principle and I was tempted to exclude that specific test fir i386 from Debian CI. We have other examples in the Debian BTS where -ffloat-store was used to get some tests passing for instance in xts which could only be solved by even building R base with this option. Let me know if you are interested on more examples which are always affecting i386 architecture only.

IMO the test should be less stringent. It should only verify properties that are consistent between platforms, something like the look_circular() approach that's also used. The current test is much too fragile.

+1

@tillea, the relevant info for you is:

* This failure does not indicate a bug.

ACK

* I believe a new R/igraph release is imminent (up to @ntamas and @krlmlr). IMO, the test would ideally be fixed on our end.

Thanks for the information. So I'll leave the issue untouched for the moment. Please note: Debian will freeze really soon for the next stable release. Please let me know if you plan this next release in this week (which is fine) or later (than I would rather skip the test inside the package to meet the freeze date).

* Hearing that you need to build igraph with `-ffloat-store` is alarming. This should not be necessary. Any _analysis result_ (as opposed to a layout, like here) should be insensitive to roundoff errors. There are two possibilities: (1) Most likely, lexRankr has some incorrect tests (similar to how this KK layout test is bad). (2) It's also possible that there is a bug in igraph that we should be aware of. If you can figure out what exactly causes the failure in lexRankr when building without `-ffloat-store`, it would be quite useful for us. Note that we specifically test the C core of igraph with x87 math to make sure that there are no roundoff issues.

EDIT: I think I found your report, @tillea AdamSpannbauer/lexRankr#22

lexRankr upstream did not respond to that issue. If you might be reprocude it to see what might have cause the issue that would be great. Unfortunately I personally do not understand what's going on. I simply realised that the order of the sorted list has changed - no idea how to hunt this down to a rounding error.

Kind regards and thanks for you quick response, Andreas.

szhorvat commented 1 year ago

@tillea I commented on the lexRankr issue you opened. It seems to me that the test that is used there is flawed. I suggest not compiling R/igraph with -ffloat-store just to resolve the lexRankr issue. The lexRankr test should be fixed instead. According to the GCC docs, this option brings a significant performance penalty, so that's best avoided.

If you remove the -ffloat-store option, that will make this issue here "disappear" as well, but the IMO the KK tests in igraph are still flawed and should be fixed.

szhorvat commented 1 year ago

I want to leave fixing this test to someone who's actually fluent in R (@krlmlr ?), but here are some pointers, which I hope make the job faster:

For the 3rd graph, test that all vertices are at about the same distance from the centre, defined as the average of all their coordinates. Test that this distance is reasonable (check its value up to tolerance).

I'm pretty sure that get_radii() is buggy and layout - colMeans(layout) does not centre coordinates properly.

For the star graph $S_{30}$, the layout is something like this:

image

I suggest testing a smaller star graph, e.g. with 12 vertices, and checking that the graph centre is at the centre, and the rest are on a circle (i.e. same distance from the centre).

ntamas commented 1 year ago

Please let me know if you plan this next release in this week (which is fine) or later (than I would rather skip the test inside the package to meet the freeze date).

I'd say next week or the week after; this is simply because we need to jump through quite a few hoops (reverse dependency checks etc) to get the next release into CRAN, and it's unlikely that we can do that before the end of this week. Ultimately I'd like to ask @krlmlr what he thinks is feasible.

tillea commented 1 year ago

@tillea I commented on the lexRankr issue you opened. It seems to me that the test that is used there is flawed. I suggest not compiling R/igraph with -ffloat-store just to resolve the lexRankr issue. The lexRankr test should be fixed instead. According to the GCC docs, this option brings a significant performance penalty, so that's best avoided.

Given that the -ffloat-store option is used exclusively on i386 architecture which is slow anyway and thus rarely used on time critical applications we gave precision preference over speed. I agree that I would prefer that lexRankr should be fixed but I do not have the slightest idea what to do. Any patch would be really welcome.

If you remove the -ffloat-store option, that will make this issue here "disappear" as well, but the IMO the KK tests in igraph are still flawed and should be fixed.

ACK.

tillea commented 1 year ago

I'd say next week or the week after; this is simply because we need to jump through quite a few hoops (reverse dependency checks etc) to get the next release into CRAN, and it's unlikely that we can do that before the end of this week. Ultimately I'd like to ask @krlmlr what he thinks is feasible.

Thanks for this information. So I will try to prevent any pressure and exclude the test for the moment. If the release of the new version might come right in time I'll include it into stable.

krlmlr commented 1 year ago

Bottom line: the test is fragile, we want a better way to test it, as outlined in https://github.com/igraph/rigraph/issues/653#issuecomment-1408306022.

github-actions[bot] commented 5 months ago

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.