AdamSpannbauer / lexRankr

Extractive Text Summarization with lexRankr (an R package implementing the LexRank algorithm)

Test failure on i386 #22

Open tillea opened 1 year ago

tillea commented 1 year ago

Hi, for a couple of days now (possibly due to a new upload of igraph version 1.3.5), the CI test of the Debian package has been failing on the i386 architecture with:

== Failed tests ================================================================
-- Failure ('test-lexRank.R:39'): object out value -----------------------------
`testResult` not equal to `expectedResult`.
Component "docId": Mean relative difference: 1
Component "sentenceId": 2 string mismatches
Component "sentence": 2 string mismatches

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 142 ]

You can see this in the full test log.

I have added some debug code in this patch to visualise the issue:

> test_check("lexRankr")
[1] "DEBUG: expectedResult: c(2, 1, 3)"                                                                                    
[2] "DEBUG: expectedResult: c(\"2_1\", \"1_1\", \"3_1\")"                                                                  
[3] "DEBUG: expectedResult: c(\"Is everything working as expected in my test?\", \"Testing 1, 2, 3.\", \"Is it working?\")"
[4] "DEBUG: expectedResult: c(0.48649, 0.25676, 0.25676)"                                                                  
[1] "DEBUG: testResult: c(2, 3, 1)"                                                                                    
[2] "DEBUG: testResult: c(\"2_1\", \"3_1\", \"1_1\")"                                                                  
[3] "DEBUG: testResult: c(\"Is everything working as expected in my test?\", \"Is it working?\", \"Testing 1, 2, 3.\")"
[4] "DEBUG: testResult: c(0.48649, 0.25676, 0.25676)"                                                                  
[ FAIL 1 | WARN 0 | SKIP 0 | PASS 142 ]

As you can see, the order of elements 2 and 3 is swapped, so the comparison fails. The i386 architecture seems to be the only one affected.

Kind regards, Andreas.

szhorvat commented 1 year ago

If I understand what lexRank is doing, then I would say that the test here is flawed. The result is ordered by these values: c(0.48649, 0.25676, 0.25676). Notice that the last two are essentially the same. Thus both 2, 1, 3 and 2, 3, 1 are valid results. Which one is actually returned by the function is down to tiny numerical roundoff errors, which may be platform-specific.
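
To illustrate the point about roundoff, here is a minimal sketch in base R. The two score vectors are hypothetical: they stand in for the same computation on two platforms and differ only by a perturbation of 1e-15, which is an assumption chosen for illustration and not a measured value.

```r
# Hypothetical scores for documents 1, 2 and 3. The two "platforms" agree
# to 5 decimal places and differ only in the trailing bits.
value_platform_a <- c(0.25676, 0.48649, 0.25676 + 1e-15)
value_platform_b <- c(0.25676 + 1e-15, 0.48649, 0.25676)

# Rank documents from highest to lowest score.
rank_a <- order(value_platform_a, decreasing = TRUE)
rank_b <- order(value_platform_b, decreasing = TRUE)

print(rank_a)  # 2 3 1
print(rank_b)  # 2 1 3
```

Both rankings are consistent with the printed values c(0.48649, 0.25676, 0.25676), which is why a test that hard-codes one of them can fail on a single architecture.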

The test should either be changed so that there is only one valid result, or if that's not possible, both of these two rankings should be accepted.
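
One possible way to accept both rankings is to put the expected and actual results into a canonical order before comparing them. The sketch below is only an assumption about how the test could look: `canonical_order` is a hypothetical helper that is not part of lexRankr, the score column is assumed to be named `value`, and the data frames are rebuilt by hand from the debug output above.

```r
# Hypothetical helper: sort by score rounded to `digits` places (descending),
# then break ties by sentenceId so that tied rows always come out in the
# same order regardless of platform-specific roundoff.
canonical_order <- function(df, digits = 5) {
  out <- df[order(-round(df$value, digits), df$sentenceId), , drop = FALSE]
  rownames(out) <- NULL  # drop original row names so they do not affect comparison
  out
}

# Values copied from the debug output in this issue.
expected_result <- data.frame(
  docId = c(2L, 1L, 3L),
  sentenceId = c("2_1", "1_1", "3_1"),
  sentence = c("Is everything working as expected in my test?",
               "Testing 1, 2, 3.",
               "Is it working?"),
  value = c(0.48649, 0.25676, 0.25676),
  stringsAsFactors = FALSE
)

test_result <- data.frame(
  docId = c(2L, 3L, 1L),
  sentenceId = c("2_1", "3_1", "1_1"),
  sentence = c("Is everything working as expected in my test?",
               "Is it working?",
               "Testing 1, 2, 3."),
  value = c(0.48649, 0.25676, 0.25676),
  stringsAsFactors = FALSE
)

# After canonical ordering, the two results compare equal.
same <- isTRUE(all.equal(canonical_order(expected_result),
                         canonical_order(test_result)))
print(same)
```

In a testthat test, the same idea would amount to calling `expect_equal()` on the two canonically ordered data frames. The alternative is to change the test input so that no two sentences receive the same score.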