dpc10ster / RJafroc

Artificial Intelligence: Evaluating AI, optimizing AI
19 stars 8 forks source link

New test fails on all but mac #70

Closed dpc10ster closed 3 years ago

dpc10ster commented 3 years ago

I added a new test following my usual style: see test-SsSampleSizeFroc.R in the tests directory:

It fails on all but latest mac.

To prevent failures I had to include the skip_on_os() as shown below; any idea? something has changed in package testthat.

contextStr <- "Sample Size FROC"
context(contextStr)
test_that(contextStr, {

  skip_on_os("windows") 
  skip_on_os("linux") 
  skip_on_os("solaris") 

  lesDistr <- c(0.7, 0.2, 0.1)
  frocNhData <- DfExtractDataset(dataset04, trts = c(1,2))

  fn <- paste0(test_path(), "/goodValues361/SsPower/FROC-dataset04", ".rds")
  if (!file.exists(fn)) {
    warning(paste0("File not found - generating new ",fn))
    ret <- SsFrocNhRsmModel(frocNhData, lesDistr = lesDistr)
    saveRDS(ret, file = fn)
  }

  x1 <- readRDS(fn)
  x2 <-SsFrocNhRsmModel(frocNhData, lesDistr = lesDistr)
  expect_equal(x1,x2)

})
pwep commented 3 years ago

test-SsSampleSizeFroc.R (Sample Sample Size FROC)

Mean relative difference as reported in the failed Actions log R-CMD-check #138

Variable Mac Windows Linux (release/devel)
muMed (identical) 0.0002416307 0.002548856
lambdaMed (identical) 0.0002416319 0.002548858
nuMed (identical) 0.0002274784 0.002538516
scaleFactor (identical) 0.0001097283 0.001151393
R2 (identical) (identical) 3.976227e-08
dpc10ster commented 3 years ago

I think I understand what is going on; the tested routine involves maximum likelihood estimation which does a random search in parameter space; so we need to set tolerance greater than the largest discrepancy, or save different goodValues for each operating system, or set tolerance so that Windows passes and skip-on-Linux. The third option is the simplest; what do you think? Dev

On Wed, Apr 14, 2021 at 7:48 PM Peter Phillips @.***> wrote:

test-SsSampleSizeFroc.R (Sample Sample Size FROC)

Mean relative difference as reported in the failed Actions log R-CMD-check

138 https://github.com/dpc10ster/RJafroc/actions/runs/749226277

Variable Mac Windows Linux (release/devel) muMed (identical) 0.0002416307 0.002548856 lambdaMed (identical) 0.0002416319 0.002548858 nuMed (identical) 0.0002274784 0.002538516 scaleFactor (identical) 0.0001097283 0.001151393 R2 (identical) (identical) 3.976227e-08

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dpc10ster/RJafroc/issues/70#issuecomment-819920902, or unsubscribe https://github.com/notifications/unsubscribe-auth/AH4NJRDSQW443ZW7EW4RTLDTIYSWNANCNFSM426DIQ3A .

pwep commented 3 years ago

Third option sounds good to me. I see the test-SsSampleSizeROC.R uses the expect_equivalent function, and sets a tolerance.

dpc10ster commented 3 years ago

So I have been through this before; ha ha; thanks; can you make the change and issue a PR? dev

On Wed, Apr 14, 2021 at 8:08 PM Peter Phillips @.***> wrote:

Third option sounds good to me. I see the test-SsSampleSizeROC.R uses the expect_equivalent function, and sets a tolerance.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dpc10ster/RJafroc/issues/70#issuecomment-819927349, or unsubscribe https://github.com/notifications/unsubscribe-auth/AH4NJREV5AME3XDPWKTHGALTIYU6RANCNFSM426DIQ3A .

dpc10ster commented 3 years ago

You can ignore this... just for my records

Actually not quite the same; test-SsSampleSizeROC.R does not use the random search method; a single comparison is made to a previously known good value

So the third option is best with expect_equivalent() instead of expect_equal() and tolerance about 1e-5.

On Wed, Apr 14, 2021 at 8:10 PM Dev Chakraborty @.***> wrote:

So I have been through this before; ha ha; thanks; can you make the change and issue a PR? dev

On Wed, Apr 14, 2021 at 8:08 PM Peter Phillips @.***> wrote:

Third option sounds good to me. I see the test-SsSampleSizeROC.R uses the expect_equivalent function, and sets a tolerance.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dpc10ster/RJafroc/issues/70#issuecomment-819927349, or unsubscribe https://github.com/notifications/unsubscribe-auth/AH4NJREV5AME3XDPWKTHGALTIYU6RANCNFSM426DIQ3A .

pwep commented 3 years ago

Alas, I am not in front of PC where I can do pull requests, or can test on a Windows/Linux machine, however the code should be straighforward.

Remove expect_equal(x1,x2) from the test-SsSampleSizeFroc.R test file and replace with...

expect_equivalent(x1$muMed, x2$muMed, tolerance=5e-4)
expect_equivalent(x1$lambdaMed, x2$lambdaMed, tolerance=5e-5)
expect_equivalent(x1$nuMed, x2$nuMed, tolerance=5e-5)
expect_equivalent(x1$ScaleFactor, x2$ScaleFactor, tolerance=5e-5)
expect_equivalent(x1$R2, x2$R2, tolerance=1e-7)

I'll let you change the tolerances to fit your requirements, and see if it works.

We probably want a structural test in there somewhere to make sure 5 and only 5 correctly named variables are returned from the function SsFrocNhRsmModel.

dpc10ster commented 3 years ago

Thanks a million Peter; I agree; I will take it from here; Dev

On Wed, Apr 14, 2021 at 8:32 PM Peter Phillips @.***> wrote:

Alas, I am not in front of PC where I can do pull requests, or can test on a Windows/Linux machine, however the code should be straighforward.

Remove expect_equal(x1,x2) from the test-SsSampleSizeFroc.R test file and replace with...

expect_equivalent(x1$muMed, x2$muMed, tolerance=5e-4) expect_equivalent(x1$lambdaMed, x2$lambdaMed, tolerance=5e-5) expect_equivalent(x1$nuMed, x2$nuMed, tolerance=5e-5) expect_equivalent(x1$ScaleFactor, x2$ScaleFactor, tolerance=5e-5) expect_equivalent(x1$R2, x2$R2, tolerance=1e-7)

I'll let you change the tolerances to fit your requirements, and see if it works.

We probably want a structural test in there somewhere to make sure 5 and only 5 correctly named variables are returned from the function SsFrocNhRsmModel.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dpc10ster/RJafroc/issues/70#issuecomment-819935693, or unsubscribe https://github.com/notifications/unsubscribe-auth/AH4NJRGV2Z4GTT3FM4KKO2LTIYX2FANCNFSM426DIQ3A .

pwep commented 3 years ago

A structural test could be comparing the list names (although this does not check the data types)

expect_identical( names(x1), names(x2) )

dpc10ster commented 3 years ago

Got it.

On Wed, Apr 14, 2021 at 8:40 PM Peter Phillips @.***> wrote:

A structural test could be comparing the list names (although this does not check the data types)

expect_identical( names(x1), names(x2) )

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dpc10ster/RJafroc/issues/70#issuecomment-819938220, or unsubscribe https://github.com/notifications/unsubscribe-auth/AH4NJRFA64N2XQU5AZPK2YLTIYYYPANCNFSM426DIQ3A .