James-Thorson-NOAA / VAST

Spatio-temporal analysis of univariate or multivariate data, e.g., standardizing data for multiple species or stages
http://www.FishStats.org
GNU General Public License v3.0

Changes in results when upgrading R to version >= 4.0.0 #244

Closed hhamazaki closed 4 years ago

hhamazaki commented 4 years ago

Running VAST (v9_2_0), I got the following warnings:

Warning messages:
1: In showSRID(uprojargs, format = "PROJ", multiline = "NO") :
  Discarded datum Unknown based on WGS84 ellipsoid in CRS definition
2: In showSRID(uprojargs, format = "PROJ", multiline = "NO") :
  Discarded datum Unknown based on WGS84 ellipsoid in CRS definition
3: In sp::spTransform(posTri, CRSobj = map_data@proj4string) :
  NULL target CRS comment, falling back to PROJ string

In plot_results() section,

Skipping plot of quantile residuals

Warning messages:
1: In spTransform(xSP, CRSobj, ...) :
  NULL source CRS comment, falling back to PROJ string
2: In spTransform(xSP, CRSobj, ...) :
  NULL source CRS comment, falling back to PROJ string

Diag--Encounter_prob.png is not produced.

Also, compared to the previous results (with the same data, model settings, and plot settings), the latest model produced a greatly different ln_density spatial distribution map (see the attached "Latest run" and "Previous run" ln_density maps).

Here is the model setting:

settings = make_settings( n_x=50,
  Region="Other",
  purpose="index2",
  bias.correct=FALSE,
  FieldConfig=c("Omega1"=1, "Epsilon1"=1, "Omega2"=0, "Epsilon2"=0),
  Version="VAST_v9_2_0",
  use_anisotropy=TRUE )

Run model for limited area

fit = fit_model( "settings"=settings,
  "Lat_i"=data[,'Latitude'],
  "Lon_i"=data[,'Longitude'],
  "t_i"=data[,'Year'],
  "c_i"=rep(0,nrow(data)),
  "b_i"=data[,'Totalmale'],
  "a_i"=data[,'Swept_km2'],
  "Q_ik"=Q_ik,
  "observations_LL"=cbind("Lat"=data[,'Latitude'], "Lon"=data[,'Longitude']),
  getsd=TRUE,
  newtonsteps=1,
  maximum_distance_from_sample=50,
  knot_method="samples" )

I also ran this with VAST_v8_2_0, which I had run 6 months ago, and the results were the same as with VAST_v9_2_0.

James-Thorson-NOAA commented 4 years ago

Thanks for contacting me about this. VAST has many integrated tests that are run every time a new numbered release is issued, and I take backwards compatibility very seriously.

Comparing the run from 6 months ago and the run using the latest version, are the parameter estimates (i.e., in "parameter_estimates.txt") identical or different between the runs?

I'm on leave, but would be happy to respond regarding a backwards compatibility issue such as this.

hhamazaki commented 4 years ago

Yes, the parameters are different between the runs. I also updated R and Rtools, which is causing many other issues. I will find someone who has not updated R and can run the model, to see if they get the same/similar results as I got 6 months ago.

James-Thorson-NOAA commented 4 years ago

And what were the VAST and FishStatsUtils versions listed in packageVersion.txt for the old run? Have you tried re-installing those from GitHub and repeating the run?

Also would you be willing to share the data and code with me for me to compare between versions? I'd obviously like to figure out what might cause the difference.

Cole-Monnahan-NOAA commented 4 years ago

For what it's worth, I also get those warnings about the projections and have just been ignoring them. I figure that's something we can look into more when Jim is back (low priority).

The backwards compatibility issue is a much more serious thing. I would suggest creating a new issue for it with that in the title and giving a reproducible example if you can. Save the "fit" object as an RDS for the two versions so we can check all the components to see the differences. It might also be useful to check that the negative log-likelihood and gradients are the same at the initial values. You can do this by setting run_model=FALSE in the fit_model call and then

init.fn <- fit$tmb_list$Obj$fn()
init.gr <- fit$tmb_list$Obj$gr()

Save that output for the two versions as well, then optimize the models as normal (run_model=TRUE), then add on those with fit$init.fn=init.fn etc. before saving.
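For reference, a minimal sketch of that full sequence, re-using the fit_model() arguments from the original post ("fit_VAST_v9_2_0.rds" is a hypothetical file name):

fit0 = fit_model( "settings"=settings, "Lat_i"=data[,'Latitude'], "Lon_i"=data[,'Longitude'],
  "t_i"=data[,'Year'], "c_i"=rep(0,nrow(data)), "b_i"=data[,'Totalmale'],
  "a_i"=data[,'Swept_km2'], "Q_ik"=Q_ik, run_model=FALSE )   # build the TMB object without optimizing
init.fn <- fit0$tmb_list$Obj$fn()   # negative log-likelihood at initial values
init.gr <- fit0$tmb_list$Obj$gr()   # gradient at initial values

fit = fit_model( "settings"=settings, "Lat_i"=data[,'Latitude'], "Lon_i"=data[,'Longitude'],
  "t_i"=data[,'Year'], "c_i"=rep(0,nrow(data)), "b_i"=data[,'Totalmale'],
  "a_i"=data[,'Swept_km2'], "Q_ik"=Q_ik, run_model=TRUE )    # optimize as normal
fit$init.fn <- init.fn   # attach the initial-value diagnostics before saving
fit$init.gr <- init.gr
saveRDS( fit, file="fit_VAST_v9_2_0.rds" )   # repeat under the other package version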

I know it is a lot of work to switch versions of everything so it's worth saving all the potentially relevant pieces so you only have to do it once (hopefully!).

James-Thorson-NOAA commented 4 years ago

I'd still be curious to revisit this thread. What were the VAST and FishStatsUtils versions listed in packageVersion.txt in the results for the "Previous run" and for the "Latest run"?

I will then try running those two versions on a different data set and see if I can replicate the issue.

James-Thorson-NOAA commented 4 years ago

For what it's worth, I have just tried running the "simple example" on the Wiki using VAST 3.3.0 / FishStatsUtils 2.5.0 / CPP 8.5.0 (my best guess at the "Previous run") and using VAST 3.5.0 / FishStatsUtils 2.7.0 / CPP 9.4.0 (my guess at the "Latest run"), but modifying the example to use Region="Other" to match that characteristic of the context. I get the same MLE for both runs (see code below). I don't doubt that there is some issue to be resolved here, but I'm also not sure how best to proceed to track down the potential dependencies that could account for our difference (backwards compatibility working on my machine but not on yours). Perhaps it's worth emailing me directly so we can explore the issue in more detail and report back.

# Download latest release number; it's useful for reproducibility to use a specific release number
#devtools::install_local("C:/Users/James.Thorson/Desktop/Work files/AFSC/2020-08 -- GitHub issue 244/FishStatsUtils-2.5.0.zip", dep=FALSE, force=TRUE)
#devtools::install_local("C:/Users/James.Thorson/Desktop/Work files/AFSC/2020-08 -- GitHub issue 244/VAST-3.3.0.zip", dep=FALSE, force=TRUE)
devtools::install_local("C:/Users/James.Thorson/Desktop/Work files/AFSC/2020-08 -- GitHub issue 244/FishStatsUtils-2.7.0.zip", dep=FALSE, force=TRUE)
devtools::install_local("C:/Users/James.Thorson/Desktop/Work files/AFSC/2020-08 -- GitHub issue 244/VAST-3.5.0.zip", dep=FALSE, force=TRUE)

# Set local working directory (change for your machine)
setwd( "C:/Users/James.Thorson/Desktop/Work files/AFSC/2020-08 -- GitHub issue 244" )

# Load package
library(VAST)

# load data set
# see `?load_example` for list of stocks with example data
# that are installed automatically with `FishStatsUtils`.
example = load_example( data_set="EBS_pollock" )

# Make settings (turning off bias.correct to save time for example)
settings = make_settings( n_x=100,
  Region="Other",
  purpose="index",
  strata.limits=example$strata.limits,
  bias.correct=FALSE )

# Run model
fit = fit_model( settings=settings,
  Lat_i=example$sampling_data[,'Lat'],
  Lon_i=example$sampling_data[,'Lon'],
  t_i=example$sampling_data[,'Year'],
  c_i=rep(0,nrow(example$sampling_data)),
  b_i=example$sampling_data[,'Catch_KG'],
  a_i=example$sampling_data[,'AreaSwept_km2'],
  v_i=example$sampling_data[,'Vessel'],
  observations_LL=cbind("Lat"=example$sampling_data[,'Lat'],"Lon"=example$sampling_data[,'Lon']),
  getsd=FALSE,
  newtonsteps=0 )
James-Thorson-NOAA commented 4 years ago

OK: after some more investigating, I could replicate the problem using vanilla R 4.0.2 where:

  1. CPP V 9.4.0 doesn't compile as reported separately (#242)
  2. Backwards compatibility is broken, i.e., the simple example gives one MLE (and hence parameter estimates, predictions, etc.) using R 4.0.2 and another using my previous MRAN 3.5.3 install.

It appears that the issue arises because R has changed how the stats::kmeans function interacts with set.seed, such that previous versions would generate one set of knots (based on data or the extrapolation grid) and newer versions generate a slightly different set of knots (based on the same locations). That is, different R installations give different results for the following code.

  example = load_example( data_set="EBS_pollock" )
  set.seed( round(1) );
  # The resulting cluster centers (i.e., knot locations) differ between R 3.5.3 and R >= 3.6.0
  Tmp = stats::kmeans( x=example$sampling_data[,c("Lon","Lat")], centers=settings$n_x, iter.max=1000, nstart=1, trace=0)
  summary(Tmp$centers)

One work-around would be to save the "Kmeans.RData" file from a previous run using an older R version, and put it in the RunDir, such that the new run uses the same kmeans result and hence achieves the same MLE.
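For example, a minimal sketch of that work-around (the directory names here are hypothetical):

old_dir = "C:/VAST_runs/run_R3.5.3"   # run directory from the older R version
new_dir = "C:/VAST_runs/run_R4.0.2"   # run directory for the new R version
# Re-use the knot locations from the old run, so both runs share the same kmeans result
file.copy( from=file.path(old_dir,"Kmeans.RData"),
           to=file.path(new_dir,"Kmeans.RData"), overwrite=TRUE )
# ... then re-run fit_model() with its working directory pointed at new_dir, so it
# loads the existing Kmeans.RData rather than calling stats::kmeans() again.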

I will have to think about how to resolve the issue more generally, and whether a solution can be found that retains exact backwards compatibility.

James-Thorson-NOAA commented 4 years ago

@hhamazaki Would you be willing to test this diagnosis by putting the old Kmeans.RData file in your new RunDir under the "Latest version" and re-running, to see if that achieves the same result as your "Previous run"?

Cole-Monnahan-NOAA commented 4 years ago

@James-Thorson-NOAA that might have been unintentional and a bug. It might be worth reporting. It's going to be an issue for a lot of downstream packages/analyses.

James-Thorson-NOAA commented 4 years ago

@Cole-Monnahan-NOAA sorry, what are you saying was unintentional and a bug?

I intend next to confirm my diagnosis that the "broken backwards compatibility" occurring between R 3.5.3 and R 3.6.3 is due to a change in how setting the random seed affects the kmeans(.) function, and hence can be resolved where necessary by appropriately re-using the "Kmeans.RData" object from old runs. I hope also to determine what causes this change in kmeans(.).

Cole-Monnahan-NOAA commented 4 years ago

I mean that the kmeans change could be a bug. That's a big deal to break backwards compatibility. So a bug report to R devs I guess? Just a thought.

Cole-Monnahan-NOAA commented 4 years ago

We really need to start a new issue with a better title for this.

OK I was able to recreate in base R and isolate the source of the error. It's a change in the sample.int function, which is used to select random rows as the initial centers. This was changed in 3.6.0.

set.seed(121)
sample.int(100, 5)
# R 3.5.3
# [1] 40 95 54 74 53

# R 3.6.3
# [1] 52 28 12 71  4

See the change log here: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17494

I think you can avoid this in two ways. First, by manually initializing the algorithm by specifying the initial centers instead of the number of centers you want. E.g., instead of this:

kmeans( x, centers=n_x )

change to

kmeans( x, centers=x[1:n_x,] )

Then the algorithm is deterministic and backwards compatible for the versions of R I looked at (3.5.3 and 3.6.3), and it would stay compatible moving forward (although it is unlikely they will change their RNG approach again).
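To illustrate (a toy demonstration on made-up coordinates, not VAST code): with explicit initial centers, kmeans no longer touches the random-number generator, so repeated calls agree without any set.seed().

x = cbind( Lon=runif(200, -170, -160), Lat=runif(200, 55, 62) )   # fake sampling locations
n_x = 10
k1 = stats::kmeans( x, centers=x[1:n_x,], iter.max=1000 )
k2 = stats::kmeans( x, centers=x[1:n_x,], iter.max=1000 )
identical( k1$centers, k2$centers )   # TRUE, regardless of R version or RNG state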

The second approach is to use the old RNG, by setting it (see ?RNGkind),

RNGkind(sample.kind="Rounding")

I don't know if this is a great solution b/c it will set the RNG for the whole R session, and could interact with other components and make them non-reproducible. But for users trying to reproduce an old fit this is the best approach.
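For example, a sketch of reproducing the pre-3.6.0 sampling in a current R session (R >= 3.6.0 warns that "Rounding" is the old, non-default sampler; that warning is expected):

RNGkind( sample.kind="Rounding" )   # use the pre-3.6.0 sample() behavior
set.seed(121)
sample.int(100, 5)
# [1] 40 95 54 74 53   <- should match the R 3.5.3 output quoted above
RNGkind( sample.kind="Rejection" )   # restore the current default sampler afterwards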

James-Thorson commented 4 years ago

So do you agree that the first approach is superior?

I'll look into a hotfix intended for VAST release 3.5.2, and will likely try running integrated tests overnight for release tomorrow.

Cole-Monnahan-NOAA commented 4 years ago

I disagree. It's superior in the long-term, because it ensures forward compatibility, but it breaks backwards compatibility. So I think overall it's not a good approach. Another potential issue is how to select the rows. I put the first ten, but for multivariate models you might have identical coordinates for all these.

Honestly, I would strongly consider ignoring this and having a note that if you cross the 3.6.0 threshold you may get different answers, and to carry along the .RData file or set the RNG like above. Presumably the model fits are very similar with slightly different meshes; if they aren't, then your model probably isn't reliable anyway. Assuming R doesn't change their sampling functions again, then you'll be forward compatible for anything > 3.6.0.

I'm curious how your tests are set up. You must set a seed, right? None of the wiki examples have seeds, so they technically wouldn't be reproducible using the default kmeans behavior when you specify the number of centers. Note that the algorithm can and will sometimes find the same clusters given different initializations, and so you will get a reproducible answer almost by accident. So consider this when running integrated tests.

James-Thorson-NOAA commented 4 years ago

All of the integrated tests are here: https://github.com/James-Thorson-NOAA/VAST/tree/master/tests/testthat. Early integrated tests (typically involving simple features) used the Kmeans.RData file from the original run (and hence don't illustrate this whole issue). I did this to save time on Travis CI -- it had a limit on the total number of test minutes, so it mattered when I still used Travis. Later integrated tests (involving more recent features) typically run kmeans(.) each time, but I didn't notice this issue because I was still using R 3.5.3.

Also, I should have said earlier: thanks @Cole-Monnahan-NOAA! I don't know that I would have thought to check the change-log, and really appreciate your insight in tracking down the cause.

James-Thorson-NOAA commented 4 years ago

@Cole-Monnahan-NOAA I'd like to maintain backwards compatibility (not just in expectation, but identical) if there's a reasonable way to do so.

What about using the 2nd approach but modifying Calc_Kmeans to do (using pseudocode):

RNGkind_orig = RNGkind()
if( length(RNGkind_orig)==3 ) RNGkind( sample.kind="Rounding" )
[THEN DO EXISTING FUNCTION OPERATIONS]
if( length(RNGkind_orig)==2 ) RNGkind( kind=RNGkind_orig[1], normal.kind=RNGkind_orig[2] )
if( length(RNGkind_orig)==3 ) RNGkind( kind=RNGkind_orig[1], normal.kind=RNGkind_orig[2], sample.kind=RNGkind_orig[3])

This seems like a relatively simple way to accommodate the RNGkind() format both pre- and post-change in sample.int, and also hopefully to maintain backwards compatibility. If you see no problems, I could try exploring it...?

Cole-Monnahan-NOAA commented 4 years ago

I like your commitment to backward compatibility, but in this case R broke it b/c of a bug fix. That's a very valid reason to break it in VAST. At some point one of the dependent packages will change and you'll get something different (e.g., when TMB switches to a different AD library), breaking backward compatibility. I think the key is that a user can reproduce an analysis exactly and then bridge it to the new versions. There are three ways to do this: (1) using the .RData file, (2) changing the RNG, or (3) reverting to the version of R and the dependent packages that were used. That is plenty flexible.
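As a sketch of option (3), the package side can be pinned by installing the versions recorded in packageVersion.txt from the old run (the tag names below are examples; check the Releases page of each repository for the exact tags):

devtools::install_github( "James-Thorson-NOAA/FishStatsUtils@2.5.0" )   # example tag
devtools::install_github( "James-Thorson-NOAA/VAST@3.3.0" )             # example tag
# Reverting R itself (e.g., back to 3.5.3) still has to be done separately,
# for example via an MRAN snapshot or a Docker image pinned to that R version.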

I don't like the RNGkind-wrapping approach proposed above b/c it's a bit of a hack and could cause very unexpected behavior if something unforeseeable went wrong.

I think we should loop in the other steering committee members @ChristineStawitz-NOAA and @ericward-noaa to see if they have thoughts.

James-Thorson-NOAA commented 4 years ago

OK, I just started a separate email thread.

FWIW, I just tested the code below and it works on R 3.5.3 and R 4.0.2 (i.e., appears to fix the cause of the original issue):

  RNGkind_orig = RNGkind()                                          # record the current RNG settings
  if( !any(length(RNGkind_orig)==c(2,3)) ) stop("Check code")       # guard against future changes to RNGkind()
  if( length(RNGkind_orig)==3 ) RNGkind( sample.kind="Rounding" )   # use pre-3.6.0 sampling on newer R
  set.seed( round(1) );
  Tmp = stats::kmeans( x=example$sampling_data[,c("Lon","Lat")], centers=settings$n_x, iter.max=1000, nstart=1, trace=0)
  summary(Tmp$centers)
  # Restore the original RNG settings afterwards
  if( length(RNGkind_orig)==2 ) RNGkind( kind=RNGkind_orig[1], normal.kind=RNGkind_orig[2] )
  if( length(RNGkind_orig)==3 ) RNGkind( kind=RNGkind_orig[1], normal.kind=RNGkind_orig[2], sample.kind=RNGkind_orig[3])

Also, I'm doubtful it would have any unforeseeable consequences, given that it's partitioned within the Calc_Kmeans(.) function, has a logical test for detecting future changes in RNGkind, and Calc_Kmeans(.) would revert RNGkind to the original system state after being called.

sean-rohan-NOAA commented 4 years ago

The PROJ warnings pop up because R spatial packages switched from PROJ4 to PROJ6/GDAL3 for datum transformations. The change means that WGS84 is no longer used as an intermediary for transformations. All sorts of warnings can appear when using PROJ4 strings instead of WKT2 CRS. http://rgdal.r-forge.r-project.org/articles/PROJ6_GDAL3.html

James-Thorson-NOAA commented 4 years ago

Thanks @sean-rohan-NOAA for clarifying this and pointing to that document.

I independently worked out that these warning messages in R 4.0.2 go away when switching FishStatsUtils::project_coordinates(.) to use orig_args = "+proj=longlat +datum=WGS84" instead of orig_args = "+proj=longlat +ellps=WGS84", and that this change doesn't otherwise change any of my integrated tests for R 3.5.3 or 4.0.2. It appears that with the new PROJ6 format, the datum and ellps arguments are both unnecessary, but they might be helpful to retain for backwards compatibility. If anyone has any more information (or a correction to my logic here) please feel free to post here.
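A toy comparison of the two strings (not VAST code; whether either call warns depends on your sp/rgdal/PROJ versions):

library(sp)
crs_ellps = CRS("+proj=longlat +ellps=WGS84")   # ellipsoid only; reportedly triggers the "Discarded datum" warning under PROJ6
crs_datum = CRS("+proj=longlat +datum=WGS84")   # names the datum; reportedly loads cleanly under PROJ6
cat( comment(crs_datum) )                       # WKT2 representation, if rgdal attached one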

I intend to make this change at the next planned release (FishStatsUtils 2.8.0) rather than as a hotfix (2.7.1) given that the problem appears to only cause an annoying warning rather than any error or functionality problem.

ericward-noaa commented 4 years ago

Thanks for the efforts getting to the bottom of this @Cole-Monnahan-NOAA and @James-Thorson . One bigger-picture thought: given the dependencies in VAST, issues with TMB compilers, etc., maybe it'd be worth sinking effort into archiving these VAST releases on Docker. There are a few ways to do this with 'rocker', 'containerit', etc. This would basically be an alternative to Cole's option (3): 'revert to the used version of R and the dependent packages'.

ChristineStawitz-NOAA commented 4 years ago

I agree with Cole's point earlier about a note in the release notes for 3.6+ regarding changes in the estimates pre- and post-version 3.6.0. I think Eric's suggestion would be even fancier; of the tools that build a Docker container out of existing software, I've heard good things about Binder.

I am pretty hardline about not maintaining backwards compatibility. I wrote a bit of a manifesto about it here. I suppose this is colored by my time at Microsoft where backwards compatibility is responsible for 99% of the sticky coding problems and bugs that Windows is notorious for. Generally on these topics I will advocate for breaking backwards compat over adding another if statement to code every time.

James-Thorson-NOAA commented 4 years ago

Well, I'm happy to think more about points from you both @Cole-Monnahan-NOAA and @ChristineStawitz-NOAA. While I do, do you have any advice about how best to communicate instances where backwards compatibility is broken due to dependencies (base-R changing integer-sampling, affecting kmeans)? I could envision:

  1. Adding lines to the start-up message informing users;
  2. Adding something to the roxygen reference documentation (although I'm not sure which function it belongs under; I guess fit_model is the most generic and hence the best place for stuff like this).

I think that if backwards compatibility is broken, it needs to be clearly communicated, with an option for users to see why (and understand that it's not a bug in previous versions, or a new bug, or something like that).

Cole-Monnahan-NOAA commented 4 years ago

Jim: would you please edit the title of this thread to better reflect the content? We've diverged greatly and I want others to be able to see and participate in this discussion and frankly they'll miss it with the current title.

I think Christine makes some really good points in her manifesto, many of which I hadn't thought of before. Specifically, if you aren't explicitly testing each previous version then you can't claim it's backwards compatible. Looking at your tests this is not true. "If it is expected that users continue to move forward (for example, updating their R version), there is no need to support backwards compatibility." This is definitely the case with VAST. The changes others make to dependent packages are out of your control (e.g., R changing sample). Furthermore, anecdotally it's already broken in my experience. I have helped many users who could not get old code to run after just a single VAST or FishStatsUtils update.

But really, I think what we want out of VAST is not backward compatibility. A user should not be using versions of VAST from 2 years ago. What they should be able to do is reproduce an analysis from 2 years ago, and bridge it into the current version. This is especially true in the context of scientific software. Eric gave one fancy, more comprehensive, way of doing that, but we also showed how to do it in this specific case by reusing the kmeans Rdata output, or changing the RNG.

I'm not worried about the extra computation times due to the extra if statements. But it makes the code very difficult to read and therefore develop, check, and test. That's an extra burden on Jim and hurdle for others to contribute. It seems to me the extra time maintaining that backwards compatible code would be better spent in other places (testing, documentation, etc.). If analyses are reproducible then that is sufficient in my mind. I would recommend writing a short wiki article about reproducibility and then potentially printing the URL on starting up VAST.

In short, I think VAST would benefit moving forward by eliminating the burden of (trying to) maintain all this legacy code.

ChristineStawitz-NOAA commented 4 years ago

I think a version doc like this would be appropriate; I always read these when a new R version comes out. You could also reference the R change that led to this change in behavior in such a doc. I also think you could leave the if code you have, but instead of changing the behavior, print a warning about the RNG change if someone is using an old version of R.

Adding to the documentation, in addition to some kind of reference doc, is sufficient I think. A start-up message feels heavy-handed.

I agree the goal is not for people to be using legacy versions of VAST, but to be able to reproduce prior analyses. A buddy tool, kind of like what ICES have in the TAF, where you can reproduce VAST models by referencing specific versions in a structured way, could be useful. You have a really good versioning system and history, so it is already quite easy to find older versions of VAST if you need them, rather than assuming people are running original VAST models using the current version.

James-Thorson-NOAA commented 4 years ago

We have identified the cause of this original issue (where results changed when upgrading to R >= 4.0.0). The next numbered release will include a fix, with tests for backwards compatibility. I'm therefore closing this specific issue, but invite ongoing discussion about backwards compatibility in #245.