traitecoevo / baad.data

Access data in baad

Determine why tests are failing on windows #6

Closed dfalster closed 8 years ago

dfalster commented 8 years ago

Tests of baad.data are failing on Windows machines (via Appveyor). In particular, the test checking that the objects return the same hash is failing. From Appveyor build1.07, we have

  1. Failure: ecology version (@test-baad-data.R#12) 

  Error: testthat unit tests failed

Line 12 is the test for object hash.

Running on my machine I get the same output as is encoded in the test:

d <- baad.data::baad_data("1.0.0")
storr:::hash_object(d)
[1] "7c59e15a5d56752775e8f8e9748e3556"

@RemkoDuursma can you confirm what you get on your windows machine when you run the above two lines?

It would be great if you could also download the baad.data repo and run the tests to see if they pass.

dfalster commented 8 years ago

@RemkoDuursma, can you please run the following to help me diagnose an error, then paste results in here? You'll need to change the variable

path <- tempfile()
library(baad.data)
d <- baad_data("1.0.0")
storr:::hash_object(d)
d <- baad_data("1.0.0", path)
storr:::hash_object(d)
baad_data_del("1.0.0", path)
sessionInfo()
RemkoDuursma commented 8 years ago
> path <- tempfile()
> library(baad.data)
> d <- baad_data("1.0.0")
> storr:::hash_object(d)
[1] "c1acf54690ec511801477ab3e83f0b95"
> d <- baad_data("1.0.0", path)
  |======================================================================================================| 100%
> storr:::hash_object(d)
[1] "21aaa8f37193d849b36c7346476afa13"
> baad_data_del("1.0.0", path)
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C                       LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] baad.data_1.0.1

loaded via a namespace (and not attached):
 [1] httr_1.2.1      R6_2.1.2        rsconnect_0.4.3 tools_3.3.1     curl_0.9.7      rappdirs_0.3.1 
 [7] datastorr_0.0.3 jsonlite_1.0    digest_0.6.9    bibtex_0.4.0    storr_1.0.1    
dfalster commented 8 years ago

Hi @richfitz, I need your help to resolve why one of our tests is failing. The origins of the problem seem to trace to deep within storr and datastorr, in particular the unpacking of downloaded products, so it requires the main architect's eyes.

The test that fails is at line 12, the test for the object hash. This test was presumably added to verify the integrity of the products, i.e. whether there are significant changes. Well, a breaking change has arisen, so I want to know whether it is serious.

I suspect the problem may have arisen following replumbing of storr or datastorr (e.g. in history here), but only surfaced now when I reran the tests locally and on a windows machine -- they still pass on travis.

Anyway, here is the issue. Run the following:

d <- baad_data("1.0.0")
storr:::hash_object(d)

We were expecting a value of "7c59e15a5d56752775e8f8e9748e3556". We do get this value on Travis, also in a docker container.

But on my mac (and also James's mac) we get:

> d <- baad_data("1.0.0")
> storr:::hash_object(d)
[1] "a8c493844b2054aba696fce3f13ddd9d"

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_AU.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] baad.data_1.0.1

loaded via a namespace (and not attached):
[1] R6_2.1.2        tools_3.3.0     rappdirs_0.3.1  datastorr_0.0.3
[5] digest_0.6.9    bibtex_0.4.0    storr_1.0.1 

Then on Remko's Windows machine we get a value of "21aaa8f37193d849b36c7346476afa13", which is different to the value obtained on Appveyor of "e4c3df9544f6312ba6b77bc0909ed8a4". (see previous comment for sessionInfo).

It's also worth noting that I USED to get the same result as linux machines on my mac. You can see that results have also changed for Remko - his comment above shows a different hash between the version cached on his machine (probably created some time ago) and a fresh download.

So in summary:

  1. Any idea why the hash has changed on my machine, but not on Travis or in Docker?
  2. Is it correct that we get a different hash on windows, linux and mac?
  3. If so, what shall we test to ensure integrity of the data products?
dfalster commented 8 years ago

(and for posterity, here is code for running in docker)

Launch docker container

docker run -it dfalster/baad

Then inside the container, launch R and run

devtools::install_github("richfitz/datastorr")
devtools::install_github("traitecoevo/baad.data")
d <- baad.data::baad_data("1.0.0")
storr:::hash_object(d)
richfitz commented 8 years ago

Interesting, and quite worrying. I'll try and replicate this today, and hopefully get onto it this week. My bet, if it involves unzipping, is that there's been some changes to R's unzip functions. The digest stuff should be pretty solid as it's depended on by heaps of packages

dfalster commented 8 years ago

Thanks, I'll await your findings.

richfitz commented 8 years ago

I get the expected hash on my Linux machine and on OS/X with R 3.2.3.

On windows with R 3.3.1 I get the same hash as remko.

richfitz commented 8 years ago

traced the issue so far to a change in what bibtex::read.bib(bib_file) has produced. The error is coming from the baad_unpack function. You can explore this with:

debug(baad.data:::baad_unpack)
path <- tempfile()
d <- baad.data::baad_data("1.0.0", path)

and step through (with n) until you've got to the point where baad[["bib"]] has been created.

richfitz commented 8 years ago

Here is the culprit:

[1] "Component “Markesteijn2009”: Component “abstract”: 1 string mismatch"

So, on windows bibtex is (I think) reading the \r\n as \n in the abstract for Markesteijn2009

As to what to do: you could try tweaking the baad_data function so that it agrees across all platforms? You're highly unlikely to recover the exact hash as before, but it should not matter that much as you won't change the upstream data at all.
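
As a minimal sketch of that kind of tweak, line endings could be normalised before hashing. This assumes the bibliography can be treated as a nested list of character strings; the real object returned by bibtex::read.bib is a bibentry, so it would need adapting. The helper name normalise_newlines and the toy data are hypothetical.

```r
# Hypothetical helper: convert Windows (\r\n) and lone \r line endings
# to \n in every character element of a nested list.
normalise_newlines <- function(x) {
  fix <- function(s) gsub("\r\n?", "\n", s)
  rapply(x, fix, classes = "character", how = "replace")
}

# Toy stand-ins for the same entry as read on two platforms
bib_unix <- list(Markesteijn2009 = list(abstract = "line one\nline two"))
bib_win  <- list(Markesteijn2009 = list(abstract = "line one\r\nline two"))

identical(bib_unix, bib_win)                      # FALSE
identical(bib_unix, normalise_newlines(bib_win))  # TRUE
```

Once both versions hold the same strings, any hash computed from them should agree as well, although it would not match the hash recorded before the change.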

dfalster commented 8 years ago

Wow, well found rich. Thanks very much for investigating. I'll implement a fix next week.

dfalster commented 8 years ago

Indeed, there are some bad line endings in bib files. Taking hash of just the data component of 1.0.0 produces consistent results for me on OSX and linux (via docker):

> d <- baad.data::baad_data("1.0.0")
  |======================================================================| 100%
> storr:::hash_object(d[["data"]])
[1] "16e346bcc5a49c10a3974b6ac149749f"

So I will adjust the test to check that hash on v1.0.0, and add a check on the hash for the entire object in a later release.

dfalster commented 8 years ago

We had some routines for cleaning line endings in baad, but these were only being applied to csv files.

dfalster commented 8 years ago

But it appears (to me) that it is not in fact the line endings (or possibly there are two different things going on), but rather bibtex's formatting of authors, which is behaving differently on OSX and linux. In particular, the way names like "van Breugel", "De Reffye", "von Lüpke" are handled.
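
A small illustration of how the same name can be split in two ways, using utils::person directly. It assumes the bib entries store authors as person objects, and the two objects here are toy values rather than anything taken from the package. Both print the same, but they are different objects and so would hash differently.

```r
# Two ways the same author might be parsed (toy values)
p_family <- person(given = "Michiel", family = "van Breugel")
p_given  <- person(given = c("Michiel", "van"), family = "Breugel")

# The formatted names are indistinguishable ...
format(p_family) == format(p_given)  # TRUE

# ... but the underlying objects differ
identical(p_family, p_given)         # FALSE

# unlist() exposes how the name parts were split
unlist(p_family)
unlist(p_given)
```

Comparing the printed author list is therefore not enough to spot this kind of difference; the structure has to be inspected.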

I've just been comparing the contents of d <- baad.data::baad_data("1.0.0") obtained on my mac and in docker (i.e. linux):

> d <- baad.data::baad_data("1.0.0")  #docker
  |======================================================================| 100%
> storr:::hash_object(d)
[1] "7c59e15a5d56752775e8f8e9748e3556"
> d2 <- readRDS("/root/data/1.0.0.OSX.rds") #OSX
> storr:::hash_object(d2)
[1] "a8c493844b2054aba696fce3f13ddd9d"
> storr:::hash_object(d[["data"]])
[1] "16e346bcc5a49c10a3974b6ac149749f"
> storr:::hash_object(d2[["data"]])
[1] "16e346bcc5a49c10a3974b6ac149749f"
> all.equal(d2, d)
[1] "Component “bib”: Component “Petritan2009”: Component “author”: Component 2: Component 1: Lengths (1, 2) differ (string compare on first 1)"  
[2] "Component “bib”: Component “Petritan2009”: Component “author”: Component 2: Component 2: 1 string mismatch"                                  
[3] "Component “bib”: Component “vanBreugel2011”: Component “author”: Component 1: Component 1: Lengths (1, 2) differ (string compare on first 1)"
[4] "Component “bib”: Component “vanBreugel2011”: Component “author”: Component 1: Component 2: 1 string mismatch"                                
[5] "Component “bib”: Component “Wang2011”: Component “author”: Component 7: Component 1: Lengths (1, 2) differ (string compare on first 1)"      
[6] "Component “bib”: Component “Wang2011”: Component “author”: Component 7: Component 2: 1 string mismatch"                                      

> compare_hash_elements <- function(x1, x2) {
+ 
+   i <- sapply(x1, storr:::hash_object) == sapply(x2, storr:::hash_object) 
+   names(x1)[!i]
+ }
> compare_hash_elements(d, d2)
[1] "bib"
> compare_hash_elements(d[["bib"]], d2[["bib2"]])
character(0)
> compare_hash_elements(d[["bib"]], d2[["bib"]])
[1] "Petritan2009"   "vanBreugel2011" "Wang2011"   

> d[["bib"]][["Wang2011"]]$author
[1] "Feng Wang"          "Mengzhen Kang"      "Qi Lu"             
[4] "Véronique Letort"   "Hui Han"            "Yan Guo"           
[7] "Philippe De Reffye" "Baoguo Li"         
> d2[["bib"]][["Wang2011"]]$author
[1] "Feng Wang"          "Mengzhen Kang"      "Qi Lu"             
[4] "Véronique Letort"   "Hui Han"            "Yan Guo"           
[7] "Philippe De Reffye" "Baoguo Li"         
> unlist(d2[["bib"]][["Wang2011"]]$author)
      given      family       given      family       given      family 
     "Feng"      "Wang"  "Mengzhen"      "Kang"        "Qi"        "Lu" 
      given      family       given      family       given      family 
"Véronique"    "Letort"       "Hui"       "Han"       "Yan"       "Guo" 
      given      family       given      family 
 "Philippe" "De Reffye"    "Baoguo"        "Li" 
> unlist(d[["bib"]][["Wang2011"]]$author)
      given      family       given      family       given      family 
     "Feng"      "Wang"  "Mengzhen"      "Kang"        "Qi"        "Lu" 
      given      family       given      family       given      family 
"Véronique"    "Letort"       "Hui"       "Han"       "Yan"       "Guo" 
     given1      given2      family       given      family 
 "Philippe"        "De"    "Reffye"    "Baoguo"        "Li" 

> unlist(d[["bib"]][["vanBreugel2011"]]$author)
     given1      given2      family       given      family       given 
  "Michiel"       "van"   "Breugel"  "Johannes"   "Ransijn"     "Dylan" 
     family       given      family      given1      given2      family 
   "Craven"     "Frans"   "Bongers" "Jefferson"        "S."      "Hall" 
> unlist(d2[["bib"]][["vanBreugel2011"]]$author)
        given        family         given        family         given 
    "Michiel" "van Breugel"    "Johannes"     "Ransijn"       "Dylan" 
       family         given        family        given1        given2 
     "Craven"       "Frans"     "Bongers"   "Jefferson"          "S." 
       family 
       "Hall" 

> unlist(d[["bib"]][["Petritan2009"]]$author)
     given     family     given1     given2     family      given     family 
     "Any" "Petriţan" "Burghard"      "von"    "Lüpke"      "Ion" "Petriţan" 
> unlist(d2[["bib"]][["Petritan2009"]]$author)
      given      family       given      family       given      family 
      "Any"  "Petriţan"  "Burghard" "von Lüpke"       "Ion"  "Petriţan" 
dfalster commented 8 years ago

Alas, even when just hashing the data element of version 1.0.0, the hash on windows differs from that on mac and linux (appveyor test still failing).

@RemkoDuursma can you please run the following and upload the two RDS files in a zip below? (github doesn't like rds but accepts zip)

library(baad.data)
d <- baad_data("1.0.0")
storr:::hash_object(d)
saveRDS(d, "win1.rds")
path <- tempfile()
d2 <- baad_data("1.0.0", path)
storr:::hash_object(d2)
saveRDS(d2, "win2.rds")
RemkoDuursma commented 8 years ago

Done. win12.zip

dfalster commented 8 years ago

Thanks heaps! Can you also confirm the output of following:

library(baad.data)
d <- baad_data("1.0.0")
storr:::hash_object(d[["data"]])
path <- tempfile()
d2 <- baad_data("1.0.0", path)
storr:::hash_object(d2[["data"]])
RemkoDuursma commented 8 years ago
> library(baad.data)
> d <- baad_data("1.0.0")
> storr:::hash_object(d[["data"]])
[1] "16e346bcc5a49c10a3974b6ac149749f"
> path <- tempfile()
> d2 <- baad_data("1.0.0", path)
  |=========================================================================================| 100%
> storr:::hash_object(d2[["data"]])
[1] "16e346bcc5a49c10a3974b6ac149749f"
dfalster commented 8 years ago

Good, that's what we get everywhere else! So why is appveyor giving a different result? (It's returning "bbb1a095d02a931e852d75e057340c71")

dfalster commented 8 years ago

And just to confirm, @richfitz was right -- there is an issue with line endings comparing windows output to linux and mac:

> d3 <- readRDS("/root/data/win2.rds") #windows fresh download
> storr:::hash_object(d3)
[1] "21aaa8f37193d849b36c7346476afa13"
> storr:::hash_object(d3[["data"]])
[1] "16e346bcc5a49c10a3974b6ac149749f"
> all.equal(d3, d)
[1] "Component “bib”: Component “Markesteijn2009”: Component “abstract”: 1 string mismatch"                                                       
[2] "Component “bib”: Component “Petritan2009”: Component “author”: Component 2: Component 1: Lengths (1, 2) differ (string compare on first 1)"  
[3] "Component “bib”: Component “Petritan2009”: Component “author”: Component 2: Component 2: 1 string mismatch"                                  
[4] "Component “bib”: Component “vanBreugel2011”: Component “author”: Component 1: Component 1: Lengths (1, 2) differ (string compare on first 1)"
[5] "Component “bib”: Component “vanBreugel2011”: Component “author”: Component 1: Component 2: 1 string mismatch"                                
[6] "Component “bib”: Component “Wang2011”: Component “author”: Component 7: Component 1: Lengths (1, 2) differ (string compare on first 1)"      
[7] "Component “bib”: Component “Wang2011”: Component “author”: Component 7: Component 2: 1 string mismatch"              
> all.equal(d3, d2)
[1] "Component “bib”: Component “Markesteijn2009”: Component “abstract”: 1 string mismatch"                        
> compare_hash_elements(d, d3)
[1] "bib"
> compare_hash_elements(d2, d3)
[1] "bib"
> compare_hash_elements(d2[["bib"]], d3[["bib"]])
[1] "Markesteijn2009"
> compare_hash_elements(d[["bib"]], d3[["bib"]])
[1] "Markesteijn2009" "Petritan2009"    "vanBreugel2011"  "Wang2011"       
> unlist(d3[["bib"]][["Wang2011"]]$author)
      given      family       given      family       given      family 
     "Feng"      "Wang"  "Mengzhen"      "Kang"        "Qi"        "Lu" 
      given      family       given      family       given      family 
"Véronique"    "Letort"       "Hui"       "Han"       "Yan"       "Guo" 
      given      family       given      family 
 "Philippe" "De Reffye"    "Baoguo"        "Li" 
> 
> all.equal(d[["data"]], d3[["data"]])
[1] TRUE
> all.equal(d2[["data"]], d3[["data"]])
[1] TRUE
dfalster commented 8 years ago

@richfitz Can I get your opinion on the solution implemented here.

A summary of the problem so far is

  1. The first problem is with the bibtex files. As you suggested, line endings are causing differences in hash for the bib element between windows and other platforms. In addition, names like "van Breugel", "De Reffye", "von Lüpke" are handled differently on linux, giving a second reason for hashes to differ.
  2. So I tried to just hash the data component. But this was behaving differently on appveyor to all other platforms. (I get same result for storr:::hash_object(baad[["data"]]) on my machine, Remko's machine, and travis). The code below shows that differences arise in columns where NA's are present, on appveyor platform compared to others.

The solution implemented was to hash just the 'data' component, after converting to character.
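
A rough sketch of that idea, using only base R. This is not storr's hash_object() and not the code used in the package; the helper hash_as_character is hypothetical. It hashes the text form of each column, so the in-memory representation of numeric values (including NA) cannot affect the result.

```r
# Hypothetical helper: hash a data frame via the character form of its columns.
hash_as_character <- function(df) {
  cols <- lapply(df, as.character)  # numeric NA becomes NA_character_
  # Flatten to one string; paste() writes missing values as "NA"
  txt <- paste(c(names(df), unlist(cols, use.names = FALSE)), collapse = "\n")
  f <- tempfile()
  on.exit(unlink(f))
  writeBin(charToRaw(enc2utf8(txt)), f)
  unname(tools::md5sum(f))          # md5 of the text, as a 32-character string
}

a <- data.frame(x = c(1, NA),  y = c("a", "b"), stringsAsFactors = FALSE)
b <- data.frame(x = c(1L, NA), y = c("a", "b"), stringsAsFactors = FALSE)
hash_as_character(a) == hash_as_character(b)  # TRUE: same values, different storage
```

The trade-off is that the check becomes coarser: a missing value and the literal string "NA" hash the same, and column types are ignored.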

dfalster commented 8 years ago

By saving artefacts on appveyor, I compared the data as loaded on appveyor to what I was getting on my machine:

compare_hash_elements <- function(x1, x2) {
   i <- sapply(x1, storr:::hash_object) == sapply(x2, storr:::hash_object) 
   names(x1)[!i]
}

compare_hash_vals <- function(x1, x2) {
   i <- sapply(x1, storr:::hash_object) == sapply(x2, storr:::hash_object) 
   x1[!i]
}

# download https://ci.appveyor.com/api/buildjobs/qjnops93iewg2xw2/artifacts/baad.data.Rcheck/tests/testthat/baad_1.0.0.rds

library(baad.data)
library(testthat)

d3 <- readRDS("~/Downloads/baad_1.0.0.rds") # from appveyor
d <- baad_data("1.0.0")  #local mac

for(x in names(d)){
   expect_identical(d[[x]], d3[[x]])
}

# note only differences noted is in the bib, due to both line endings and processing of names like "van ..., de ..."

# Now compare hashes of the 
compare_hash_elements(d, d3)

# Note hash for `data` differs
storr:::hash_object(d[["data"]])
storr:::hash_object(d3[["data"]])

# But R tells us the contents are identical
all.equal(d[["data"]], d3[["data"]])

# Looking at elements of "data" we can see that it's all the numerical values 
compare_hash_elements(d[["data"]], d3[["data"]])

# And same with "dictionary" -- it's the numerical vals that differ 
compare_hash_elements(d[["dictionary"]], d3[["dictionary"]])

# here's an example

storr:::hash_object(d[["data"]][["studyName"]])
storr:::hash_object(d3[["data"]][["studyName"]])

storr:::hash_object(d[["data"]][["latitude"]])
storr:::hash_object(d3[["data"]][["latitude"]])

# Now let's look at specific elements, seems NA's get treated differently

compare_hash_vals(d[["data"]][["latitude"]], d3[["data"]][["latitude"]])

i <- !is.na(d[["data"]][["latitude"]])
compare_hash_vals(d[["data"]][["latitude"]][!i], d3[["data"]][["latitude"]][!i])

compare_hash_vals(d[["data"]][["latitude"]][i], d3[["data"]][["latitude"]][i])

# Just to verify, let's check another variable
xvar <- "h.t"
i <- !is.na(d[["data"]][[xvar]])
compare_hash_vals(d[["data"]][[xvar]][!i], d3[["data"]][[xvar]][!i])
compare_hash_vals(d[["data"]][[xvar]][i], d3[["data"]][[xvar]][i])

# Now let's try it as a character (they're the same!)
xvar <- "h.t"
i <- !is.na(d[["data"]][[xvar]])
compare_hash_vals(as.character(d[["data"]][[xvar]][!i]), as.character(d3[["data"]][[xvar]][!i]))
compare_hash_vals(as.character(d[["data"]][[xvar]][i]), as.character(d3[["data"]][[xvar]][i]))

And here it is with output:

> compare_hash_elements <- function(x1, x2) {
+    i <- sapply(x1, storr:::hash_object) == sapply(x2, storr:::hash_object) 
+    names(x1)[!i]
+ }
> 
> compare_hash_vals <- function(x1, x2) {
+    i <- sapply(x1, storr:::hash_object) == sapply(x2, storr:::hash_object) 
+    x1[!i]
+ }
> 
> 
> # download https://ci.appveyor.com/api/buildjobs/qjnops93iewg2xw2/artifacts/baad.data.Rcheck/tests/testthat/baad_1.0.0.rds
> 
> library(baad.data)
> library(testthat)
> 
> d3 <- readRDS("~/Downloads/baad_1.0.0.rds")
> d <- baad_data("1.0.0")
> 
> 
> for(x in names(d)){
+    expect_identical(d[[x]], d3[[x]])
+ }
Error: d[[x]] not identical to d3[[x]].
Component “Markesteijn2009”: Component “abstract”: 1 string mismatch
Component “Petritan2009”: Component “author”: Component 2: Component 1: Lengths (2, 1) differ (string compare on first 1)
Component “Petritan2009”: Component “author”: Component 2: Component 2: 1 string mismatch
Component “vanBreugel2011”: Component “author”: Component 1: Component 1: Lengths (2, 1) differ (string compare on first 1)
Component “vanBreugel2011”: Component “author”: Component 1: Component 2: 1 string mismatch
Component “Wang2011”: Component “author”: Component 7: Component 1: Lengths (2, 1) differ (string compare on first 1)
Component “Wang2011”: Component “author”: Component 7: Component 2: 1 string mismatch
> 
> # note only differences noted is in the bib, due to both line endings and processing of names like "van ..., de ..."
> 
> # Now compare hashes of the 
> compare_hash_elements(d, d3)
[1] "data"       "dictionary" "bib"       
> 
> # Note hash for `data` differs
> storr:::hash_object(d[["data"]])
[1] "16e346bcc5a49c10a3974b6ac149749f"
> storr:::hash_object(d3[["data"]])
[1] "bbb1a095d02a931e852d75e057340c71"
> 
> # But R tells us the contents are identical
> all.equal(d[["data"]], d3[["data"]])
[1] TRUE
> 
> # Looking at elements of "data" we can see that it's all the numerical values 
> compare_hash_elements(d[["data"]], d3[["data"]])
 [1] "latitude"  "longitude" "map"       "mat"       "lai"       "age"      
 [7] "a.lf"      "a.ssba"    "a.ssbh"    "a.ssbc"    "a.shba"    "a.shbh"   
[13] "a.shbc"    "a.sbbh"    "a.stba"    "a.stbh"    "a.stbc"    "a.cp"     
[19] "a.cs"      "h.t"       "h.c"       "d.ba"      "d.bh"      "h.bh"     
[25] "d.cr"      "c.d"       "m.lf"      "m.ss"      "m.sh"      "m.sb"     
[31] "m.st"      "m.so"      "m.br"      "m.rf"      "m.rc"      "m.rt"     
[37] "m.to"      "a.ilf"     "ma.ilf"    "r.st"      "n.lf"      "n.ss"     
[43] "n.sb"      "n.sh"      "n.rf"      "n.rc"     
> 
> # And same with "dictionary" -- it's the numerical vals that differ 
> compare_hash_elements(d[["dictionary"]], d3[["dictionary"]])
[1] "minValue" "maxValue"
> 
> # here's an example
> 
> storr:::hash_object(d[["data"]][["studyName"]])
[1] "d5fffd896d2d985ecb4d2c9b22a0f6d7"
> storr:::hash_object(d3[["data"]][["studyName"]])
[1] "d5fffd896d2d985ecb4d2c9b22a0f6d7"
> 
> storr:::hash_object(d[["data"]][["latitude"]])
[1] "03286f419bc93e0755b4d7f48d3b7ade"
> storr:::hash_object(d3[["data"]][["latitude"]])
[1] "58350df6d4342fd7d8c020e2ba7156bc"
> 
> 
> # Now let's look at specific elements, seems NA's get treated differently
> 
> compare_hash_vals(d[["data"]][["latitude"]], d3[["data"]][["latitude"]])
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[26] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA
> 
> i <- !is.na(d[["data"]][["latitude"]])
> compare_hash_vals(d[["data"]][["latitude"]][!i], d3[["data"]][["latitude"]][!i])
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[26] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA
> 
> compare_hash_vals(d[["data"]][["latitude"]][i], d3[["data"]][["latitude"]][i])
numeric(0)
> 
> # Just to verify, let's check another variable
> xvar <- "h.t"
> i <- !is.na(d[["data"]][[xvar]])
> compare_hash_vals(d[["data"]][[xvar]][!i], d3[["data"]][[xvar]][!i])
   [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
  [25] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
...
> compare_hash_vals(d[["data"]][[xvar]][i], d3[["data"]][[xvar]][i])
numeric(0)
> 
> # Now let's try it as a character (they're the same!)
> xvar <- "h.t"
> i <- !is.na(d[["data"]][[xvar]])
> compare_hash_vals(as.character(d[["data"]][[xvar]][!i]), as.character(d3[["data"]][[xvar]][!i]))
character(0)
> compare_hash_vals(as.character(d[["data"]][[xvar]][i]), as.character(d3[["data"]][[xvar]][i]))
character(0)
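
The thread does not establish why the NA values hash differently on appveyor. One possible mechanism, offered only as an assumption, is that a numeric NA can exist with more than one bit pattern: R treats them all as the same NA, but serialize() (on which object hashes are typically based) writes the raw bits. The sketch below builds two such values by hand; the byte layouts are assumptions about a typical 64-bit build.

```r
# Build a double from 8 little-endian bytes
from_bytes <- function(b) {
  readBin(as.raw(b), what = "double", n = 1, endian = "little")
}

# Assumed bit pattern of R's usual NA_real_ (low word 1954 = 0x7A2)
na_a <- from_bytes(c(0xa2, 0x07, 0, 0, 0, 0, 0xf0, 0x7f))
# Same low word, but with the "quiet NaN" bit set in the high word
na_b <- from_bytes(c(0xa2, 0x07, 0, 0, 0, 0, 0xf8, 0x7f))

is.na(na_a) && is.na(na_b)                              # both are NA to R
identical(na_a, na_b)                                   # TRUE by default
identical(as.character(na_a), as.character(na_b))       # TRUE
identical(serialize(na_a, NULL), serialize(na_b, NULL)) # FALSE: the bytes differ
```

If something like this is happening, it would be consistent with the output above: all.equal() reports no difference, only the NA entries hash differently, and the difference disappears after as.character().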
>