Closed romainFr closed 4 years ago
Where is this happening?
As far as I could tell, only in neuprint_find_neurons, so I just pushed b765c96426cc294f7fe9a58db9ad418a40155426 to fix it.
The only really safe base data type for a body id is actually a character vector. Numeric vectors always have the potential to lose precision during conversion to character (which happens when you make the JSON representation). In particular, they have about 53 bits of precision compared with the full 64 bit range of an integer id. I've been kind of hoping to avoid running into this, as it will be a pain to deal with.
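To make the 53 bit limit concrete, here is a minimal base R sketch. The id used is a made-up 19 digit value, not a real body id.

```r
# Doubles carry a 53 bit significand, so not every integer above 2^53 is representable
2^53 == 2^53 + 1               # TRUE: the +1 is silently lost

# A made-up id kept as a string versus pushed through a numeric vector
id_chr <- "9223372036854775806"
id_num <- as.numeric(id_chr)

# Converting back to text does not reproduce the original digits
roundtrip <- format(id_num, scientific = FALSE)
identical(roundtrip, id_chr)   # FALSE
```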
Agreed. There might be functions returning numeric vectors, so we should take a look at them.
I think we do need to convert them to numeric when passing them to the Cypher queries through jsonlite::toJSON(as.numeric(unlist(bodyids))), though. The alternative would be to replace toJSON with a custom-built string.
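A rough sketch of what the custom-built string alternative might look like, using base R only. The helper name ids_to_json is hypothetical and not part of the package. It assumes the ids arrive as character (or factor) values.

```r
# Hypothetical helper: build a JSON array of integers directly from character ids,
# so the ids never pass through a numeric vector.
ids_to_json <- function(bodyids) {
  # as.character() gives factor labels, not factor codes
  ids <- as.character(unlist(bodyids))
  # Accept only plain digit strings; anything else would yield invalid JSON
  if (!all(grepl("^[0-9]+$", ids))) {
    stop("bodyids must consist of digits only")
  }
  paste0("[", paste(ids, collapse = ","), "]")
}

ids_to_json(c("9223372036854775806", "12345"))
# "[9223372036854775806,12345]"
```

The digit check also guards against ids that have already been mangled upstream: a value that as.character() renders in scientific notation is rejected rather than sent to the server.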
The biggest integer that we need to worry about is around 2^63-1 = 9223372036854775807, the largest signed 64 bit integer (the examples below use 9223372036854775806). This cannot be represented exactly as a numeric.
# no good because we need a json integer
> jsonlite::toJSON("9223372036854775806")
["9223372036854775806"]
# no good because loses precision (even if you persuade it not to print in scientific form)
> jsonlite::toJSON(9223372036854775806)
[9.22337203685478e+18]
The best way is to use 64 bit integers via the bit64 package. You basically need to do this:
# character input
> jsonlite::toJSON(bit64::as.integer64("9223372036854775806"))
[9223372036854775806]
# actual bit64 input (largest integer64 minus one; as.integer64(2)^64 is outside the integer64 range)
> jsonlite::toJSON(bit64::lim.integer64()[2] - 1L)
[9223372036854775806]
So we could have an internal function that does the following
Then takes the result and jsonifies it.
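One possible shape for such an internal function, as a sketch only. The name bodyids_to_json and the exact steps are assumptions, not the code that was eventually merged. It needs the bit64 and jsonlite packages.

```r
# Hypothetical internal helper (name and behaviour are assumptions)
bodyids_to_json <- function(bodyids) {
  # Go via character so that factors contribute their labels, not their codes
  ids <- as.character(unlist(bodyids))
  # Parse as 64 bit integers; input that cannot be parsed becomes NA
  ids64 <- bit64::as.integer64(ids)
  if (any(is.na(ids64))) {
    stop("some bodyids could not be parsed as 64 bit integers")
  }
  # jsonlite writes integer64 values as plain JSON integers
  jsonlite::toJSON(ids64)
}

bodyids_to_json("9223372036854775806")
```

Numeric input still goes through as.character() here, so any precision already lost before the call cannot be recovered. Passing ids as character remains the safe option.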
@romainFr Just a note that I'll take this.
This discussion is now closed by #30.
Not sure when that changed, but I find that having the bodyids as factors (for example in the results of neuprint_find_neurons) is generally a bad idea, as they cannot be passed to other functions directly: as.numeric will transform them to the factor index, not the actual numeric bodyid. The only required change would be using stringsAsFactors = FALSE when creating the data.frames.
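A small base R illustration of the pitfall, using made-up body ids:

```r
# Made-up body ids stored as a factor, as a data.frame does with stringsAsFactors = TRUE
bodyid <- factor(c("5813027016", "1234567", "5813027016"))

as.numeric(bodyid)     # 2 1 2: the factor codes, not the ids
as.character(bodyid)   # "5813027016" "1234567" "5813027016"

# With stringsAsFactors = FALSE the ids stay as character
df <- data.frame(bodyid = c("5813027016", "1234567"), stringsAsFactors = FALSE)
class(df$bodyid)       # "character"
```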