ropensci / elastic

R client for the Elasticsearch HTTP API
https://docs.ropensci.org/elastic

Unexpected loss of digits in docs_bulk #279

Closed cphaarmeyer closed 3 years ago

cphaarmeyer commented 3 years ago

Not sure if this is a bug or intended. It caused me some problems.

When using docs_bulk_index on a data frame, only 4 decimal digits actually arrived in Elasticsearch. This is due to the call to jsonlite::toJSON in make_bulk_, which keeps only 4 decimal digits by default. Maybe set digits = NA?
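
The rounding can be seen without Elasticsearch at all. The following is a minimal sketch that assumes only that jsonlite is installed; it does not call make_bulk_ itself, it just compares the default toJSON call with one that passes digits = NA:

```r
library(jsonlite)

df <- data.frame(x = 1.23456789)

# Default call: digits = 4, so the value is rounded to 4 decimal places
json_default <- toJSON(df)

# digits = NA asks jsonlite to keep as much precision as it can
json_full <- toJSON(df, digits = NA)

print(json_default)
print(json_full)
```

Parsing json_full back with fromJSON should return the original value, while json_default has already lost the trailing digits.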

In my case I resolved this by writing my own custom function to create a bulk upload text file. But maybe someone else has a similar problem, and I didn't find an existing issue about this here.
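
A workaround along those lines could look roughly like this. It is only a sketch, not the function I actually used: write_bulk_file is a hypothetical helper (not part of elastic), and it assumes the usual bulk layout of one action line followed by one source line per document, with a `_type` field as older Elasticsearch versions expect.

```r
library(jsonlite)

# Hypothetical helper (not part of the elastic package): write a data frame
# as a newline-delimited bulk-index file, keeping full numeric precision.
write_bulk_file <- function(df, index, type, path, ids = seq_len(nrow(df))) {
  lines <- character(2 * nrow(df))
  for (i in seq_len(nrow(df))) {
    # Action line: tells Elasticsearch where to index the document
    action <- list(index = list(
      `_index` = index, `_type` = type, `_id` = as.character(ids[i])
    ))
    lines[2 * i - 1] <- as.character(toJSON(action, auto_unbox = TRUE))
    # Source line: one row as a JSON object; digits = NA avoids the rounding
    row <- as.list(df[i, , drop = FALSE])
    lines[2 * i] <- as.character(toJSON(row, auto_unbox = TRUE, digits = NA))
  }
  # writeLines ends every line, including the last, with a newline
  writeLines(lines, path)
  invisible(path)
}

bulk_path <- write_bulk_file(
  data.frame(x = 1.23456789),
  index = "tmp", type = "tmp", path = tempfile(fileext = ".json")
)
bulk_lines <- readLines(bulk_path)
```

The file at bulk_path can then be uploaded through the bulk endpoint; the source line keeps all digits of x.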

sckott commented 3 years ago

Share your R session info and, ideally, a reproducible example.

cphaarmeyer commented 3 years ago

I edited out the connection info.

library(elastic)

conn <- connect(...)

conn$es_ver()
#> [1] 561

index_create(conn, index = "tmp")
#> $acknowledged
#> [1] TRUE
#> 
#> $shards_acknowledged
#> [1] TRUE
#> 
#> $index
#> [1] "tmp"

df <- data.frame(x = 1.23456789)

docs_bulk_index(conn, df,
  index = "tmp", type = "tmp", es_ids = FALSE, doc_ids = 1
)
#>   |======================================================================| 100%
#> [[1]]
#> [[1]]$took
#> [1] 26
#> 
#> [[1]]$errors
#> [1] FALSE
#> 
#> [[1]]$items
#> [[1]]$items[[1]]
#> [[1]]$items[[1]]$index
#> [[1]]$items[[1]]$index$`_index`
#> [1] "tmp"
#> 
#> [[1]]$items[[1]]$index$`_type`
#> [1] "tmp"
#> 
#> [[1]]$items[[1]]$index$`_id`
#> [1] "1"
#> 
#> [[1]]$items[[1]]$index$`_version`
#> [1] 1
#> 
#> [[1]]$items[[1]]$index$result
#> [1] "created"
#> 
#> [[1]]$items[[1]]$index$`_shards`
#> [[1]]$items[[1]]$index$`_shards`$total
#> [1] 2
#> 
#> [[1]]$items[[1]]$index$`_shards`$successful
#> [1] 1
#> 
#> [[1]]$items[[1]]$index$`_shards`$failed
#> [1] 0
#> 
#> 
#> [[1]]$items[[1]]$index$created
#> [1] TRUE
#> 
#> [[1]]$items[[1]]$index$status
#> [1] 201

docs_get(conn, index = "tmp", type = "tmp", id = 1)
#> $`_index`
#> [1] "tmp"
#> 
#> $`_type`
#> [1] "tmp"
#> 
#> $`_id`
#> [1] "1"
#> 
#> $`_version`
#> [1] 1
#> 
#> $found
#> [1] TRUE
#> 
#> $`_source`
#> $`_source`$x
#> [1] 1.2346

index_delete(conn, index = "tmp")
#> $acknowledged
#> [1] TRUE

Created on 2021-02-15 by the reprex package (v1.0.0)

Session info

``` r
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 4.0.3 (2020-10-10)
#>  os       Windows 10 x64
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  German_Germany.1252
#>  ctype    German_Germany.1252
#>  tz       Europe/Berlin
#>  date     2021-02-15
#>
#> - Packages -------------------------------------------------------------------
#>  package     * version date       lib source
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.0)
#>  backports     1.2.1   2020-12-09 [1] CRAN (R 4.0.3)
#>  cli           2.3.0   2021-01-31 [1] CRAN (R 4.0.3)
#>  crayon        1.4.0   2021-01-30 [1] CRAN (R 4.0.3)
#>  crul          1.0.0   2020-07-30 [1] CRAN (R 4.0.2)
#>  curl          4.3     2019-12-02 [1] CRAN (R 4.0.0)
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
#>  elastic     * 1.1.0   2020-01-11 [1] CRAN (R 4.0.3)
#>  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.0)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.0)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  highr         0.8     2019-03-20 [1] CRAN (R 4.0.0)
#>  htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 4.0.3)
#>  httpcode      0.3.0   2020-04-10 [1] CRAN (R 4.0.0)
#>  jsonlite      1.7.2   2020-12-09 [1] CRAN (R 4.0.3)
#>  knitr         1.31    2021-01-27 [1] CRAN (R 4.0.3)
#>  lifecycle     0.2.0   2020-03-06 [1] CRAN (R 4.0.0)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
#>  pillar        1.4.7   2020-11-20 [1] CRAN (R 4.0.3)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.0)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.0)
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
#>  Rcpp          1.0.6   2021-01-15 [1] CRAN (R 4.0.3)
#>  reprex        1.0.0   2021-01-27 [1] CRAN (R 4.0.3)
#>  rlang         0.4.10  2020-12-30 [1] CRAN (R 4.0.3)
#>  rmarkdown     2.6     2020-12-14 [1] CRAN (R 4.0.3)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.0)
#>  stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.2)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.0)
#>  styler        1.3.2   2020-02-23 [1] CRAN (R 4.0.0)
#>  tibble        3.0.6   2021-01-29 [1] CRAN (R 4.0.3)
#>  triebeard     0.3.0   2016-08-04 [1] CRAN (R 4.0.0)
#>  urltools      1.7.3   2019-04-14 [1] CRAN (R 4.0.0)
#>  vctrs         0.3.6   2020-12-17 [1] CRAN (R 4.0.3)
#>  withr         2.4.1   2021-01-26 [1] CRAN (R 4.0.3)
#>  xfun          0.20    2021-01-06 [1] CRAN (R 4.0.3)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.0)
#>
#> [1] C:/Users/philipp/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-4.0.3/library
```

sckott commented 3 years ago

@cphaarmeyer This should work now. Reinstall with remotes::install_github("ropensci/elastic") and see the digits parameter: https://docs.ropensci.org/elastic/reference/docs_bulk_index.html