colearendt / tidyjson

Tidy your JSON data in R with tidyjson

Getting 'argument "json.column" is missing' error, with no explanation why, cannot find troubleshooting documentation #119

Closed ajsnyder closed 4 years ago

ajsnyder commented 4 years ago

Code I have entered:

result <- jsonlite::fromJSON("testdata.json", simplifyDataFrame = TRUE)
result %>% spread_all

Error I receive:

Error in eval(assertion, env) : argument "json.column" is missing, with no default

The only semi-relevant issue I've found through Google is here, but it dates from 2017. Is dplyr compatibility still an issue?

colearendt commented 4 years ago

Hello! Thanks for sharing!! Can you share the version of tidyjson you are using, a sample of testdata.json, and a reprex of the error?

It's hard to say what issue you're running into without that sort of information. The reprex package can help you create a "reproducible example." This article explains how to do so, if you haven't before:

https://www.jessemaegan.com/post/so-you-ve-been-asked-to-make-a-reprex/

That said, I believe you need to call as.tbl_json() before you use the other tidyjson functions. That may be your issue, i.e. result %>% as.tbl_json() %>% spread_all()

ajsnyder commented 4 years ago

Version: tidyjson v0.2.4

Sample of testdata.json:

[ { "teams": ["West Virginia Mountaineers", "Texas Longhorns"], "commenceTime": 1582588800, "homeTeam": "Texas Longhorns", "lastUpdate": 1582518362, "spread": "5.5", "kenPom": [ { "team": "West Virginia Mountaineers", "match": { "target": "West Virginia", "rating": 0.6470588235294118, "data": { "rank": "7", "team": "West Virginia", "conference": "B12", "record": "19-8", "adjustedEfficiency": "+6.14", "adjustedOffense": "107.1", "adjustedDefence": "84.4", "adjustedTempo": "69.6", "luck": "-.032", "opponentOffense": "106.5", "opponentDefense": "95.7" } } }, { "team": "Texas Longhorns", "match": { "target": "Texas", "rating": 0.47058823529411764, "data": { "rank": "68", "team": "Texas", "conference": "B12", "record": "16-11", "adjustedEfficiency": "-0.09", "adjustedOffense": "103.1", "adjustedDefence": "92.4", "adjustedTempo": "66.1", "luck": "+.066", "opponentOffense": "106.0", "opponentDefense": "97.1" } } } ], "inpredictable": [ { "team": "West Virginia Mountaineers", "match": { "target": "West Virginia", "rating": 0.6470588235294118, "data": { "rank": "9", "team": "West Virginia", "conference": "B12", "record": "18-8", "genericPointsFavored": "15.5", "adjustedOffense": "6.1", "adjustedDefence": "9.4", "projectedWins": "20.726", "pastStrengthOfSchedule": "8.16153846153846", "futureStrengthOfSchedule": "10.75" } } }, { "team": "Texas Longhorns", "match": { "target": "Texas", "rating": 0.47058823529411764, "data": { "rank": "69", "team": "Texas", "conference": "B12", "record": "16-11", "genericPointsFavored": "7.4", "adjustedOffense": "-0.7", "adjustedDefence": "8.1", "projectedWins": "17.2528", "pastStrengthOfSchedule": "5.52086122266845", "futureStrengthOfSchedule": "12.85" } } } ] } ]

Reprex:

library(tidyjson)
#> 
#> Attaching package: 'tidyjson'
#> The following object is masked from 'package:stats':
#> 
#>     filter
workingDirectory <- "~/Desktop/R/LumpySportsPrincess"
setwd(workingDirectory)
result <- jsonlite::fromJSON("testdata.json")
colnames(result)
#> [1] "teams"         "commenceTime"  "homeTeam"      "lastUpdate"   
#> [5] "spread"        "kenPom"        "inpredictable"

result_flat <- result %>%
  gather_array %>%                                     # stack the users 
  spread_all %>%  
  select(teams) # select only what is needed
#> Error in eval(assertion, env): argument "json.column" is missing, with no default

Created on 2020-03-07 by the reprex package (v0.3.0)

ajsnyder commented 4 years ago

Reprex of result %>% as.tbl_json() %>% spread_all():

result_flat <- result %>% as.tbl_json() %>% spread_all()
#> Error in result %>% as.tbl_json() %>% spread_all(): could not find function "%>%"

Created on 2020-03-07 by the reprex package (v0.3.0)

colearendt commented 4 years ago

Thanks for sharing! Please note that could not find function "%>%" is an error message that means you have not loaded a package that provides the "pipe" operator. Please try the later reprexes after library(dplyr) or library(tidyjson). You have to load packages into the reprex before you can use functions provided by them 😄

i.e.

library(tidyjson)
#> 
#> Attaching package: 'tidyjson'
#> The following object is masked from 'package:stats':
#> 
#>     filter
workingDirectory <- "~/Desktop/R/LumpySportsPrincess"
setwd(workingDirectory)
result <- jsonlite::fromJSON("testdata.json")
colnames(result)
#> [1] "teams"         "commenceTime"  "homeTeam"      "lastUpdate"   
#> [5] "spread"        "kenPom"        "inpredictable"

result_flat <- result %>%
  as.tbl_json() %>%
  gather_array %>%                                     # stack the users 
  spread_all %>%  
  select(teams)

ajsnyder commented 4 years ago

Reprex (this time with the libraries loaded):

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyjson)
#> 
#> Attaching package: 'tidyjson'
#> The following object is masked from 'package:dplyr':
#> 
#>     bind_rows
#> The following object is masked from 'package:stats':
#> 
#>     filter
workingDirectory <- "~/Desktop/R/LumpySportsPrincess"
setwd(workingDirectory)
result <- jsonlite::fromJSON("testdata.json")
result_flat <- result %>% as.tbl_json() %>% spread_all()
#> Error in eval(assertion, env): argument "json.column" is missing, with no default

Created on 2020-03-07 by the reprex package (v0.3.0)
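
A note on where the message most likely comes from: jsonlite::fromJSON() returns a data frame, and the data frame method of as.tbl_json() appears to expect a json.column argument naming a column that holds raw JSON strings. The sketch below illustrates that argument on a made-up data frame; df and its id and json columns are hypothetical and are not taken from this issue.

```r
library(dplyr)
library(tidyjson)

# Hypothetical data frame: one ordinary column plus one column of JSON text
df <- data.frame(
  id = 1:2,
  json = c('{"a": 1}', '{"a": 2}'),
  stringsAsFactors = FALSE
)

# Name the JSON column explicitly so as.tbl_json() knows what to parse
out <- df %>%
  as.tbl_json(json.column = "json") %>%
  spread_all()

out
```

A data frame that jsonlite has already parsed has no such column of JSON text, which would explain why neither spread_all() nor as.tbl_json() can proceed from it.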

colearendt commented 4 years ago

Oops. Yeah, it looks like this error messaging could probably be cleaned up a bit. Try reading the file with as.tbl_json() instead of using jsonlite::fromJSON().

library(tidyjson)
#> 
#> Attaching package: 'tidyjson'
#> The following object is masked from 'package:stats':
#> 
#>     filter

packageVersion("tidyjson")
#> [1] '0.2.4'

json_str <- '[ { "teams": ["West Virginia Mountaineers", "Texas Longhorns"], "commenceTime": 1582588800, "homeTeam": "Texas Longhorns", "lastUpdate": 1582518362, "spread": "5.5", "kenPom": [ { "team": "West Virginia Mountaineers", "match": { "target": "West Virginia", "rating": 0.6470588235294118, "data": { "rank": "7", "team": "West Virginia", "conference": "B12", "record": "19-8", "adjustedEfficiency": "+6.14", "adjustedOffense": "107.1", "adjustedDefence": "84.4", "adjustedTempo": "69.6", "luck": "-.032", "opponentOffense": "106.5", "opponentDefense": "95.7" } } }, { "team": "Texas Longhorns", "match": { "target": "Texas", "rating": 0.47058823529411764, "data": { "rank": "68", "team": "Texas", "conference": "B12", "record": "16-11", "adjustedEfficiency": "-0.09", "adjustedOffense": "103.1", "adjustedDefence": "92.4", "adjustedTempo": "66.1", "luck": "+.066", "opponentOffense": "106.0", "opponentDefense": "97.1" } } } ], "inpredictable": [ { "team": "West Virginia Mountaineers", "match": { "target": "West Virginia", "rating": 0.6470588235294118, "data": { "rank": "9", "team": "West Virginia", "conference": "B12", "record": "18-8", "genericPointsFavored": "15.5", "adjustedOffense": "6.1", "adjustedDefence": "9.4", "projectedWins": "20.726", "pastStrengthOfSchedule": "8.16153846153846", "futureStrengthOfSchedule": "10.75" } } }, { "team": "Texas Longhorns", "match": { "target": "Texas", "rating": 0.47058823529411764, "data": { "rank": "69", "team": "Texas", "conference": "B12", "record": "16-11", "genericPointsFavored": "7.4", "adjustedOffense": "-0.7", "adjustedDefence": "8.1", "projectedWins": "17.2528", "pastStrengthOfSchedule": "5.52086122266845", "futureStrengthOfSchedule": "12.85" } } } ] } ]'

writeLines(json_str, "testdata.json")
read_from_file <- as.tbl_json("testdata.json")

read_from_file %>% 
  gather_array() %>%
  spread_all()
#> # A tbl_json: 1 x 6 tibble with a "JSON" attribute
#>   `attr(., "JSON"… document.id array.index commenceTime homeTeam lastUpdate
#>   <chr>                  <int>       <int>        <dbl> <chr>         <dbl>
#> 1 "{\"teams\":[\"…           1           1   1582588800 Texas L… 1582518362
#> # … with 1 more variable: spread <chr>

Created on 2020-03-08 by the reprex package (v0.3.0)
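
The output above has no teams column, because spread_all() only spreads scalar values and teams is an array. A minimal sketch of one way to pull the team names out, assuming a trimmed-down version of the sample JSON (the team.index and team column names are made up for illustration):

```r
library(dplyr)
library(tidyjson)

# Trimmed-down version of the sample document from this issue
json_str <- '[ { "teams": ["West Virginia Mountaineers", "Texas Longhorns"], "homeTeam": "Texas Longhorns", "spread": "5.5" } ]'

teams <- json_str %>%
  as.tbl_json() %>%
  gather_array() %>%               # one row per game in the outer array
  enter_object("teams") %>%        # step into the "teams" array
  gather_array("team.index") %>%   # one row per team
  append_values_string("team")     # store the team name as a character column

teams
```

Each nested array needs its own enter_object() and gather_array() step, so kenPom and inpredictable could be handled the same way before calling spread_all() on their elements.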