ajschumacher / rjstat

read and write JSON-stat with R

My rough ideas on making a jsonstat object #14

Closed MansMeg closed 8 years ago

MansMeg commented 8 years ago

Hi @hmalmedal and @ajschumacher !

Here is a small pull request with my ideas on how to extend the usage of the rjstat package. I suggest adding a new jsonstat object from which the JSON-stat type classes (dataset etc.) can inherit.

This makes it possible to work directly with jsonstat objects, print them, and add relevant methods. It is also possible to check that the JSON object is a valid JSON-stat object (for now this validation is really light-weight; hopefully we can use a JSON schema validator later on if one gets implemented). My next step is to implement a "get_data.frame" method or similar using the parse_dataset function.
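A rough sketch of that idea, to make the proposal concrete. Everything here is hypothetical: the names new_jsonstat_sketch() and print.jsonstat(), the "jsonstat_<class>" subclass naming, and the choice of checks are assumptions for illustration, not the code in the pull request. It only assumes jsonlite's fromJSON() and the three JSON-stat 2.0 response classes.

```r
library(jsonlite)

# Hypothetical constructor (not the rjstat implementation): wraps a parsed
# JSON-stat list in an S3 class, with a subclass per JSON-stat class.
new_jsonstat_sketch <- function(x) {
  stopifnot(is.list(x))
  # Light-weight validation: only the top-level "class" field is checked.
  known <- c("dataset", "dimension", "collection")
  if (!is.character(x$class) || length(x$class) != 1 || !(x$class %in% known)) {
    stop("not a recognised JSON-stat class")
  }
  structure(x, class = c(paste0("jsonstat_", x$class), "jsonstat", "list"))
}

# Shared print method, inherited by every jsonstat_* subclass.
print.jsonstat <- function(x, ...) {
  cat("JSON-stat", x$class, "object:", x$label, "\n")
  invisible(x)
}

js <- new_jsonstat_sketch(fromJSON(
  '{"version":"2.0","class":"dataset","label":"A dataset",
    "value":[1],"id":["testdimension"],"size":[1]}'
))
print(js)
```

Methods defined for "jsonstat" apply to every subclass, while a method for, say, "jsonstat_dataset" can override them where a dataset needs special handling.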

What do you think?

hmalmedal commented 8 years ago

You propose a new API that can co-exist with the old API? I think that is a sound approach.

MansMeg commented 8 years ago

Alright! Now I'm done with my first additions and all tests pass. As you mention, @hmalmedal, the old API still works. I've added a new API where one can work directly with jsonstat objects, since this is of interest to me right now. The data.frame parsing, though, is plugged into the parse_dataset() function.

Some example code you could try out if you would like to see what I have added to the package:

js <- as.jsonstat("http://json-stat.org/samples/oecd.json")
js
js[1,2:5,]
js[[1,2:5,]]
as.data.frame(js[1,2:5,])
as.array(js[1,2:5,])
js[1,1,1] <- 100
as.data.frame(js[1,2:5,])

A testsuite for the new functionality is added as well.

hmalmedal commented 8 years ago

Looks nice.

Something I found difficult when I worked on this was keeping track of which elements should use unbox. Just using auto_unbox won't work.

MansMeg commented 8 years ago

Hi! Thanks!

That's a good suggestion. I guess that problem would be in the as.character() method? I have a unit test for this, and it does not fail.

hmalmedal commented 8 years ago

> dataset <- as.jsonstat("tests/testthat/dataset.json")
> as.character(dataset)
{
  "version": "2.0",
  "class": "dataset",
  "label": "A dataset",
  "value": 1,
  "id": "testdimension",
  "size": 1,
  "dimension": {
    "testdimension": {
      "label": "A dimension",
      "category": {
        "index": "testcategory",
        "label": {
          "testcategory": "A category"
        }
      }
    }
  }
} 

Everything is unboxed.

Compare with the original file:

{
  "version": "2.0",
  "class": "dataset",
  "label": "A dataset",
  "value": [
    1
  ],
  "id": [
    "testdimension"
  ],
  "size": [
    1
  ],
  "dimension": {
    "testdimension": {
      "label": "A dimension",
      "category": {
        "index": [
          "testcategory"
        ],
        "label": {
          "testcategory": "A category"
        }
      }
    }
  }
}

MansMeg commented 8 years ago

Alright! Thanks! I'll put that into a unit test!

How do you solve this? Do you have a function to "box" correctly?

MansMeg commented 8 years ago

I now have this problem as a failing unit test. Thanks again!

hmalmedal commented 8 years ago

You have to use the unbox function from jsonlite on the list elements in question.

http://search.r-project.org/library/jsonlite/html/unbox.html
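A minimal sketch of the difference, assuming only jsonlite's toJSON() and unbox(). The list below is a hand-written stand-in for the top level of the test dataset, not rjstat's internal representation, and which fields to unbox is chosen here to match the original file shown above.

```r
library(jsonlite)

# Hand-written stand-in for the top level of the test dataset (assumption).
ds <- list(
  version = "2.0",
  class   = "dataset",
  label   = "A dataset",
  value   = 1,
  id      = "testdimension",
  size    = 1
)

# auto_unbox = TRUE turns every length-one vector into a JSON scalar,
# including value, id and size, which should stay arrays.
all_unboxed <- toJSON(ds, auto_unbox = TRUE)

# Selective unboxing: mark only the fields that must be scalars.
ds_sel <- ds
for (field in c("version", "class", "label")) {
  ds_sel[[field]] <- unbox(ds_sel[[field]])
}
selective <- toJSON(ds_sel)

cat(all_unboxed, "\n")
cat(selective, "\n")
```

With the selective version, value, id and size remain arrays even when they hold a single element, which is what the original file has.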

MansMeg commented 8 years ago

Great! I have now fixed the unboxing (a little tricky) by creating an unbox_jsonstat() function. This can be improved further using real inheritance; for now it is mostly a workaround.

hmalmedal commented 8 years ago

Good. Let me know when you think it is ready for merging.

MansMeg commented 8 years ago

Sure! I'll be working some more today. I'll let you know.

MansMeg commented 8 years ago

Alright @hmalmedal! Now you can merge the pull request. I won't be working on it much over the next couple of days.