RevolutionAnalytics / ravro


union of primitive and complex types #7

Open skamkar opened 8 years ago

skamkar commented 8 years ago

Greetings,

First, many thanks for replying to the previous null issue.

To avoid this null usage, we've rewritten the schema and data to use unions of primitive (string, int, etc.) and complex (array, record) types. However, we're now running into the following error:

Error in sch$type : $ operator is invalid for atomic vectors

This appears to happen in the sapply call that parses the union type. Below are JSON/schema pairs for (1) an int/array case and (2) an int/record case. Any insight or help would be wonderful.

Thanks in advance, Sean

INT/ARRAY case

JSON:

{
  "dataset": 3,
  "app": {"array": ["423","jim","bob"]}
}

AVSC:

{
  "name": "employee_training_dataset",
  "type": "record",
  "fields": [
    {
      "name": "dataset",
      "type": "int"
    },
    {
      "name": "app",
      "type": ["int",
        {
          "type": "array",
          "items": "string"
        }
      ]
    }
  ]
}

INT/RECORD case

JSON:

{
  "dataset": 3,
  "app": {
    "app_record": {
      "id": 4,
      "address": "main"
    }
  }
}

AVSC:

{
  "name": "employee_training_dataset",
  "type": "record",
  "fields": [
    {
      "name": "dataset",
      "type": "int"
    },
    {
      "name": "app",
      "type": ["int",
        {
          "type": "record",
          "name": "app_record",
          "fields": [
            {
              "name": "id",
              "type": "int"
            },
            {
              "name": "address",
              "type": "string"
            }
          ]
        }
      ]
    }
  ]
}
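
Both schemas above mix two kinds of union branch: a bare string ("int") and a JSON object (the array or record definition). The reported error is consistent with code that reads a `type` member from every branch, which fails on the bare string. The following is a minimal Python sketch of telling the two apart, not ravro's actual code; the helper name `branch_name` is hypothetical, and namespaces are ignored for brevity.

```python
import json


def branch_name(branch):
    """Return the Avro type name of a single union branch (hypothetical helper)."""
    if isinstance(branch, str):
        # Bare name: a primitive such as "int", or a reference to a named type.
        return branch
    if isinstance(branch, dict):
        kind = branch["type"]
        # Named types (record, enum, fixed) are identified by their "name";
        # array and map are identified by the kind itself.
        if kind in ("record", "enum", "fixed"):
            return branch["name"]
        return kind
    raise TypeError(f"unsupported union branch: {branch!r}")


# The "app" field type from the INT/RECORD schema in this post.
app_type = json.loads("""
["int",
 {"type": "record", "name": "app_record",
  "fields": [{"name": "id", "type": "int"},
             {"name": "address", "type": "string"}]}]
""")

names = [branch_name(b) for b in app_type]
print(names)
```

Checking whether a branch is a string before looking up its `type` is the step that a plain member lookup on every branch skips.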

jamiefolson commented 8 years ago

In general, we focused on only a few key use cases for unions, so you'll run into issues with a union of a primitive type, like int, and a compound type, like a record or array. Nothing prevents this from being supported, but it's not a use case we targeted.
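
As an illustration of what supporting such a union would involve, here is a minimal Python sketch that matches a JSON-encoded union value to its branch. It assumes Avro's JSON encoding, in which a non-null union value is wrapped in a single-key object named after the branch type. The function names `union_branch_names` and `resolve_union` are hypothetical, namespaces are ignored, and none of this is ravro code.

```python
def union_branch_names(branches):
    """Name each union branch: bare strings as-is, named types by "name"."""
    names = []
    for b in branches:
        if isinstance(b, str):
            names.append(b)
        elif b["type"] in ("record", "enum", "fixed"):
            names.append(b["name"])
        else:
            names.append(b["type"])  # e.g. "array" or "map"
    return names


def resolve_union(branches, datum):
    """Return (branch_schema, unwrapped_value) for a JSON-encoded union value."""
    names = union_branch_names(branches)
    if datum is None:
        key, value = "null", None
    elif isinstance(datum, dict) and len(datum) == 1:
        (key, value), = datum.items()
    else:
        raise ValueError("union value must be null or a single-key object")
    if key not in names:
        raise ValueError(f"{key!r} is not a branch of this union: {names}")
    return branches[names.index(key)], value


# The "app" union and value from the INT/ARRAY case in the original post.
app_union = ["int", {"type": "array", "items": "string"}]
branch, value = resolve_union(app_union, {"array": ["423", "jim", "bob"]})
print(branch["type"], value)
```

Under the same assumption, an int value for this field would be written as {"int": 3} rather than a bare 3, and `resolve_union` would return the "int" branch for it.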

Jamie Olson

In reply to skamkar's post of Mon, Feb 1, 2016, 1:52 PM.