american-art / ima

Indianapolis Museum of Art

aac-objects.json format not proper #14

Closed (mit2nil closed this issue 7 years ago)

mit2nil commented 7 years ago

Hi,

The aac-objects.json file is formatted in such a way that all objects are represented as a single JSON object, which creates a sparse table with thousands of columns and very few rows. This makes per-column modeling impossible; modeling would instead need to be done for every object separately.

Below is the current format of the data:

    {
      "count": XXX,
      "data": {
        "id1": { "propertyname1": propertyvalue1, "propertyname2": propertyvalue2, ... },
        "id2": { "propertyname1": propertyvalue1, "propertyname2": propertyvalue2, ... },
        ...
      }
    }
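As a rough illustration of why the keyed layout turns into one wide, sparse row, here is a minimal Python sketch. The `flatten` helper and the sample property names (`title`, `year`) are hypothetical, and the naive flattening is only an assumption about how a tabular tool might read nested objects; it is not a description of what Karma does internally.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts into a single row keyed by dotted column names."""
    row = {}
    for key, item in value.items():
        column = f"{prefix}.{key}" if prefix else key
        if isinstance(item, dict):
            # Every nested key becomes part of the column name.
            row.update(flatten(item, column))
        else:
            row[column] = item
    return row


# Hypothetical sample in the keyed layout described above.
keyed = {
    "count": 2,
    "data": {
        "id1": {"title": "Untitled", "year": 1950},
        "id2": {"title": "Landscape", "year": 1962},
    },
}

row = flatten(keyed)
# One row whose column count grows with (objects x properties).
print(len(row), sorted(row))
```

With two objects and two properties each, the single row already has five columns; with thousands of objects, each object id contributes its own set of columns.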

Even though it is a valid JSON file, it is not a valid tabular representation of the data. Using an array is the logical choice here. Below is the format used in the aac-actors.json file, which provides a proper tabular representation while remaining valid JSON:

    {
      "count": XXX,
      "data": [
        { "id": idvalue, "propertyname1": propertyvalue1, "propertyname2": propertyvalue2, ... },
        { "id": idvalue, "propertyname1": propertyvalue1, "propertyname2": propertyvalue2, ... },
        ...
      ]
    }

You can try to load the current and updated data files into Karma to check how they are formatted.

workergnome commented 7 years ago

@mit2nil, I think that's a pretty common pattern for JSON files to take, particularly ones with primary keys. Is this a limitation of Karma that we need to work around?

IllyaMoskvin commented 7 years ago

I thought it was pretty odd too, but I wanted to keep the previously submitted structure intact. I see that you guys uploaded a tabular representation of the data. Should this issue be closed then?

caknoblock commented 7 years ago

Yes, this format is legal, but not supported by Karma. We have written a script for this dataset that converts it into a JSON format that is supported.
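The conversion script itself is not shown in this thread. As a hedged sketch of what such a conversion could look like, the Python below turns the keyed layout into the array layout used by aac-actors.json. The function name `keyed_to_array`, the `id_field` parameter, and the sample property names are all hypothetical, and recomputing `count` from the number of records is an assumption about what that field means.

```python
import json


def keyed_to_array(document, id_field="id"):
    """Convert {"data": {id: {...}}} into {"data": [{"id": id, ...}]}."""
    records = []
    for object_id, properties in document["data"].items():
        record = {id_field: object_id}
        for name, value in properties.items():
            # setdefault keeps the key-derived id if a property is also named "id".
            record.setdefault(name, value)
        records.append(record)
    # Assumption: "count" is the number of objects, so it is recomputed here.
    return {"count": len(records), "data": records}


# Hypothetical sample in the keyed layout from the issue description.
example = {
    "count": 2,
    "data": {
        "id1": {"title": "Untitled", "year": 1950},
        "id2": {"title": "Landscape", "year": 1962},
    },
}

converted = keyed_to_array(example)
print(json.dumps(converted, indent=2))
```

Each object becomes one row with a shared set of property columns, which is the shape the array format in aac-actors.json already has.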