hbz / oerworldmap

OER World Map
https://oerworldmap.org/

Implement CSV export #398

Closed philboeselager closed 8 years ago

philboeselager commented 9 years ago

task for #35

philboeselager commented 9 years ago

Starting implementation, some detail questions appear:

  1. Should all fields be exported, including @type and @context?
  2. Deep structures should be exported with full path column names, shouldn’t they?
  3. If "yes" on 2., a path separator is needed for deep structures. The more unusual this separator is, the lower the risk that it collides with a character in the exported data's property names. According to the widely accepted Google JSON naming conventions http://google-styleguide.googlecode.com/svn/trunk/jsoncstyleguide.xml?showone=Property_Name_Format#Property_Name_Format , special characters like the following (amongst others) should not be used in JSON property names: §%/*~#-|> Personally, I'd go for the intuitive >, or something similar (>>, -> etc.).

acka47 commented 9 years ago
  1. Should all fields be exported, including @type and @context?

I'd say @type is relevant, @context not for csv.

BTW, the W3C candidate recommendations for tabular data on the web might be of interest here...

philboeselager commented 9 years ago

A bigger problem results from having to retrofit columns onto previously exported entries. Given:

Person 1:

{
  "name" : [ {
    "@value" : "Annie Doe"
  } ]
}

... Person n:

{
  "name" : [ {
    "@value" : "John Mae"
  } ,
  {
    "@value" : "John Doe"
  } ]
}

Assuming we export full path column names, these could be: name>0>@value and name>1>@value. The longer name array of Person n would mean that all of the previously exported lines have to be retrofitted (i.e. given the additional column).

This could end up as a very badly scaling algorithm. My suggestion is to "scan" the complete export data before actually exporting it, so that missing / empty columns can immediately be "retrofitted" (by inserting a ;).
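The "scan first, then export" idea could look roughly like the following Python sketch. It is not the project's implementation: the function names `flatten` and `export_csv` are hypothetical, and the `>` path separator and `;` delimiter are simply the ones mentioned in this thread.

```python
import csv
import io

SEP = ">"  # path separator proposed in this thread (an assumption, not a decision)


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into a {full_path: scalar} mapping."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}{SEP}{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


def export_csv(records):
    """Two passes: collect every column first, then write all rows."""
    rows = [flatten(record) for record in records]

    # Pass 1: scan all rows and remember each column in first-seen order.
    columns = []
    seen = set()
    for row in rows:
        for column in row:
            if column not in seen:
                seen.add(column)
                columns.append(column)

    # Pass 2: write the rows; restval fills columns a row does not have.
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, delimiter=";",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

With the two persons above, this yields the header `name>0>@value;name>1>@value`, and the row for Person 1 ends in an empty cell (`Annie Doe;`), so no previously written line has to be touched again.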

literarymachine commented 9 years ago

I'd say @type is relevant, @context not for csv.

+1

literarymachine commented 9 years ago

Deep structures should be exported with full path column names, shouldn’t they?

Maybe we should go with something like this, then we would have a fixed number of columns:

ID, type, name, address, provides
{urn:uuid}, "Name A, Name B", "Some street, some locality, some country", {urn:uuid}
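A fixed-column export along these lines might be sketched as follows in Python. The JSON keys used here (`@id`, `@type`, `name`, `address`, `provides`) and the helper names `to_row` and `export_fixed` are assumptions for illustration only; the actual data model may differ.

```python
import csv
import io

COLUMNS = ["ID", "type", "name", "address", "provides"]


def to_row(resource):
    """Collapse multi-valued and nested fields into one cell each."""
    address = resource.get("address", {})  # assumed to be a flat object
    return {
        "ID": resource.get("@id", ""),
        "type": resource.get("@type", ""),
        # All names end up in one cell, separated by ", ".
        "name": ", ".join(n["@value"] for n in resource.get("name", [])),
        # Address parts are joined in the order they appear in the object.
        "address": ", ".join(str(v) for v in address.values()),
        # Nested objects are represented by their IDs only.
        "provides": ", ".join(p["@id"] for p in resource.get("provides", [])),
    }


def export_fixed(resources):
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for resource in resources:
        writer.writerow(to_row(resource))
    return out.getvalue()
```

Because the column set is known in advance, each row can be written as soon as it is read; cells that contain a comma are quoted by the `csv` module.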

philboeselager commented 9 years ago

Looks like there are 3 possible strategies:

1. Just boxing deep information into one column.

Advantages:

Disadvantages:

Example [1] :

{
  "authorOf" : [ {
    "name" : "My Book Nr. 1"
  },
  {
    "name" : "My Book Nr. 2",
    "mentions" : [ {
      "name" : "Mentioned Item 1"
    },
    {
      "name" : "Mentioned Item 2"
    } ]
  } ]
}
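One way to read strategy 1 is to serialise each nested value as a JSON string and put that string into a single cell. The Python sketch below assumes this reading; `box` and `export_boxed` are hypothetical names, and the column list is passed in by the caller.

```python
import csv
import io
import json


def box(value):
    """Serialise nested structures to compact JSON; leave scalars as they are."""
    if isinstance(value, (dict, list)):
        return json.dumps(value, ensure_ascii=False, separators=(",", ":"))
    return value


def export_boxed(records, columns):
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Each top-level field becomes exactly one cell.
        writer.writerow({key: box(value) for key, value in record.items()})
    return out.getvalue()
```

The number of columns stays fixed, and a consumer can recover the nested structure by parsing the cell as JSON again, at the cost of cells that are hard to read in a spreadsheet.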

2. Only export IDs of nested objects. Gather multiple IDs in one cell for arrays (as described by @literarymachine directly above).

Advantages:

Disadvantages:

It is to be defined here:

3. Export one column for each subfield, as described in my comment dated Oct. 7th, 10:36, above.

Advantages:

Disadvantages:

Though I would probably agree to go for Felix' suggestion, we should very precisely think about, what's the use case of data exports:

Let's just make sure we don't build anything that the community is not going to need.

literarymachine commented 9 years ago

(Exporting multiple types in one go makes no sense due to different columns per type.)

I was thinking about this too and would argue that we actually could add all columns and only populate those that are appropriate. At least for a first take. Or we could say that we only supply exports by type for now.
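The difference between the two options can be made concrete with a small Python sketch. The helper names `columns_for` and `split_by_type` are hypothetical, and the sample field names in the usage note are invented for illustration.

```python
from collections import defaultdict


def columns_for(records):
    """Union of all keys, in first-seen order."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    return columns


def split_by_type(records):
    """Group records by their @type so each type gets its own export."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get("@type", "")].append(record)
    return dict(groups)
```

For a Person with `name` and `email` and an Organization with `name` and `location`, `columns_for` over both gives one wide header in which each row leaves the other type's columns empty, while `columns_for` applied to each group from `split_by_type` gives a narrower header per type.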

we should very precisely think about, what's the use case of data exports

ping @trugwaldsaenger!

philboeselager commented 9 years ago

Talked to @trugwaldsaenger today. Since we don't know enough about the use cases yet, we tend towards offering several CSV export variants. So, having implemented a "variant 3", I would next try to set up a "variant 2" exporter.

literarymachine commented 9 years ago

Sounds good. But even better would be to specify some use cases for the exports, no?

philboeselager commented 9 years ago

For sure.

philboeselager commented 8 years ago

The two versions implemented so far are in the branches https://github.com/hbz/oerworldmap/tree/task/%23398_variant2 and https://github.com/hbz/oerworldmap/tree/task/%23398_variant3 . @literarymachine: I'm going to rename one of the classes and merge the two branches, OK?

literarymachine commented 8 years ago

I'm going to rename one of the classes and merge the two branches, OK?

Yes, please!