vega / datalib

JavaScript data utility library.
http://vega.github.io/datalib/
BSD 3-Clause "New" or "Revised" License

Groupby Nested Array #86

Closed xNok closed 7 years ago

xNok commented 7 years ago

I'm trying to find a data mining library in JavaScript that makes it easy to manipulate JSON files. This one was recommended to me on Stack Overflow. I have to say I like the Pandas-like syntax for manipulating and summarizing datasets. However, there is one use case I have failed to solve with this library.

In the initial JSON, given a book, you know the names of its authors. In the final JSON, given an author, you want to know which books they wrote.

Intuitively, I tried to group by author. But as a result, the data are grouped by each book's whole group of authors rather than by individual author.

Is this kind of transformation supported by Datalib? Is there a workaround?

[
  {
    "title": "",
    "author": [
      {
        "given": "",
        "family": "",
        "affiliation": []
      },
      {
        "given": "",
        "family": "",
        "affiliation": []
      }
    ]
  },
  {
    "title": "",
    "author": [
      {
        "given": "",
        "family": "",
        "affiliation": []
      },
      {
        "given": "",
        "family": "",
        "affiliation": []
      }
    ]
  }
]

Full example with input and desired output

jheer commented 7 years ago

Datalib is focused on "flat" tabular data, so the groupby functionality won't be of great help in this use case with nested array data.

However, standard JavaScript array functions can meet your needs. Here's an example:

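// `books` is the array of book objects from the question.
// Build an object that maps each author's family name to the books they wrote.
// Note: authors who share a family name end up under the same key.
var byAuthor = books.reduce(function(authors, book) {
  book.author.forEach(function(author) {
    var name = author.family;
    // Look up this author's list, creating it on first encounter.
    var booksByAuthor = authors[name] || (authors[name] = []);
    booksByAuthor.push(book);
  });
  return authors;
}, {});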
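
A possible variation on the same idea, sketched with made-up sample data: first flatten the nested structure into one row per (author, book) pair, then group those rows. The titles, names, and the helper functions `flattenByAuthor` and `titlesByAuthor` are hypothetical and not part of datalib. The key here combines family and given name, on the assumption that this pair identifies an author well enough for your data.

```javascript
// Hypothetical sample data in the shape shown in the question.
var books = [
  {
    title: "Book A",
    author: [
      { given: "Ada", family: "Smith", affiliation: [] },
      { given: "Bo", family: "Jones", affiliation: [] }
    ]
  },
  {
    title: "Book B",
    author: [{ given: "Ada", family: "Smith", affiliation: [] }]
  },
  {
    title: "Book C",
    author: [{ given: "Cy", family: "Smith", affiliation: [] }]
  }
];

// Step 1: flatten to one row per (author, book) pair.
function flattenByAuthor(books) {
  var rows = [];
  books.forEach(function(book) {
    // Tolerate books with a missing author list.
    (book.author || []).forEach(function(author) {
      rows.push({
        given: author.given,
        family: author.family,
        title: book.title
      });
    });
  });
  return rows;
}

// Step 2: group the flat rows by "family, given" and collect the titles.
function titlesByAuthor(rows) {
  return rows.reduce(function(index, row) {
    var key = row.family + ", " + row.given;
    (index[key] || (index[key] = [])).push(row.title);
    return index;
  }, {});
}

var rows = flattenByAuthor(books);
var index = titlesByAuthor(rows);
console.log(index);
```

The intermediate `rows` array is flat tabular data, so it is also the kind of input that a groupby over a single field is meant for, in case you prefer to do the second step with a library rather than `reduce`.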