adilyalcin / Keshif

Keshif - Data Made Explorable (Prototype)
https://keshif.me
BSD 3-Clause "New" or "Revised" License

Make processing json APIs/files easier. #3

Closed: adilyalcin closed 7 years ago

adilyalcin commented 10 years ago

Maybe there can be some common-sense automation?

chunw commented 10 years ago

This feature will be highly appreciated.

adilyalcin commented 10 years ago

Thanks for your interest, Chun. Do you have an example API in mind to use as a data source? Every API offers data in a different format, and different request techniques are required to generate a complete dataset for browsing.

I had implemented a demo based on Discogs, but I think there are currently issues with the public key and rate limiting (some artists have hundreds of releases, and each needs to be retrieved individually from the API). The basic solution for JSON API based datasets is to create data tables in memory and, once all data is loaded, use those custom data tables to create the item list and facets. You need to manage all the data loading on the client, since Keshif does not aim to offer JSON API flexibility.
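A minimal sketch of the "load everything first, then build" idea described above. `createTableCollector` is a hypothetical helper, not part of Keshif, and the sample records are made up; it only shows how rows from several API responses can be gathered in memory before anything is handed over for building the item list and facets.

```javascript
// Hypothetical helper (not a Keshif API): gathers rows from several API
// responses and calls onComplete once every expected response has arrived.
function createTableCollector(expectedResponses, onComplete) {
  const rows = [];
  let received = 0;
  return function addResponse(items) {
    for (const item of items) rows.push(item);
    received += 1;
    // Only hand the table over when the dataset is complete.
    if (received === expectedResponses) onComplete(rows);
  };
}

// Usage with made-up data: two responses, one callback at the end.
const addReleases = createTableCollector(2, (rows) => {
  console.log(`Loaded ${rows.length} releases`);
});
addReleases([{ id: 1, title: "Release A" }, { id: 2, title: "Release B" }]);
addReleases([{ id: 3, title: "Release C" }]);
```

In a real page, each call to the returned function would sit inside the success callback of one API request, so a rate-limited API can be queried at whatever pace it allows.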

chunw commented 10 years ago

I was looking for a way to handle JSON data generated on the fly. As you said, I can still customize Keshif to parse JSON and create item lists and facets from there.

fernoftheandes commented 10 years ago

Hi Adil, IMHO, providing an example where the input is JSON, say, using a very small sample of the Ben Bederson publications, would be a terrific way to minimize "barrier to entry".

kazpsp commented 10 years ago

I think you should abstract the Google Docs data loading and standardize the way Keshif wants to receive the data. Then custom data loaders that adapt to that standardized format can be created. This would make your plugin more flexible, I think.

fernoftheandes commented 10 years ago

I strongly agree with the previous comment...Keshif is such a nice piece of work and this input issue does not do it justice.

adilyalcin commented 10 years ago

Thanks for the comments, folks!

Data loading was mostly a simple procedure. I worked on it more and I have some cool (and hot) updates.

Here's the new demo, 100 hot posts from reddit (any redditers out there?):

http://adilyalcin.github.io/Keshif/demo/reddit_hot.html

It's based on a single JSON call to a public API. This simple example shows how you can load your source tables into the internal Keshif data structures (arrays of data items, one table per item type).

As for what happens internally when you load a Google Doc or a csv file, kshf.loadSource is the relevant entry point: https://github.com/adilyalcin/Keshif/blob/master/keshif.js#L246

The callback function you see is used for the reddit demo to manually create data tables from JSON.

loadSheet_Google and loadSheet_File are two functions that process different data sources accordingly. loadSheet_Memory is loosely based on the csv format, similar to the latter function. Just create your arrays, with the first row holding the column names, and store them in the data variable of the sheet description (we used that function for an internal demo earlier). With some debugging and trial and error, I'm sure you'll find the correct format.

Maybe I can create another demo where the source is a JavaScript array (not loaded from a file, Google Doc, or JSON API). Not very soon, though. Play with the Reddit demo in the meantime! (And make it load any posts from any subreddit with different settings (hot/new/...), or read them from some URL parameters and make it configurable by changing the link ;) )
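As a rough illustration of the "arrays with the first row holding column names" shape, here is a sketch that flattens a Reddit-style listing into such a table. `toTable` is a hypothetical helper and the sample listing is invented; the exact structure loadSheet_Memory expects is not spelled out in this thread, so treat the output format as an assumption to verify against the source.

```javascript
// Hypothetical helper (not a Keshif API): turns an array of JSON objects into
// a csv-like table, i.e. an array of rows whose first row is the column names.
function toTable(records, columns) {
  const header = columns.slice();
  const body = records.map((record) =>
    // Missing fields become null so every row has the same length.
    columns.map((column) => (column in record ? record[column] : null))
  );
  return [header, ...body];
}

// Reddit listings wrap each post as { kind, data }; this sample is made up.
const sampleListing = {
  data: {
    children: [
      { kind: "t3", data: { id: "a1", title: "First post", score: 10 } },
      { kind: "t3", data: { id: "b2", title: "Second post", score: 3 } },
    ],
  },
};

const posts = sampleListing.data.children.map((child) => child.data);
const postTable = toTable(posts, ["id", "title", "score"]);
console.log(postTable);
```

Listing the columns explicitly keeps the table rectangular even when some records lack a field, which is the main thing a csv-style loader would rely on.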

Another update: Keshif no longer requires jquery internally (except to load csv files through ajax if you want to use that). The only remaining core dependency is now d3!

I hope you like this update. If you do, tell me briefly how you are using keshif:) Shoot an e-mail to mygithubname at gmail.com [ it will help us understand more use cases:) ]

kazpsp commented 10 years ago

very nice ! i will take a look. thanks for considering the request. great job !

fernoftheandes commented 10 years ago

Indeed...very nice! Thanks Adil! I will be taking a look at that as well. We are starting to offer author-centric services in my organization and I wanted to be able to pull author+publication data from an xml repository, transform it to json and then feed it into Keshif. So, this will help me out with the POC before the executives.

FavazFarook commented 10 years ago

Hi Adil, the URL http://adilyalcin.github.io/Keshif/demo/reddit_hot.html does not work for me. Any idea why?

Fav

adilyalcin commented 10 years ago

There were some errors in that demo; they are fixed now. You can also specify a subreddit in the URL now through the #hash, like #r/funny and so on.
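For anyone wiring up something similar, a sketch of how a hash such as #r/funny could be mapped to a listing request. `redditUrlFromHash` is a hypothetical helper; how the demo itself parses the hash is not shown in this thread, and the `sort` and `limit` parameters are assumptions added for illustration.

```javascript
// Hypothetical helper (not taken from the demo): maps a location hash such as
// "#r/funny" to a Reddit JSON listing URL. Falls back to the front page when
// the hash does not name a subreddit.
function redditUrlFromHash(hash, sort = "hot", limit = 100) {
  const match = /^#?r\/([A-Za-z0-9_]+)$/.exec(hash || "");
  const path = match ? `r/${match[1]}/` : "";
  return `https://www.reddit.com/${path}${sort}.json?limit=${limit}`;
}

console.log(redditUrlFromHash("#r/funny"));
console.log(redditUrlFromHash(""));
```

In a browser, the first argument would be `window.location.hash`, so changing the link is enough to change the dataset.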

mryellow commented 10 years ago

Until finding this issue, it would be easy to assume that only Google Docs are supported, and even that only after looking at the source. Nothing in the readme hints at it being powered by Google Docs; I actually thought it was just a d3 lib.