hafen / htmlwidgetsgallery

discussion/feedback #1

Open timelyportfolio opened 9 years ago

timelyportfolio commented 9 years ago

This is great. Let me know how I can help. Only 9 for me? on week 26 :)

hafen commented 9 years ago

@timelyportfolio, you're on top of things! I pushed this out earlier today and didn't have time to comment on it.

I had mentioned this approach a while back, here https://github.com/ramnathv/htmlwidgets/issues/63, building on work by @jpmarindiaz, and finally found a few minutes to play around with the idea yesterday. And then of course as with any diversion I got sucked into trying to get something presentable together today :). Hopefully I'm not duplicating effort with anyone else right now, but I had some ideas and wanted to get things to a state where we can discuss. At this point, @timelyportfolio, @ramnathv, @jjallaire, @jcheng5, @jpmarindiaz, your feedback / help would be great.

Take a look here http://hafen.github.io/htmlwidgetsgallery to see what it's like.

The way I've worked it out here, to submit a widget, the idea would be to simply add the appropriate entry in _config.yml and add a screenshot and send a PR. Whoever the gallery curators are can merge and it's done. This should keep work of curators to a minimum while still allowing for quality control.

In no particular order, here are some things that could use some help and feedback. Would be good to deal with many of these before it's ready for the public:

If you have issues with this approach in general, let me know. If this approach is the way to go, here are some questions:

Finally, here are the meta data entries and what they are supposed to contain, as it is now:

What else is needed, or is some of this unnecessary? JS lib authors? Author's full name?

hafen commented 9 years ago

Oh and I should mention that the purpose of this particular site is a gallery of htmlwidget packages, for widget discovery, etc. Perhaps gallery isn't the right term. The idea of a gallery of plots, etc. created with htmlwidgets, that's a cool idea too, but not what I'm doing here.

jjallaire commented 9 years ago

Fantastic!!!!!

Some quick feedback:

1) I think we should make the thumbnail itself be a click target to navigate to the web page for the widget (the link we have now is too subtle). The extra metadata could be in a "More info" or other chevron style link in the footer.

2) The biggest concern I have about an unfiltered list of every widget we've ever heard of is that many, many widgets represent a few days work and then are abandoned. We need to find a way to distinguish between "production" widgets and ones "in the lab".

Some suggested attributes to look for to promote widgets to higher prominence: on CRAN, actively maintained (recent commits, resolved issues, etc.), pass R CMD check --as-cran, have a website documenting their use, handle dynamic sizing correctly, etc.

We could probably get away with just two categories ("labs" being for less complete work) whereby a few of us informally agree that if 3 of us give the thumbs up something can go into the "production" bin.

jpmarindiaz commented 9 years ago

@hafen Awesome!

Some comments:

Great work, happy to be a curator/maintainer

timelyportfolio commented 9 years ago

Perhaps this registry/gallery for biojs http://biojs.io/ will give us some ideas. @hafen, I believe it automatically pulls in forks/stars from GitHub.

ramnathv commented 9 years ago

This is looking great @hafen ! I would think about going beyond the gallery by adding the following components:

  1. A gallery of plots created using htmlwidgets with reproducible code.
  2. A blog that allows widget authors to write about their widgets.
  3. A google spreadsheet or similar mechanism where authors can report widgets they are working on.

I have some suggestions on the gallery itself and will post them shortly.

hafen commented 9 years ago

Thanks for the feedback. @jjallaire @jpmarindiaz, good idea about the production / lab categorization. I didn't realize that was the purpose of "status" so that makes sense now. Instead of status: stable/alpha, it should probably be production: true/false or incubator: true/false. The initial view of the page would omit the non-production widgets.

Metrics for giving prominence are great. I wonder if to capture all of these in an automated way we will have to move to something more customized than jekyll updating the page for us with each PR.

@jjallaire, I wonder if we could come up with a quantitative approach for what is required for moving to the production bin - if we can agree on some rules and spell it out, we can avoid having to deal with people thinking we are playing favorites or something.

ramnathv commented 9 years ago

@hafen I think being on CRAN and having a well documented website should be enough to start with. The simpler we keep the tagging guidelines, the easier it is for us to avoid controversies.

jjallaire commented 9 years ago

I agree. On CRAN + dedicated website with enough simple examples to get users started should be the criteria.

hafen commented 9 years ago

@ramnathv, adding a gallery of reproducible plots would be a great complementary addition - and would be a good piece of data to display as a link in the widget registry (how many examples the widget has and a link to them in the gallery). This might address @jpmarindiaz's issue with one thumbnail not being sufficient.

To do this, something along the lines of bl.ocks.org would be good. I recall @timelyportfolio has done some stuff with this with htmlwidgets before. We might even be able to use bl.ocks.org itself. In general, publishing a gist of R code along with all the html/js necessary to make the plot would be sufficient. An additional htmlwidgets R function (probably in a different package) to publish to such a gallery would make this very easy and make it easier for the gallery to be quickly populated.

By the way, putting something like this together would probably require an offline build system / server for the gallery.

hafen commented 9 years ago

@ramnathv @jjallaire how many htmlwidgets are currently on CRAN? I'd assume it's not a large number.

jcheng5 commented 9 years ago

@hafen Looks like 12 or 13 (reverse imports/suggests): http://cran.r-project.org/web/packages/htmlwidgets/index.html

timelyportfolio commented 9 years ago

Also, I think it would be nice to use some of the htmlwidgets to draw networks of htmlwidgets using meta-information such as JavaScript dependencies, co-authorship, tags that @hafen has included.

@hafen perhaps you are thinking about loryR http://www.buildingwidgets.com/blog/2015/5/14/week-19-loryr-slider for multi-image carousels or something similar as the htmlwidget thumbnail.

Selfishly I'm not crazy about the CRAN requirement, but I can't think of any other ways to filter/qualify. I currently view all htmlwidgets as experimental and would say few are production quality in the traditional sense (hopefully I'm not offending anyone).
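The network idea above can be sketched as a small edge-building step: two widgets are linked when they share a value in some list-valued metadata field. This is only an illustration; the record shape (`name`, `tags`) and the function name `sharedAttributeEdges` are assumptions, not the gallery's actual `_config.yml` schema, and the widget names are placeholders.

```javascript
// Build undirected edges between widgets that share at least one value in
// a list-valued metadata field (for example tags or JavaScript libraries).
// The record shape { name, tags } is hypothetical.
function sharedAttributeEdges(widgets, field) {
  const edges = [];
  for (let i = 0; i < widgets.length; i++) {
    const left = new Set(widgets[i][field] || []);
    for (let j = i + 1; j < widgets.length; j++) {
      // Values of the chosen field that both widgets have in common.
      const shared = (widgets[j][field] || []).filter((v) => left.has(v));
      if (shared.length > 0) {
        edges.push({ source: widgets[i].name, target: widgets[j].name, shared });
      }
    }
  }
  return edges;
}

// Placeholder records, not real gallery entries.
const exampleWidgets = [
  { name: "widgetA", tags: ["maps", "d3"] },
  { name: "widgetB", tags: ["d3"] },
  { name: "widgetC", tags: ["tables"] },
];
console.log(sharedAttributeEdges(exampleWidgets, "tags"));
```

The resulting edge list could then be handed to any network-drawing widget; which one is left open here.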

ramnathv commented 9 years ago

@hafen I had put together a bl.ocks equivalent for rCharts along with a custom publishing function that used a git backend. I could revive that in the htmlwidgets context. It is important to have a custom viewer, since our focus is the R code and not the index.html. I will send a PR to htmlwidgets that wraps up this feature.

jjallaire commented 9 years ago

I think the CRAN requirement is a helpful filter because it indicates commitment (of both time and ongoing support/enhancement) and quality (all the things required to pass R CMD check). We don't have anything else nearly as objective and cut and dried and clearly we need some criteria to avoid htmlwidgets gaining the reputation of being incomplete, buggy, etc.

ramnathv commented 9 years ago

I would agree with @jjallaire. While all htmlwidgets are at some level experimental, passing R CMD CHECK ensures that the package author has spent time taking care of some basic stuff at the very least.

timelyportfolio commented 9 years ago

Yes, R CMD CHECK helps filter out some junk, but it is fairly easy to pass R CMD CHECK and still be junk on the JavaScript side. Very little of the code/effort in many of my htmlwidgets is R. Perhaps we can come up with some checklist, such as this wiki page, A Good htmlwidget, that could help ensure quality.

ramnathv commented 9 years ago

I see your point @klr. R CMD CHECK only ensures that the R code works and has been documented appropriately. It is very well possible for the JS code to be buggy and I think we should be careful on that front.

One way to circumvent this issue is to only tag the gallery with facts like "On CRAN", "30+ Stars", etc. This way, we will let the end user be the judge of which widgets they want to use. While we could come up with a complicated checklist of what the widget should satisfy, I think it is very hard for us to systematically and objectively evaluate each widget against it.

Hence, I suggest that we restrict gallery tags to fact based ones and let the user be the judge. In my mind the gallery is mainly for widget discovery and shared code and NOT an endorsement of any of the widgets.
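A minimal sketch of what fact-based tagging could look like, assuming each widget record carries a boolean `cran` field and a numeric `stars` field. Both field names, the function name `factTags`, and the default threshold of 30 (taken from the "30+ Stars" example above) are illustrative assumptions, not the gallery's actual implementation.

```javascript
// Derive purely factual tags from a widget's metadata.
// Field names (cran, stars) and the threshold are hypothetical.
function factTags(meta, starThreshold = 30) {
  const tags = [];
  if (meta.cran === true) {
    tags.push("On CRAN");
  }
  if (typeof meta.stars === "number" && meta.stars >= starThreshold) {
    tags.push(`${starThreshold}+ Stars`);
  }
  return tags;
}

console.log(factTags({ cran: true, stars: 45 }));
console.log(factTags({ cran: false, stars: 3 }));
```

Because every tag is computed from a checkable fact, nobody has to make a judgment call about a widget's quality to label it.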

timelyportfolio commented 9 years ago

I much prefer @ramnathv's suggestion of a facts-based system ("On CRAN") rather than arbitrary, potentially opinionated labels such as production, experimental, dead. I believe much of the quality, attentiveness of the developer, functionality, etc. will shine through rather quickly in the gallery.

hafen commented 9 years ago

Fact based sounds good. But then perhaps to avoid htmlwidgets getting a reputation of being sloppy, there could be a default filtering when the page is viewed that shows the widgets with the "best" facts. An example of this is what I did with sorting by github stars by default. This is something I did not want to do, but otherwise some of the less complete widgets showed up at the top which isn't good.

If we go fact based, we could still do QC on the end of whether a package even makes it into the registry. I think at a minimum passing R CMD CHECK should be a requirement there (to make sure documentation is there, etc.).

jjallaire commented 9 years ago

Based on this discussion I think we should do the following:

1) Two categories, CRAN and "Under Development" (or whatever other term seems appropriate)

2) Order both categories by GitHub stars.

3) Whoever is running the registry reserves the right to exclude a widget from either category if it's really shoddy. Obviously this would only occur for situations of really poor quality widgets (whether on CRAN or not).

3 might be controversial, but is really implicit in any public list of widgets (whoever manages the list can include/exclude whatever they wish). As long as we explain to a widget author why they are excluded I think it can still be a process that is fair to all.

hafen commented 9 years ago

@ramnathv, your custom bl.ocks.org feature would be awesome. If you could point me to the branch for this feature, that would be great. To serve these, do you need a custom web server? Just thinking about how to integrate it with the widget gallery.

hafen commented 9 years ago

@jjallaire I just pushed a few changes that add a "CRAN only" switch that is turned on by default when the page is loaded, and also widgets are sorted by github stars by default as well. That should cover (1) and (2). For (3), it would be great if someone could take a pass right now and propose widgets that should not be there right now.

Also, I know I am missing several widgets. For example, I found pairsD3 and visNetwork on CRAN that aren't in _config.yml. Could someone please go through and add any widgets you think should be there?

I also updated the thumbnail and widget name links to point to the widget home page instead of opening up the detail. Now only clicking on the 3 vertical dots activates the detail view.
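The default view described above (a "CRAN only" switch that starts on, with widgets sorted by GitHub stars) can be sketched as a single filter-and-sort step. This is not the code from the commit; the field names `cran` and `stars` and the function name `visibleWidgets` are assumptions for illustration.

```javascript
// Return the widgets to display: optionally only those on CRAN,
// ordered by GitHub stars, highest first. Field names are hypothetical.
function visibleWidgets(widgets, cranOnly = true) {
  return widgets
    .filter((w) => !cranOnly || w.cran === true) // filter() returns a new array
    .sort((a, b) => (b.stars || 0) - (a.stars || 0));
}

// Placeholder records, not real gallery entries.
const sample = [
  { name: "widgetA", cran: true, stars: 5 },
  { name: "widgetB", cran: false, stars: 50 },
  { name: "widgetC", cran: true, stars: 20 },
];
console.log(visibleWidgets(sample).map((w) => w.name));
console.log(visibleWidgets(sample, false).map((w) => w.name));
```

Toggling the switch off simply re-runs the same function with `cranOnly` set to `false`.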

jjallaire commented 9 years ago

This is looking fantastic!

I don't have direct experience with rhandsontable or svgPanZoom. Does anyone else?

timelyportfolio commented 9 years ago

Yes, @hafen this is really nice. I'll volunteer to add a couple that I know are missing.

@jjallaire I have experience with both.

timelyportfolio commented 9 years ago

Also, what are thoughts on releasing/publicizing this? I would love for it to get all the attention it deserves plus more.

hafen commented 9 years ago

I'd love to release as soon as possible, but first do the following:

timelyportfolio commented 9 years ago

Would it not just go on http://htmlwidgets.org?

timelyportfolio commented 9 years ago

@hafen, would this link http://www.carsonshold.com/2014/05/github-metadata-with-jekyll-and-javascript/ help with the dynamic Github stars?

hafen commented 9 years ago

I'm all for putting it on htmlwidgets.org. Is that served with github pages? Or would you just do some DNS stuff to point htmlwidgets.org/gallery (or should we be using registry?) at it? Some style work with the header would have to be done to match. Wherever it goes, I'm completely happy with the idea of it being owned by someone else (e.g. the htmlwidgets team) and I would just contribute with PRs.

hafen commented 9 years ago

@timelyportfolio, thanks for the link on getting github metadata. I tried a JS approach and unfortunately those count as unauthenticated api requests from the client side and you get rate limited before the page finishes loading.

About the only other approach I can think of other than moving to a dedicated non-gh-pages server is to set up some REST endpoint on something cheap like digitalocean that periodically pulls the _config.yml, iterates through the packages, grabs the github meta data using a github auth token to avoid rate limiting, and makes these available via REST calls from the page. Then we can get stuff like # issues, # closed issues, etc. This can all be put together pretty easily with R. There's got to be a more simple way though. But I'd really like to have this meta data.

Perhaps @gaborcsardi has some ideas or experience here with his work on metacran.

ramnathv commented 9 years ago

@hafen AFAIK there is no trivial way to use the github api with authentication on the client side without exposing the secret and key. One solution is to periodically grab the metadata and cache it as json on the github pages site. This way, you can use it as a fallback when the API limit gets hit. This should give a near-real-time experience. I would be happy to prototype what I am talking about and push it. Let me know.
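The fallback idea above can be sketched as follows: try the live lookup first and, if it fails (for example because unauthenticated requests have been rate limited), fall back to the periodically cached JSON. Everything named here is hypothetical: `repoMeta`, the injected `fetchLive` function, and the shape of the `cache` object keyed by repository. The live call is injected so the sketch makes no assumption about how the page actually talks to GitHub.

```javascript
// Look up metadata for one repository, preferring live data and falling
// back to a cached snapshot when the live request fails.
// fetchLive: async function (repo) -> metadata object; it is expected to
//            reject when the API refuses the request. Hypothetical.
// cache:     plain object mapping "owner/repo" to cached metadata.
async function repoMeta(repo, fetchLive, cache) {
  try {
    const live = await fetchLive(repo);
    return { ...live, source: "live" };
  } catch (err) {
    if (cache[repo]) {
      return { ...cache[repo], source: "cache" };
    }
    throw err; // nothing cached either, so surface the original error
  }
}
```

The `source` field is only there so the page (or a test) can tell which path was taken.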

hafen commented 9 years ago

@ramnathv that's a great idea. I had started down this path, working on an R script to grab the meta data, but stopped at the first problem. I just committed a script I was working on to populate this: https://github.com/hafen/htmlwidgetsgallery/blob/gh-pages/scripts/github_meta.R. If you could prototype your idea that would be awesome.

jjallaire commented 9 years ago

Yes, htmlwidgets.org is indeed served with github pages. We could also serve it on any server/backend we like and use gallery.htmlwidgets.org (that's what we do with the Rcpp Gallery).

gaborcsardi commented 9 years ago

Wow, this looks great!

As for the GitHub API issue, I don't have much to add. Essentially you either do it "offline", independently of the client JS, and regularly push it to GH, or you can set up a simple proxy on digitalocean for $5 per month. The proxy is super simple, it would probably take me an hour to set up and write in node, and I am not a very proficient node programmer. :) You could cache things in redis, so that it is smoother, and you can make requests in parallel, at least from node, but probably also from the browser.

A slightly different approach is to make a server that holds the metadata in some DB, and periodically updates it. This is almost the same, but you also need some proper DB. The gain is that you never need to go to GH (slow), only to your server (fast).

Of course with the DO server there is a small maintenance cost, and maybe something like Heroku or Redhat Openshift, or any other PaaS is better. OpenShift is free for the first three small machines. (I am not affiliated with them in any way.)

Let me know if you need help with this.
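As a rough illustration of the caching part of such a proxy, here is a tiny in-memory cache with a time-to-live. It stands in for Redis purely for the sake of a self-contained example and is not taken from any MetaCran code; the name `makeTtlCache` and the injectable clock are assumptions.

```javascript
// Minimal in-memory cache with a time-to-live, standing in for Redis.
// now: function returning the current time in milliseconds; injectable
//      so expiry can be exercised without waiting.
function makeTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    get(key) {
      const hit = entries.get(key);
      // Treat missing and expired entries the same way.
      if (!hit || now() - hit.storedAt > ttlMs) {
        return undefined;
      }
      return hit.value;
    },
    set(key, value) {
      entries.set(key, { value, storedAt: now() });
    },
  };
}
```

A proxy would check `get` before contacting GitHub and call `set` after each successful upstream response, so repeated page loads within the TTL never reach the API.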

gaborcsardi commented 9 years ago

Btw. if you want to go the "offline updater" way, RedHat Openshift is also excellent for that. They have a simple CRON service, and you can run a script every minute. That's how MetaCran's crandb is updated.

hafen commented 9 years ago

@gaborcsardi thanks for the pointers. I had looked at digitalocean and heroku looking for something completely free, so I'll check out OpenShift for sure. Does your cron job push to a github repo or expose a database? I was thinking that in the bigger scheme of things a broader meta data repository for all github R packages would be something we could make use of here.

@ramnathv along these lines, I've pushed an R script that gets the meta data and a github_meta.json file. So we just need to update the page to read this file and populate the appropriate field. Let me know if you'd like to do that and if you can do it soon, otherwise I'll take a stab at it and may ask you to check my code. I think for now the plan will be for me to set up an automated periodic update of the json file.

ramnathv commented 9 years ago

@hafen This is looking good. I will be able to get to this only next week. So if you want it done before that, go ahead and take a crack and I can add any comments/feedback. If next week is good, let me know and I will do the needful.

I also think it makes sense to explore the route that @gaborcsardi is proposing since it is more automated and avoids the need to run any scripts locally.

hafen commented 9 years ago

@ramnathv thanks - I actually just updated it - was easier than I thought https://github.com/hafen/htmlwidgetsgallery/commit/3f9bb24b7c1324250c554d2270df0ba91902ccbd. The idea is not to run the script locally in an ad hoc manner but to set up a cron job on a server that runs the R script every hour or so and pushes the updated json. Does that sound reasonable? That's the only part remaining to do.
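The script in the repository is written in R; the sketch below only illustrates, in JavaScript, the shape of the step a periodic job would perform: reduce each repository's GitHub API response to a few counts and serialize the result as JSON. The function names and the output layout are assumptions. The input field names (`stargazers_count`, `forks_count`, `open_issues_count`) are those returned by GitHub's "get a repository" endpoint; note that `open_issues_count` also counts open pull requests.

```javascript
// Keep only the counts the gallery page would display.
function summarizeRepo(apiResponse) {
  return {
    stars: apiResponse.stargazers_count,
    forks: apiResponse.forks_count,
    open_issues: apiResponse.open_issues_count,
  };
}

// responsesByRepo: object mapping "owner/repo" to an already-fetched API
// response. Fetching (with an auth token) is deliberately left out.
function buildMetaJson(responsesByRepo) {
  const out = {};
  for (const [repo, response] of Object.entries(responsesByRepo)) {
    out[repo] = summarizeRepo(response);
  }
  return JSON.stringify(out, null, 2);
}
```

A cron job would fetch the responses, call `buildMetaJson`, write the string to the JSON file, and push the commit.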

ramnathv commented 9 years ago

Yes. That sounds reasonable @hafen.

gaborcsardi commented 9 years ago

@hafen My cron job pushes to a DB. You can set secret environment variable on openshift for the DB password. It also has a nice heroku-like git-based workflow. Here is a script: https://github.com/metacran/cron/blob/master/.openshift/cron/minutely/update-crandb.r

hafen commented 9 years ago

Cool! Thanks @gaborcsardi. What does your javascript look like where you are pulling from couchdb? Are you doing this in node or in the browser?

gaborcsardi commented 9 years ago

@hafen It is node. E.g. https://github.com/metacran/metacranweb/blob/master/lib/recent.js#L9 (Somewhat complicated by the caching I do in Redis.)

But couchdb has an HTTP API, so you can use it from any language without a client lib.

I should warn you that it is also extremely simple, queries are one round of map-reduce, and this sucks. If you want a proper DB, Mongo is a better choice probably. If you just want to use it as a "cache", that is fine.

hafen commented 9 years ago

Thanks @gaborcsardi.

Since at this point it's a simple problem and a small amount of data, I decided to take the approach of a CRON job running a script to update a json file in the repo every hour or so.

timelyportfolio commented 9 years ago

Thanks so much @hafen. Now that you have collected all of this we can use htmlwidgets to analyze it :)

http://bl.ocks.org/timelyportfolio/e591b7c5360633e136d7

hafen commented 9 years ago

@timelyportfolio - that's awesome!

I looked at the stars increase across all packages over the weekend after it kind of got out on twitter and it looks like a lot of stars were added in the course of a day or two. So it looks like it's already serving its purpose (assuming the increase was due to gallery views). Star increases were mainly for CRAN packages, probably due to the bias of only showing them on load.

hafen commented 9 years ago

As far as "release" is concerned, even though it's already out, I suppose the outstanding issues from what I listed before are:

For the second one, for now I think I'll leave it as is and remove the vertical dots that show the blank card where more meta data is supposed to be. Can revisit that later. But I'm happy to accept thoughts or PRs that deal with this issue.

For the first one, if people are happy with it on something like gallery.htmlwidgets.org, let me know what's needed to get the blessing to go forward with that. Probably good at a minimum for others to go through the yml and decide whether everything there is worthy of being there.

hafen commented 9 years ago

@timelyportfolio by the way, I like your use of bl.ocks.org - how did you get only the R script to show up? Since the infrastructure is already there, perhaps we should just use this for adding a plot gallery to the page. I suppose the only issue with this is I don't think bl.ocks.org can selectively show certain gists - it's all or nothing, right?

ramnathv commented 9 years ago

To show R code in bl.ocks, one will have to make it a part of the README.md. Note that the R code does not get syntax highlighted. I implemented a viewer for rCharts, which you can see here

http://rcharts.io/viewer

It includes syntax highlighting of R code along with a setup for disqus comments.

jjallaire commented 9 years ago

To get gallery.htmlwidgets.org up and running I think a few things are required:

1) You need to add the appropriate CNAME file to your gh-pages repo

2) I need to point the DNS entry of gallery.htmlwidgets.org to the appropriate github address

In case you haven't done this before here are the details: https://help.github.com/articles/setting-up-a-custom-domain-with-github-pages/

Once this is up and running I'll also add links from the main www.htmlwidgets.org pages to the Gallery.
