openSNP / snpr

The sources of the openSNP website
http://opensnp.org
MIT License

Add stats page #102

Closed philippbayer closed 8 years ago

philippbayer commented 10 years ago

As Samantha proposed on the mailing list, we should have a stats page showing the number of users, the number of genotypings (with the percentage from each provider, etc.), some graphs over time, etc.
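A minimal sketch of the per-provider percentage mentioned above, in plain Ruby. The method name `provider_shares` and the input shape (one provider name per genotyping) are assumptions for illustration, not code from the snpr repository.

```ruby
# Sketch: percentage of genotypings contributed by each provider.
# `providers` is a hypothetical list holding one provider name per genotyping.
def provider_shares(providers)
  total = providers.size
  return {} if total.zero?

  # Count genotypings per provider.
  counts = Hash.new(0)
  providers.each { |name| counts[name] += 1 }

  # Convert counts to percentages, rounded to one decimal place.
  counts.each_with_object({}) do |(name, n), shares|
    shares[name] = (100.0 * n / total).round(1)
  end
end
```

Called with made-up sample names such as `provider_shares(%w[provider_a provider_a provider_b])`, it returns a hash mapping each provider to its share of the total.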

tsujigiri commented 9 years ago

...the current time between adding a genotype and parsing it... ;)

gedankenstuecke commented 8 years ago

We now have some code to generate nice graphs based on the data in openSNP. Should we just write a small cronable rake-task that gets the data and runs the R-scripts daily or so, in order to have updated statistics on the website?
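A rough sketch of what the body of such a task could look like, assuming the stats are first dumped to a CSV file that the R script then reads. The column names, the helper names `write_stats_csv` and `rscript_command`, and the script name `plot_stats.R` are all hypothetical.

```ruby
# Sketch: write daily counts to a CSV file that a plotting script can read.
# `rows` is a hypothetical array of hashes with :date, :users and :genotypes.
def write_stats_csv(path, rows)
  lines = ["date,users,genotypes"]
  rows.each do |row|
    lines << [row[:date], row[:users], row[:genotypes]].join(",")
  end
  File.write(path, lines.join("\n") + "\n")
  path
end

# Build the command that would run the R script on the CSV.
# Passing an array to `system` avoids shell quoting problems.
def rscript_command(script, csv_path, png_path)
  ["Rscript", script, csv_path, png_path]
end
```

Inside a rake task, the two steps could be chained as `system(*rscript_command("plot_stats.R", csv, png))` after writing the CSV, and a daily cron entry could invoke that task.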

philippbayer commented 8 years ago

We could even put that code into a Knitr document, so we can build a static web-site with some additional writing ("the average user enters X user-phenotypes, and has been here for X years/months")

Here's an example HTML: https://github.com/yihui/knitr-examples/blob/master/003-minimal.Rhtml

Better, an example Rmarkdown: https://raw.githubusercontent.com/yihui/knitr-examples/master/001-minimal.Rmd

Then there's Kumquat, a gem that allows Rails to render Rmarkdown via Knitr... I wonder whether that would be the best option? Then we wouldn't need the cron/rake task; the page would be regenerated on each request and would always be up-to-date.

gedankenstuecke commented 8 years ago

Sure, we could go really overboard and generate the graph each time, but I think that's overkill. Because in that case each reload would trigger getting all genotypes/users to generate the raw data for the graph, right?

The cron solution would just run DB queries as often as we run the cron-job. And I think doing the graph once a day, when the latest DB dump is generated, would be more than fine?

philippbayer commented 8 years ago

You're right, it would go against keeping it simple. Plus we'd have one more dependency that'll break once the next Rails is out and at some point die completely. Cron is good!

gedankenstuecke commented 8 years ago

Yes, already adding R + the required libraries could be tricky at some point. But on the web-front it would be just embedding an image which is static, so it can be easily replaced and in the worst case it will not get updated.
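One way to get the "worst case it will not get updated" behaviour is to swap in the new image only when the plotting step actually produced one. This is a sketch under that assumption; `publish_image` and both paths are hypothetical names.

```ruby
require "fileutils"

# Sketch: replace the live image only if a fresh, non-empty one exists.
# If the plotting step failed, the old image simply stays in place.
def publish_image(new_path, live_path)
  return false unless File.exist?(new_path) && File.size(new_path) > 0

  FileUtils.mv(new_path, live_path)
  true
end
```

The page keeps embedding the same static file name, so a failed run leaves yesterday's graph visible instead of a broken image.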

Now I just need to figure out how we can add R + ggplot2 + reshape2 to the Docker image. Any recommendations on that, especially @tsujigiri? :smile:

tsujigiri commented 8 years ago

I don't think we actually need to add it to the image. Why not run it wherever (e.g. its own image) and put the results on the storage box, where the page pulls them from?

gedankenstuecke commented 8 years ago

Sure, that would work as well, but has the problem that one of us has to remember to do it regularly? :wink:

tsujigiri commented 8 years ago

We can still use cron.

gedankenstuecke commented 8 years ago

Ah, OK. You mean having a separate image for running the viz scripts. Sorry, I got that wrong!

gedankenstuecke commented 8 years ago

I had some success in getting it up and running with a premade R-docker image.

(btw. is there some easy way to reference issues/requests from other repos?)

tsujigiri commented 8 years ago

openSNP/snpr#102

gedankenstuecke commented 8 years ago

sweet, thanks!

gedankenstuecke commented 8 years ago

I started implementing this in a branch here. :smile: :sparkles: