Closed by philippbayer 8 years ago
...the current time between adding a genotype and parsing it... ;)
We now have some code to generate nice graphs based on the data in openSNP. Should we just write a small cronable rake-task that gets the data and runs the R-scripts daily or so, in order to have updated statistics on the website?
We could even put that code into a Knitr document, so we can build a static web-site with some additional writing ("the average user enters X user-phenotypes, and has been here for X years/months")
Here's an example HTML: https://github.com/yihui/knitr-examples/blob/master/003-minimal.Rhtml
Better, an example Rmarkdown: https://raw.githubusercontent.com/yihui/knitr-examples/master/001-minimal.Rmd
Then there's Kumquat, a gem that allows Rails to render Rmarkdown via Knitr... I wonder whether that would be the best option? Then we wouldn't need the cron/rake task; the page would be regenerated on each request and would always be up to date.
Sure, we could go really overboard and generate the graph each time, but I think that's overkill. Because in that case each reload would trigger getting all genotypes/users to generate the raw data for the graph, right?
The cron solution would just run DB queries as often as we run the cron-job. And I think doing the graph once a day, when the latest DB dump is generated, would be more than fine?
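For illustration, the daily job could look roughly like the sketch below. It is only a sketch under assumptions: the helper names, the CSV columns, the script name and its two positional arguments are all made up here and not taken from the openSNP code. In the real app the rows would come from a DB query rather than plain hashes.

```ruby
require "csv"

# Hypothetical helper: write one row per genotype upload to a CSV file
# that an R script could read. `rows` is an array of hashes with the
# (assumed) keys :uploaded_on and :provider.
def export_genotype_rows(rows, csv_path)
  CSV.open(csv_path, "w") do |csv|
    csv << %w[uploaded_on provider]
    rows.each { |row| csv << [row[:uploaded_on], row[:provider]] }
  end
  csv_path
end

# Build the Rscript call as an argument array, so nothing goes through
# shell interpolation. The script path and argument order are assumptions.
def rscript_command(script, csv_path, png_path)
  ["Rscript", script, csv_path, png_path]
end

# Inside a rake task this could be used as:
#   export_genotype_rows(rows, "tmp/genotypes.csv")
#   system(*rscript_command("plot.R", "tmp/genotypes.csv", "public/stats.png"))
#
# and a crontab entry (task name is hypothetical) would run it once a day:
#   0 4 * * * cd /path/to/app && bundle exec rake stats:update
```

Since the web side only embeds the resulting PNG, a failed run just leaves the previous image in place.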
You're right, it would go against keeping it simple. Plus we'd have one more dependency which will break once the next Rails is out and at some point die completely. Cron is good!
Yes, just adding R + the required libraries could already be tricky at some point. But on the web front it would just be embedding a static image, so it can easily be replaced, and in the worst case it simply won't get updated.
Now I just need to figure out how we can add R + ggplot2 + reshape2 to the Docker image. Any recommendations on that, especially @tsujigiri? :smile:
I don't think we actually need to add it to the image. Why not run it wherever (e.g. its own image) and put the results on the storage box, where they are pulled from by the page?
Sure, that would work as well, but it has the problem that one of us has to remember to do it regularly? :wink:
We can still use cron.
Ah, OK. You mean having a separate image for running the viz scripts. Sorry, I got that wrong!
I had some success in getting it up and running with a premade R-docker image.
(Btw, is there some easy way to reference issues/requests from other repos?)
openSNP/snpr#102
sweet, thanks!
I started implementing this in a branch here. :smile: :sparkles:
As Samantha proposed on the mailing list, we should have a stats page showing # users, # genotypings (with percentage of each provider etc.), some graphs over time etc.
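The "percentage of each provider" number could be computed along these lines. This is a minimal sketch with a hypothetical function name and made-up sample provider labels; the real data would come from the genotypes table.

```ruby
# Count genotypings per provider and express each count as a percentage
# of the total, rounded to one decimal place.
# `providers` is a plain array with one provider label per genotyping.
def provider_shares(providers)
  total = providers.size
  return {} if total.zero?

  providers.tally.transform_values { |count| (100.0 * count / total).round(1) }
end

# Sample labels below are illustrative only.
shares = provider_shares(%w[23andme 23andme ancestry ftdna])
shares.each { |provider, percent| puts "#{provider}: #{percent}%" }
```

The number of users and genotypings would just be the corresponding row counts, so only the per-provider split needs any arithmetic.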