TomKellyGenetics / studyGroup

Ideas for future SYSKA topics #4

Open TomKellyGenetics opened 9 years ago

TomKellyGenetics commented 9 years ago

From @TomKellyGenetics on September 1, 2015 23:18

Hi team, I thought this would be a good space to start planning ideas for SYSKA topics, to gauge interest in various things we could do and to find volunteers for topics we'd like to see. Many of these packages I don't use very much but would like to learn; maybe giving a presentation would be a good push to try them out? Of course, many of you are likely to be experienced with some of these and better equipped to present them, if you'd like to. While you're welcome to come up with your own topics for your presentations, I think it would be good to plan some in advance so people know what they're coming to, especially if we're eventually reaching out into the wider research community.

Topics I could do:
- Data handling in R: data.table, bigmem, data-wrangling (reshape, tidyr, custom code)
- Statistics in R: linear models (although many others here could do a better job)
- Installing and managing packages in R (CRAN, Bioconductor, GitHub, SourceForge)
- Mathematica / Wolfram Alpha

Topics I would like to see:
- R: ggplot(2), dplyr, and other Hadleyverse packages
- R projects (with Git/GitHub)
- LaTeX intro/recap (Authorea?)
- Using databases with R
- Using Unix tools to manipulate files (awk, sed, grep, cat, head, tail, etc.)
- Getting help: using Linux/R forums effectively
- Handling sequence data

Topics I'd recommend for new(ish) students:
- Accessing servers in the terminal: ssh (alias, keys), scp, rsync - backups and running code
- HPC in R or Python (local parallel, departmental servers, or NeSI)
- NeSI: submitting and managing jobs with Slurm

Copied from original issue: smilefreak/studyGroup#7

TomKellyGenetics commented 9 years ago

From @nickb- on September 1, 2015 23:29

Once my study life calms down, I'd be very happy to do something on R + databases. In particular, SQL Server 2016 will come with R built in, providing a great platform for in-database analytics. I should be able to get my hands on an early release of SQL Server 2016 and give it a whirl.

Nick

TomKellyGenetics commented 9 years ago

From @smilefreak on September 2, 2015 2:27

Hi Tom, Nick

All those suggestions sound great! As well as that, I think we should try some interactive visualisation things. People really like those, and some papers now actually provide you with a way to publish interactive plots in the online version of a paper.

For R, Shiny is the obvious choice. But lots more exist; for instance, in Python we have http://stanford.edu/~mwaskom/software/seaborn/. Could we get a cancer genetics heatmap working with that package? Once people are past the beginner stage in Python, it would be good to introduce them to the visualisation. Nothing will beat R in terms of proper statistics, but it is important to keep your options open, as R is not the best tool for every job.

Obviously, in expanding into a larger group, we should be thinking of things like teaching some C (everyone should know some basic C), GPU programming, MATLAB, and NLTK (more for the humanities here).

TomKellyGenetics commented 9 years ago

From @smilefreak on September 2, 2015 2:27

Also, I will make a label for discussion threads so that we know it is a discussion.

TomKellyGenetics commented 9 years ago

Yep, there should be plenty of interest in interactive graphics with R: shiny, ggvis, and d3.heatmap are all possible topics. Would take a bit more preparation though so maybe better to save these for a wider audience?

I could see if I can remember MATLAB, but it's probably better to collaborate with Physics on that one. I'd also be interested in Python or MATLAB visualisations for networks if we can get someone to run through that. A Circos demo would be cool too.

TomKellyGenetics commented 9 years ago

From @smilefreak on September 2, 2015 8:37

Circos, yeah, I forgot about that. I used it a little bit in my apple work, so maybe pencil that in for a later date. It will be good to learn it in enough detail to teach it; that will be my benefit.

TomKellyGenetics commented 9 years ago

From @methylnick on September 7, 2015 20:27

On the thread of visualisation, you should follow Martin Krzywinski's blog (he is the maker of Circos), if you haven't come across it already. The visualisation principles he preaches are very interesting and sound. http://mkweb.bcgsc.ca/

TomKellyGenetics commented 9 years ago

From @methylnick on September 10, 2015 3:52

One kind of SYSKA I would propose is on the regulatory constraints around genomic testing. I'm happy to provide a high-level overview based on experience from Melbourne and a little with my organisation. I can send a discussion paper along with it.

TomKellyGenetics commented 9 years ago

From @murraycadzow on October 8, 2015 20:36

Thinking of how to encourage people who might feel they don't have enough programming background to talk about software, these are some ideas that could be accessible:

How people use resources that already exist, such as:
- genome browsers like Ensembl or UCSC
- 1000 Genomes
- ENCODE
- various cancer datasets
- plant datasets