nnf-cbn / 2019-unconference

Organisation of the 2019 unconference

BOF: How to come up with computational best practices and drive their adoption #9

Open pvtodorov opened 5 years ago

pvtodorov commented 5 years ago

Scientific software users come from very diverse educational backgrounds. I suspect that the range in individuals’ experience will become even broader as more scientists are pushed to learn how to code by necessity. As a result, the code being written in a research setting spans a huge range from complex software systems to single scripts that analyze data.

In this session I would like to discuss how we can help develop a set of computational best practices within our centers which help us:

Many of these points are everyday practices for experienced software developers and many solutions exist in the context of the corporate world. However, the problem becomes far from trivial given the heterogeneity of code, tasks, and backgrounds we encounter in an academic setting.

I would like to connect and discuss not only how we can come up with best practices that are appropriate for our centers, but also how we can make sure users become invested and drive bottom-up adoption.

I want to share our experiences and discuss what has worked well in the past, what has been a struggle, and what we think could be improved going forward. Ultimately, I would like to see us develop and drive the adoption of computational best practices within our centers.

borisevichdi commented 5 years ago

Hi Peter! I totally agree with you and would like to support this session.

Most, if not all, of the problems we encounter today in scientific development have been solved in the software development industry for 5-20 years. Software development and data analysis have never been easy tasks, and still, software giants do not rewrite their code from scratch every time the CSO or CTO changes. So we can probably learn something from them.

I'd be interested in joining, sharing my experience, and listening to the experience of others, with a focus on:

This topic also resonates with #5, to some extent.

ukirik commented 5 years ago

Relevant and interesting topic on StackExchange: https://academia.stackexchange.com/q/17781/5674

pvtodorov commented 5 years ago

Thanks @borisevichdi @ukirik @pdworzynski for engaging, offering input, and some resources.

Looking at this, I see there is significant overlap with some other session suggestions. I am very interested in discussing all of these in the context of the human/incentive challenges, as @ukirik pointed out on StackExchange. (Although my recent view is a bit more optimistic than that of the commenters in that thread :) ) It appears training, incentives, and culture are the most significant points of friction. I'd like to discuss these with regard to our respective centers, whether there are any efforts to push towards improvements, and how we can be part of that.

@borisevichdi I like the idea of organized trainings and interventions! I'm interested in your vision for continuous integration for data/bioinformatics, as there are significant departures from software development.

pvtodorov commented 5 years ago

Session discussion and outcomes

Midnighter commented 5 years ago

I didn't get to join the session, but I wanted to mention here the possibility of setting up a Jupyter notebook gallery, which could be a way to share, advertise, and explore the more one-off kind of analysis. The gallery tool also allows a forking and merging workflow, either to adapt work for your own use or to improve the exhibited notebook in a meaningful way.

I learnt about it at a meetup, and apparently they used it at Novozymes with some success.

pvtodorov commented 5 years ago

@Midnighter sounds like a good idea. Is it something we can implement under the nnf-cbn org that's holding the unconference repo since we already have a lot of people here? Perhaps a separate repo for notebooks?

Midnighter commented 5 years ago

I haven't set one up myself yet, but it runs a Ruby on Rails app as well as an Apache Solr instance. So it needs some server-side work which can't be hosted here on GitHub, I'm afraid. They do provide Docker images, though, so it should be straightforward to host almost anywhere. I think the harder part will be to advertise it at the centres and keep people engaged.