OuhscBbmc / data-science-practices-1

Collection of publicly available practices of data science and analysis
https://ouhscbbmc.github.io/data-science-practices-1/
MIT License

Invitation to participate in 'book' #3

Open wibeasley opened 6 years ago

wibeasley commented 6 years ago

@DavidBard, @thomasnwilson, @genevamarshall, @mand9472, @arpeters, @caston, @athumann, @sbohora, @mhunter1, @yutiantang, @Cliff, @andkov, @higgi13425

I'm describing the practices created & followed by our data science projects. Most of you have been a part of these teams, and a subset might be interested in reading or adding to this documentation.

My goal is to make it easier to

  1. train new people as they join our team
  2. be consistent within our existing team
  3. communicate to outsiders what we do, including
    1. OUHSC IT's security reviewers
    2. grant reviewers assessing the quality of our team
    3. future publications of our methods (eg, the MIECHV dashboards for nurses)

The draft of chapters and sections is currently at this link. As the organization of chapters & files changes, this link may break. https://ouhscbbmc.github.io/data-science-practices-1/scratch-pad-of-loose-ideas.html#chapters-sections

The overall book is at https://ouhscbbmc.github.io/data-science-practices-1/

I've structured it as a book out of convenience. I'm not currently planning to publish it formally, but I guess we shouldn't rule it out. I need to think about it some more, but I'm leaning towards a Creative Commons Attribution-NonCommercial-ShareAlike License.

If there's a section or chapter you're interested in writing, you're welcome to any degree of authorship. There's not much to look at now. In the future, feel free to make small edits by clicking this button: [image]

geauxdojang commented 6 years ago

Great! Thanks.

DavidBard commented 6 years ago

Great idea, @wibeasley! I might add a chapter on data harmonization now that this is a major focus of our MIECHV 2 work. As you and I discussed recently, a chapter on this would benefit from a review of other harmonization approaches in addition to the one we have currently adopted. I've not seen much out there on the subject (haven't seriously looked for it, though), so this on its own could be a publishable product. Happy to help and follow your lead on that chapter.

andkov commented 6 years ago

@wibeasley, this is wonderful, thanks for including me! Here's what comes to mind as I skim the current (2018-10-24) TOC.

Chapter 8. I wonder if automation of graph production is too granular in this context. A recent practice I have developed formalizes any graph as a collection of functions: 1) preparing the data, 2) making the graph, 3) printing the graph, and 4) a convenience wrapper (see https://github.com/andkov/ipdln-2018-hackathon/blob/master/scripts/graphing/graph-factory-shell.R for an example of a template).
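To make the four-function idea concrete, here is a rough sketch in Python using only the standard library. It is not the linked R template; the function names (`prepare_data`, `make_graph`, `print_graph`, `graph_counts`), the text-based bar chart, and the `visits` records are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter


def prepare_data(records, category_key):
    """Step 1: reduce raw records to (category, count) pairs, sorted by category."""
    counts = Counter(record[category_key] for record in records)
    return sorted(counts.items())


def make_graph(pairs, title, bar_char="#"):
    """Step 2: build the graph object (here, simply a list of text lines)."""
    width = max((len(str(label)) for label, _ in pairs), default=0)
    lines = [title]
    for label, count in pairs:
        lines.append(f"{str(label):<{width}} | {bar_char * count} {count}")
    return lines


def print_graph(lines, out=None):
    """Step 3: render the graph to a stream (stdout when `out` is None)."""
    print("\n".join(lines), file=out)


def graph_counts(records, category_key, title, out=None):
    """Step 4: convenience wrapper that chains the three steps above."""
    pairs = prepare_data(records, category_key)
    lines = make_graph(pairs, title)
    print_graph(lines, out)
    return lines


# Hypothetical input data, invented for this sketch.
visits = [{"county": "Tulsa"}, {"county": "Oklahoma"}, {"county": "Tulsa"}]
graph_counts(visits, "county", "Visits by county")
```

Keeping the steps separate means each one can be tested or swapped on its own (for example, a different `make_graph` for a different chart type), while the wrapper covers the common case in one call.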

Chapter 11. I don't see any reference to charting the workflow and function dependencies. Charting the workflow may be the magic trick that makes a project more approachable. For example: https://github.com/IHACRU/suppress-for-release
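One lightweight way to chart a workflow is to record each step's upstream dependencies and emit Graphviz DOT text from them. The sketch below assumes Python 3.9+ (for `graphlib`); it is not taken from the linked repository, and the step names and the helpers `to_dot` and `run_order` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each step maps to the steps it depends on.
dependencies = {
    "clean": ["load"],
    "analyze": ["clean"],
    "report": ["analyze", "clean"],
}


def to_dot(deps, name="workflow"):
    """Return Graphviz DOT text with one edge per (upstream -> step) pair."""
    lines = [f"digraph {name} {{", "  rankdir=LR;"]
    for step in sorted(deps):
        for upstream in sorted(deps[step]):
            lines.append(f'  "{upstream}" -> "{step}";')
    lines.append("}")
    return "\n".join(lines)


def run_order(deps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


print(to_dot(dependencies))
print(run_order(dependencies))
```

The DOT text can be pasted into any Graphviz renderer to get a picture, and the same dictionary doubles as a check that the scripts have a valid run order.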