Open choldgraf opened 7 years ago
want to store this in the software WG google drive folder so that it syncs automatically? Or use an online service rather than freemind?
Do you know of a good one? I looked around quickly, but didn't see anything obvious.
@stefanv where is that text outline that we put together? Can you push that to the repo so that we can start iterating on that as a TOC?
The mindmap is the hierarchical version of that. I will re-map into text format.
ah yep, just saw that...I moved it into a google drive that I put in the BIDS fellows folder. That folder is here:
https://drive.google.com/drive/folders/0B8VZ4vaOYWZ3bGg5QzlBZWJZZVU?usp=sharing
Original:
*** Field Guide to DS
***** A quick guide to organizing computational biology projects
***** Structure
* Data organization (data formats & where to store, scripts vs
interactive, scripts go with data, versioning data,
intermediate data, etc.)
* Reproducibility (software, papers)
* Revision control
* Continuous integration
* Software documentation
* Truthful visualization (colormaps, elements of graphics,
misleading plots, etc.)
* Managing large datasets
* Choosing a language
* Communication
* Organizing a lab
* Managing computational science projects (for broader use)
* Data scaling challenges (in-memory, out-of-memory,
parallelization, clusters, etc.)
* Pre-publication
* Exploratory analysis (keeping track of what you try, learning
focused exploration, breaking up exploration into chunks, etc.)
* Open vs closed publishing
* Licensing
* Scoping a project (realistic expectation + time estimates)
* How to collaborate on GitHub / contribute to existing packages
(perhaps section on getting your feet wet)
* Resources for finding answers to questions; how long do you
keep trying before asking / looking elsewhere
* Resources: storing data, code, doing computations, public
clusters, etc.
* Sharing your work in public: figshare, open publication,
how to publish a dataset, how to publish software (licensing)
* How to re-use other people's work (licensing, forking,
contributing back, etc.)
* Scientific workflows
* Managing a project and working with people
* Data curation
* Software development
* Virtual environments
******* Broader principles
******* Reproducibility
******* Software
I think the mindmap is a good start for now actually...I think the next question is where are we missing information, and what will we prioritize for inclusion in a "v1" version of this.
Maybe we can take a week to add topics to this mindmap (the one in the gdrive) as we see fit, and then meet next week to make a "first cut" of topics to include?
ps I just set up a gitter for this project, tho we could use slack instead if you prefer. WDYT?
Shall we put the following in the doc, and then just iterate on it there?
- Overarching themes:
- Reproducibility
- Software
- Papers
- Provenance Tracking
- Topics
- Data Management & Organization
- Data Versioning
- Data backup & replication
- Data access
- Databases
- Online storage
- S3
- Data/computation scaling
- In / out-of-memory
- Parallelization
- Clusters
- Curation
- Cleaning
- Software
- Revision control
- Continuous integration
- Choosing the right language
- Licensing, re-use, and attribution
- Contributing to existing projects (see also GitHub collaboration)
- Virtual Environments
- Representation
- Visualization
- Elements of Graphics
- Misleading plots
- Colormaps
- Effective plots
- Communication
- Organizing a lab
- Online communications
- Managing computational science projects
- GitHub collaboration
- Working with people
- Publication
- Pre-publishing
- Open access
- Indexing, identification, DOI, ORCID, etc.
- Figshare and other sharing platforms
- Experimentation / research planning
- Tracking
- Learning focused collaboration
- Chunking work
- Scoping an entire project
- Finding help
- Where to find help
- How long to wait before asking / looking elsewhere
- Research Workflows
Re: slack/gitter, I am on both, although Slack notifications are more visible on Android.
Advantage to Gitter: others can join us from outside.
https://www.dropbox.com/s/oku7bhoucgvdpic/Field%20Guide%20to%20Data%20Science.mm?dl=0
This mindmap can be opened with freemind (`brew install freemind`).
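If it helps with re-mapping the mindmap into text format, here is a rough sketch of how the `.mm` file could be turned into an indented bullet outline. It assumes the usual FreeMind layout (a `<map>` root containing nested `<node TEXT="...">` elements); the function names and the tiny sample map are made up for illustration and are not taken from the actual file.

```python
import xml.etree.ElementTree as ET


def outline_lines(node, depth=0):
    """Yield one '- ' bullet per <node>, indented two spaces per level."""
    # Nodes that store rich text instead of a TEXT attribute come out empty.
    text = node.get("TEXT", "")
    yield "  " * depth + "- " + text
    for child in node.findall("node"):
        yield from outline_lines(child, depth + 1)


def mm_to_outline(mm_xml):
    """Convert FreeMind XML (as a string) into an indented text outline."""
    root = ET.fromstring(mm_xml)  # the <map> element
    lines = []
    for top in root.findall("node"):
        lines.extend(outline_lines(top))
    return "\n".join(lines)


# Hypothetical sample, only to show the shape of the input.
SAMPLE = """<map version="1.0.1">
<node TEXT="Field Guide to DS">
<node TEXT="Topics">
<node TEXT="Software"/>
</node>
</node>
</map>"""

print(mm_to_outline(SAMPLE))
```

For the real file, something like `mm_to_outline(open("Field Guide to Data Science.mm", encoding="utf-8").read())` should work, assuming the download keeps that filename.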