datacarpentry / archive-datacarpentry

Data Carpentry workshop materials

Splitting out the R lesson #139

Closed by fmichonneau 9 years ago

fmichonneau commented 9 years ago

Following @ethanwhite's notes, I tried to split out the R lesson.

Please check https://github.com/fmichonneau/R-intro-ecology and let me know if you think something is missing. I used:

git clone git@github.com:datacarpentry/datacarpentry.git R-intro-ecology
cd R-intro-ecology/
git filter-branch --prune-empty --subdirectory-filter lessons/R master

git cherry-pick --strategy=recursive -X theirs b433f2c4b444191df6b0b9382737b5f6eb3a1f40
git cherry-pick d2fe4756e20a44a814ce93d38271f4cd41828173 e23865af07f297ca0b01e968968eafebecc2e7e3 4ac75d15c7265a58f6d101a3bde3d50bf02b9924 b90e67ae94176b074916f9c10cc18e2deafb20ce c5dd7666f662b04251cee6ec4622286dee06b7c5 9aed23f920fe1abd31add5d34c76b8d1fee17ba5 dd1988ca2ff671150d1c4310b934f3c8284e3eb9 67f79c617006ee1452240fefdc3b203349cf3ca6 d9820636810b9630da26de24dddfab79a60107fc b7dbb4e983e06629f97f9949c7480634436dce55 6eedff03c209b56da83eeb3da62235c4ac3c2338 4338a548493411b92d945b829923b053141d743e 00e12d1431311825e44093c5d4677a6e21cf94d4 6182be915474187829dbb7b5e0719660671d442f e04f7e1936b82c9ba25bee37ea897db168036632 0b6ac9ec2421c0abd3639b0b84440ca14afa375c

ethanwhite commented 9 years ago

Thanks for taking the initiative to split this out @fmichonneau!

Can someone remind me of the history of this material? @tracykteal? I have this vague recollection that this was originally forked from the Software Carpentry material. If that is the case then it makes the history extraction more complicated because we should also be trying to recover the Software Carpentry history (e.g., see #119). If I'm wrong about this then this split should probably be in pretty good shape, though we may want to get the data commits cherry-picked in as well (or just plan on immediately switching over to the Portal Project Teaching Database).

ethanwhite commented 9 years ago

I did a little digging through the histories and this material does in fact fork from the Software Carpentry material. The good news is that it forks from a single isolated commit by @sarahsupp, https://github.com/swcarpentry/r-novice-inflammation/commit/2320a65d0f22331d3297425602d067ee101d53ff, which is the original addition of the material. So, this should be a fair bit simpler than extracting the history for the shell lesson.

@fmichonneau - did you want to give combining these two different sets of history a try? Off the top of my head we're probably looking at:

  1. Create an empty repo
  2. Create orphan branch with the SWC R lesson in it
  3. Cherry-pick the commit mentioned above into master
  4. Remove any files not in the initial DC commit [1]
  5. Create an orphan branch with the DC lesson in it
  6. Filter branch the DC lesson
  7. Merge the DC lesson into master
  8. Cherry-pick in the organizational files (CONTRIBUTING, LICENSE, etc.)

(Note: the odds that I've foreseen all of the complexities off the top of my head are low)

If that sounds manageable, great. If it sounds a bit complicated, just let me know and I'll be happy to tackle it.

[1] https://github.com/datacarpentry/datacarpentry/commit/c653f645ec08a34b6d45c630687825e550ddeddc

tracykteal commented 9 years ago

Thanks @ethanwhite. I do now remember that it was @sarahsupp who did this first set of lessons.

ethanwhite commented 9 years ago

Here's what I have so far:

mkdir R-ecology
cd R-ecology
git init

# SWC R lesson root
git remote add swcremote git@github.com:swcarpentry/r-novice-inflammation.git
git fetch swcremote
git cherry-pick 2320a65d0f22331d3297425602d067ee101d53ff

# Remove files and directories not in main directory of DC lesson
git rm -rf data/ 02-func-R.md  03-loops-R.md  04-cond-colors-R.md  05-testing-R.md  06-best_practices-R.md  07-knitr-R.md ggplot.pdf  guide.md  rblocks.R
git commit -m "Remove files and directories not used by Data Carpentry"

# DC R lesson
git remote add dcremote git@github.com:datacarpentry/datacarpentry.git
git fetch dcremote
git checkout dcremote/master
git checkout -b dcR
git filter-branch --prune-empty --subdirectory-filter lessons/R/ dcR
git reset --hard HEAD

echo "Merging SWC and DC material"
git checkout master
git merge dcR

For some reason the initial commit ends up as one of the most recent commits after the merge.

@wking - is there a way to end up with the history in a more logical order?
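
A possible explanation for the ordering, offered as a hedged sketch rather than a confirmed diagnosis: `git cherry-pick` keeps the original author date but stamps a new committer date, and `git log` sorts parallel lines of history by committer date by default. The script below reproduces the effect in a throwaway repository. It simulates the cherry-pick by setting the two dates by hand; the empty commits, dates, and commit messages are made up for illustration, and only the branch names `master` and `dcR` come from the commands above. It also passes `--allow-unrelated-histories`, which Git 2.9 and later require for a merge like this one.

```shell
set -eu

# Throwaway repository; nothing here touches a real checkout.
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Old author date, fresh committer date: what a cherry-picked commit looks like.
GIT_AUTHOR_DATE="2013-01-01 00:00:00 +0000" \
GIT_COMMITTER_DATE="2015-06-01 00:00:00 +0000" \
  git commit -q --allow-empty -m "SWC root (cherry-picked)"

# Unrelated second history whose dates were preserved (as filter-branch does).
git checkout -q --orphan dcR
GIT_AUTHOR_DATE="2014-01-01 00:00:00 +0000" \
GIT_COMMITTER_DATE="2014-01-01 00:00:00 +0000" \
  git commit -q --allow-empty -m "DC commit 1"
GIT_AUTHOR_DATE="2014-02-01 00:00:00 +0000" \
GIT_COMMITTER_DATE="2014-02-01 00:00:00 +0000" \
  git commit -q --allow-empty -m "DC commit 2"

# Merge the two histories (flag needed on Git >= 2.9).
git checkout -q master
GIT_AUTHOR_DATE="2015-06-02 00:00:00 +0000" \
GIT_COMMITTER_DATE="2015-06-02 00:00:00 +0000" \
  git merge -q --allow-unrelated-histories -m "Merge DC lesson" dcR

# Default order follows committer dates, so the "root" shows up near the top.
default_order=$(git log --format=%s)
# Author-date order puts the oldest-authored commit at the bottom.
author_order=$(git log --author-date-order --format=%s)

echo "default order:";     echo "$default_order"
echo "author-date order:"; echo "$author_order"
```

Note that `--author-date-order` only changes how the log is displayed; it does not rewrite the history itself.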

ethanwhite commented 9 years ago

After having spent time reading up on using rebase with --onto and --root I think I now have the history looking like it should.
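
The thread does not record the exact commands that were run, so the following is only a sketch of how `git rebase --onto <newbase> --root <branch>` can replay one unrelated history on top of another to get a linear result. It runs in a throwaway repository; the file names and commit messages are hypothetical, and the branch names `master` and `dcR` are borrowed from the earlier comment.

```shell
set -eu

# Throwaway repository; nothing here touches a real checkout.
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Stand-in for the single SWC commit on master.
echo "swc" > swc.txt
git add swc.txt
git commit -q -m "SWC root"

# Stand-in for the filtered DC history: an unrelated branch with its own root.
git checkout -q --orphan dcR
git rm -q -rf .
echo "one" > dc1.txt
git add dc1.txt
git commit -q -m "DC commit 1"
echo "two" > dc2.txt
git add dc2.txt
git commit -q -m "DC commit 2"

# Replay every commit on dcR, including its root, on top of master.
git rebase -q --onto master --root dcR

# master can now simply fast-forward; no merge commit is created.
git checkout -q master
git merge -q --ff-only dcR

linear_history=$(git log --reverse --format=%s)
echo "$linear_history"
```

Because the rebase creates new commits, the rewritten branch has different hashes from the originals, which is usually acceptable for a freshly split repository that nobody has cloned yet.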

@fmichonneau - can you take a look at https://github.com/datacarpentry/R-ecology when you get the chance and make sure everything looks OK? At that point we're done with this and with the full splitting out of the repos, and datacarpentry can be officially mothballed.

hlapp commented 9 years ago

This issue was moved to datacarpentry/test#2