Closed sje30 closed 4 years ago
One quick global comment - we need to establish the scope of this paper (small data science scripts) early on. We cannot open with a flowery and misleading statement about reproducible workflows, because we don’t get anywhere near discussing scale, workflow execution, or more complex (real-world) pipelines that use multiple containers, might interact with terabytes of data, and run at scale with a cloud provider, an HPC job manager, or similar. We don’t want a reader who falls into that camp to roll their eyes because they intuit that the authors don’t have a clue about work at this scale.
Thanks @sje30 for the comments and pointers, and @vsoch for already pushing ideas further. I think there are some really good ideas for improvement here!
@vsoch How should we proceed? I'd suggest merging this as is, and then we can take on the concrete suggestions made: you could do the intro (#55), and I could handle what Stephen and you discussed (working on the "[SJE:...]" stuff), including the suggestion to sacrifice the "template" rule to make room for splitting up another long rule. Both of us can
Should we do this with HackMD's GitHub integration, so we can work simultaneously?
I agree to merge, and my preference would be to do my changes after you are finished. Don’t worry about time, I can be pretty speedy.
@sje30 @vsoch I added a few comments to document my changes, which are now in the master branch: https://github.com/nuest/ten-simple-rules-dockerfiles/pull/54#pullrequestreview-386358744
@vsoch Can you do the next iteration? IMO the structure improved thanks to you and Stephen discussing, and now we need to cut out some content to make the size manageable.
Let me know when you will do this - if you can only do it next week, then I'll try to resolve some of the open issues in the meantime, namely #16, #17, #18, and #22.
Sure! I can do another pass and open a PR today. Stay tuned!
hi daniel @nuest here are some edits!