Development Research in Practice: The DIME Analytics Data Handbook. By Kristoffer Bjärkefur, Luíza Cardoso de Andrade, Benjamin Daniels, and Maria Jones
I am thankful for the opportunity to share our feedback as part of the final review (#476), and I appreciate the effort DIME is putting into disseminating these valuable guidelines and resources.
The chapter addresses a crucial part of a project's success: collaboration. Here are some ideas, especially coming from the angle of the Data Partnership. I'd be more than happy to collaborate.
It would be beneficial to have additional step-by-step examples of how to set up the many recommendations in the chapter. More can be found on the DIME Wiki, but the intended audience might find it helpful to have quick guides or more references to tutorials.
The book touches on a super important point when it comes to team communication and decisions. However, the section might need elaboration. Using tools like GitHub or Dropbox won't help much unless the team adopts an effective approach to project management, for example Agile, Agile-like, Scrum, or Kanban. Of course, GitHub does support amazing features like GitHub Projects that can dramatically improve the team's performance (and sanity). In a nutshell, what's important here is not the tool but the process.
Probably out of scope, but it would be great to have a section on cloud computing environments and resources, such as JupyterHub, Amazon SageMaker, or Google Colab.
Probably out of scope, but Python is an indispensable part of a modern analytics stack, and there are considerations that might be useful when using Python or, more specifically, when working on a data science project.
Probably out of scope, but the same goes for containerization with Docker.
Ideas
Absolute paths cause trouble. They almost guarantee that your code won't run on any computer other than yours. https://github.com/worldbank/dime-data-handbook/blob/ba0105d6a9a3f779abbb7026e723db8bdecaf792/chapters/2-collaboration.tex#L79
It is recommended to check the Bank's stance on Dropbox. Alternatively, the Bank supports OneDrive, which, besides being official and offering up to 5TB per account, ensures data classification (Official Only, Confidential, Strictly Confidential). https://github.com/worldbank/dime-data-handbook/blob/ba0105d6a9a3f779abbb7026e723db8bdecaf792/chapters/2-collaboration.tex#L98-L101
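To make the point about absolute paths concrete, here is a minimal sketch of one possible alternative, written in Python purely as an illustration. It locates the project folder at run time and builds every path relative to it. The helper names `project_root` and `data_path`, the `.git` marker, and the `data/raw/survey.csv` file are all hypothetical choices for this example, not something taken from the handbook.

```python
from pathlib import Path
from typing import Optional


def project_root(marker: str = ".git", start: Optional[Path] = None) -> Path:
    """Walk upward from `start` (default: current directory) until a folder
    containing `marker` is found, and return that folder.

    The `.git` marker is an assumption; any file or folder that only exists
    at the top of the project would work.
    """
    here = (start or Path.cwd()).resolve()
    for candidate in (here, *here.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found in {here} or its parents")


def data_path(*parts: str, root: Optional[Path] = None) -> Path:
    """Build a path under the project root instead of hardcoding
    something like C:/Users/me/Dropbox/project/..."""
    return (root or project_root()).joinpath(*parts)
```

A script would then call something like `data_path("data", "raw", "survey.csv")`, and the same line should resolve correctly on any teammate's machine as long as the project folder keeps its internal layout.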
The section on team communication and decisions discussed above: https://github.com/worldbank/dime-data-handbook/blob/ba0105d6a9a3f779abbb7026e723db8bdecaf792/chapters/2-collaboration.tex#L140-L187