lzim / teampsd

Team PSD is using GitHub, R and RMarkdown as part of our free and open science workflow.
GNU General Public License v3.0

August Task: Git/GH Desktop & Github workflow map MVP #1425

Closed lijenn closed 4 years ago

lijenn commented 4 years ago

Repo Hypothesis: Repos are organized to
  a) reduce total number of repos
  b) based on whether they share a CI workflow
  c) account for compliance issues
  d) integrate w/ZenHub


Final List of 7 Repositories:


  1. lzim/teampsd (Public) #1436
  2. lzim/mtl (Public) #1422
  3. "va/mtl" (GH Enterprise) (operations pipeline)
  4. lzim/research (formerly lzim/mtl_code) (Private) #1424
  5. "va/research" (GH_Enterprise) (research pipeline)
  6. lzim/team_tracker (Private)

 [August_epic]
  7. lzim/sim #1430

Platforms to think about:

#1423 Team PSD 2.0 organization flow map (HQ)

https://app.lucidchart.com/documents/edit/d2399f12-a16c-439d-9d3d-7e7410633ea5/0_0


What environment variables and folder structure will we use?

Proposed Environment Variables & Folders

  1. readme.md - will be the main page you see on the repo, explaining what the repo contains and how it works
  2. .gitignore - for temp files you want to ignore
  3. license.md - outlines what can and cannot be done with our code (there are different license templates on GitHub that help you figure out what kind you want)
  4. code-of-conduct.md - similar to what James created for the lzim/mtl repo CoC under wiki (outlines how interaction should be conducted i.e. no trolling etc.)
  5. fonts folder - for teampsd look and feel
  6. css folder - for teampsd colors
  7. images folder- for teampsd logos
  8. .yml config file - sets parameters and initial settings in environment for running an app
  9. docs folder - explains how actions work?
  10. tests folder - folder of test scripts within the scripts folder
  11. scripts folder - tools
  12. packages folder - multiple tools brought together (scripts build the package)
  13. vignettes - as a subset nested under packages
  14. .github folder - configures issue and pull request templates; include an example of a “good first issue”
  15. .github/workflows folder - needed for GH Actions
  16. secrets - encrypted environment variables, sealed with a libsodium sealed box so that they are encrypted before reaching GitHub (note: for a repository secret, only the repository owner can create this)
  17. gh-pages branch - GH Actions for Bookdown Manuals
  18. api
  19. web - probably only for mtl.how repo
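
A minimal sketch of how the proposed skeleton could be scaffolded for a new repo, assuming Python is available. The nesting (tests inside scripts, vignettes inside packages) follows the list above; the `.github/ISSUE_TEMPLATE` and `.github/workflows` paths are the locations GitHub itself reads, and the `.gitkeep` placeholder files are an assumption added only because Git does not track empty folders.

```python
from pathlib import Path
import tempfile

# Names taken from the proposed list above; adjust per repo as needed.
TOP_LEVEL_FILES = ["readme.md", ".gitignore", "license.md", "code-of-conduct.md"]
FOLDERS = [
    "fonts",
    "css",
    "images",
    "docs",
    "scripts/tests",           # tests live inside the scripts folder
    "packages/vignettes",      # vignettes are nested under packages
    ".github/ISSUE_TEMPLATE",  # issue templates
    ".github/workflows",       # GitHub Actions workflow files
]


def scaffold(root: Path) -> list:
    """Create the skeleton under root and return the relative paths created."""
    created = []
    for name in TOP_LEVEL_FILES:
        (root / name).touch()
        created.append(name)
    for folder in FOLDERS:
        directory = root / folder
        directory.mkdir(parents=True, exist_ok=True)
        # Assumption: a placeholder file keeps the empty folder under version control.
        (directory / ".gitkeep").touch()
        created.append(folder + "/")
    return sorted(created)


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for entry in scaffold(Path(tmp)):
            print(entry)
```

Running it against a temporary directory first makes it easy to review the layout before applying it to a real repo.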
anthonycpichardo commented 4 years ago

All Repos would follow the same Github Flow, style guides, conventions, and be part of the "Team PSD" (Name subject to change) Organization as opposed to "lzim".

Proposed Repo Design:

| GH Enterprise Repos | Github.com Repos |
| --- | --- |
| Modeling To Learn - This would store the operations code that we use to support teams | MTL - This repo would support Modeling to Learn and contain the content in describe_learners, detail facilitators, and SIM/Model Code |
| Team PSD Research - This would hold the backend code that needs to stay internal for the (R01, IIR, R21) | Team PSD Research (private) - This would hold the analysis for the (R01, IIR, R21) and disseminate science |
| | TeamPSD - This repo would contain the content and track the work that the team uses internally i.e. Bookdown, document_team, depend_products |
| | team_tracker (private) - VA Clinical & Field Work |

Rationale/Hypothesis: The reasoning for adding additional repositories is to separate the repository into distinct chunks that I believe make navigating projects, knowing where to contribute, and managing content simpler and more effective for users.

@lzim @staceypark @anazariz @dkngenda @jamesmrollins @lijenn @dlkibbe Let me know what people think about this proposed organization and repository structures. Any issues that may arise if it's split apart in this way or potential room for error.

jamesmrollins commented 4 years ago

@anthonycpichardo I think this is a great idea! In fact, we (DEV and I) were going to propose a Sim UI repo design next week that would help us develop an industry-standard CI/CD pipeline for Epicenter. This would build on the progress we've made with automating the Vensim conversion and update process.

lzim commented 4 years ago

Cross-ref Teams > design_teampsd_2.0 Channel @lijenn @staceypark @anthonycpichardo @anazariz @dkngenda @jamesmrollins

The Teams history is informative

5/21 - "teampsd2.0_week3_mvp_hypotheses started"

PI User Summary of the r21_r01_iir_procedures Prototype @staceypark: Assumption that .md files live on mtl_code within each study branch, within each procedure of that study, alongside an R Notebook with an .Rproj environment set up with Git for version control on the repo.

Next Steps and Key Assumptions: R Notebooks integrate documentation with SQL and R code, and would fit within our GitHub workflow using Git version control. See the "way forward" posted on "Quant Design Meeting started", which Anthony will post.


@staceypark is actually moving .md from one place to another (OSF to R), so I don't think she should continue.

The key there is that she was trying to set it up with the mtl_code repo. So, separate from VINCI the next "most root" thing is GitHub, Git (version control), and how that will work for research. Much like @lijenn's vignette, it would be great for all our vignettes on either team_psd or mtl_code to be set up on GitHub in a standard way that makes them easy to use across the team.

3 & 4: R21 are the TOP priorities so every step of prototyping should help us get to the bottom of the R21 code base using distributed GitHub code review that integrates within existing Team PSD GH workflow and helps maintain version control (Git).

jamesmrollins commented 4 years ago

@anthonycpichardo @anazariz @lijenn @staceypark @branscombj as promised. Below is our proposal for the Sim UI CI/CD pipeline. FYSA @hirenp-waferwire

1. Current Process

Currently, Team PSD uses GitHub as the version control system for all source code, documents, etc. We use only one branch, the “Master” branch, to manage versioning of all kinds of documents and code.

The diagram below shows the existing setup of the repository.

image

2. Proposed Sim UI CI/CD Process

Given the current setup of the MTL project and the new requirement to automate deployment work on different Epicenter projects, we should change the current repository setup.

Following is the proposed Sim UI repository setup.

image

  1. Repository: We should create a separate, new repository to manage source code versioning for each environment.
  2. Branches: We should create separate protected branches for each environment.
    a. Dev/QA Branch: This branch will be used for QA at the developer end before releasing changes for User Acceptance Test (UAT).
    b. Stage/UAT Branch: This branch will be used for user acceptance testing (other workgroup lead review). The client will verify changes before they are pushed to the production environment.
    c. Production/Master: This branch will have the final version of the code, which will be available on the production environment.

CD auto code deployment. After completing review process, the branch owner of the respective branch will push the update to the next level branch (i.e., DEV/QA to Stage/UAT). The push action will trigger an automatic code deployment process and deploy the code in the respective Epicenter project. In the case of a model file, the model will be converted to machine readable binary and deployed via SFTP to support running in Epicenter.
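
As a rough illustration of the promotion rule described above, here is a minimal sketch. The branch names (`dev`, `stage`, `master`) and the Epicenter project names are hypothetical placeholders, not names from the proposal, and the actual deployment step (for example, the SFTP upload) is left out.

```python
from typing import Optional

# Promotion order from the proposal: Dev/QA -> Stage/UAT -> Production/Master.
# Branch and project names below are hypothetical placeholders.
PROMOTION_ORDER = ["dev", "stage", "master"]
DEPLOY_TARGET = {
    "dev": "epicenter-dev-project",
    "stage": "epicenter-uat-project",
    "master": "epicenter-production-project",
}


def next_branch(branch: str) -> Optional[str]:
    """Return the branch that changes are promoted to, or None for production."""
    index = PROMOTION_ORDER.index(branch)  # raises ValueError for unknown branches
    if index + 1 < len(PROMOTION_ORDER):
        return PROMOTION_ORDER[index + 1]
    return None


def can_promote(source: str, target: str, review_approved: bool) -> bool:
    """Allow a push only to the immediate next branch, and only after review."""
    return review_approved and next_branch(source) == target


def deploy_target(branch: str) -> str:
    """Return the Epicenter project that a push to this branch would deploy to."""
    return DEPLOY_TARGET[branch]
```

For example, `can_promote("dev", "stage", True)` is allowed, while skipping straight from `dev` to `master` or pushing without an approved review is not.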

Advantages.

  1. We will have a separate code base (version) for each environment, i.e. Dev, Stage and Production.
  2. All code will be public facing, allowing for contributions.
  3. We can have a mandatory review and approval process before code changes pass to the next branch, i.e. Dev->Stage->Production
  4. Users will not need to manually put files into any of the Epicenter projects.
dlkibbe commented 4 years ago

Hi @jamesmrollins - a couple of quick questions since I've been out of the loop with dad:

  1. CI/CD Process - please define CI/CD for me
  2. relative to your comment under CD auto code deployment, "the branch owner of the respective branch..." -- can you explain scenarios in which there will be different "branch owners"?
jamesmrollins commented 4 years ago

Hi @dlkibbe .

  1. CI/CD means continuous integration and continuous deployment (or delivery). Continuous integration means that development work such as code changes is merged into a shared mainline several times per day. Continuous deployment means the code is tested, verified, and deployed to its point of use as a continuous process.
  2. The branch owner would be the person who has authority to push the code by merging the branches. Likely, either the platform manager, or me, in the case of the Sim UI repository.
anthonycpichardo commented 4 years ago

@dlkibbe From SAFe:

Continuous Integration (CI) focuses on taking features from the Program backlog and implementing them. In CI, the application of design thinking tools in the problem space focuses on refinement of features (e.g., designing a user story map), which may motivate more research and the use of solution space tools (such as user feedback on a paper prototype). After specific features are clearly understood, Agile Teams implement them. Completed work is committed to version control, built and integrated into a full system or solution, and tested end-to-end before being validated in a staging environment.

Continuous Deployment (CD) takes the changes from the staging environment and deploys them to production. At that point, they’re verified and monitored to make sure they are working properly. This step makes the features available in production, where the business determines the appropriate time to release them to customers. This aspect also allows the organization to respond, rollback, or fix forward when necessary.

© Scaled Agile, Inc.

dlkibbe commented 4 years ago

Thanks @jamesmrollins and @anthonycpichardo for these explanations. Greatly appreciated.

lijenn commented 4 years ago

w/ @staceypark @lzim @anthonycpichardo @jamesmrollins @dlkibbe @branscombj @dkngenda @anazariz

Hypothesis: Repo designs should mirror and be reinforced by the #1423 teampsd 2.0 organization flow map (organized by function, the reverse of our originally proposed org chart) so that the root structures for each folder become more standardized.

We wanted to check out some assumptions & proposals for the TeamPSD repo:

  1. Not only is GH the "hub", but the TeamPSD repo specifically is the GitHub hub.
  2. All workgroups will be using GH to some extent (See James's proposal for SIM UI Repo).
  3. We will put everything on the master branch (meaning any previous use of branches for organization instead of version control will be fixed and completely phased out).
  4. The main folders will be by workgroup + one resources folder for all_teampsd (similar to our Teams channels organizations)
  5. Each workgroup folder will include:
    i. A Resources folder specific to the workgroup (the current Resources folder contains a mixture of resources needed by the entire team and workgroup-specific resources)
    ii. Any "function" folders as determined by our final teampsd flow map design (see #1423 & iv.)
    iii. A README - explains what's in the folder and links to other repos that the workgroup may use
    iv. For example, if we decide on "Compliance", "Funding", "MTL Operations", "Research data", and "Dissemination" as our main "functions", then the facilitate_workgroup folder would have the sub-folders "Resources" & "MTL Operations", and the README would explain what's in these folders and link to mtl.how
    image
  6. Anthony's Refactoring Files using Atom and Github Desktop Prototype should make all the structural change dependencies introduced by re-pathing quicker to fix than previous methods.
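
To make the proposed layout concrete, here is a minimal sketch that lists the paths the workgroup-based structure would produce. The function name `workgroup_layout`, the top-level folder name `all_teampsd_resources`, and the `README.md` file name are assumptions for illustration; the `facilitate_workgroup` example comes from the proposal above.

```python
from pathlib import PurePosixPath


def workgroup_layout(workgroups: dict) -> list:
    """List relative paths: one shared resources folder for the whole team,
    plus a README, a Resources folder, and function folders per workgroup."""
    # Assumption: the shared folder is named all_teampsd_resources.
    paths = [PurePosixPath("all_teampsd_resources")]
    for workgroup, functions in workgroups.items():
        base = PurePosixPath(workgroup)
        paths.append(base / "README.md")
        paths.append(base / "Resources")
        for function in functions:
            paths.append(base / function)
    return [str(path) for path in paths]


# The facilitate_workgroup example with a single "MTL Operations" function folder.
for path in workgroup_layout({"facilitate_workgroup": ["MTL Operations"]}):
    print(path)
```

Listing the paths before moving anything gives each workgroup a chance to check whether its "function" folders fit before files are re-pathed.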
jamesmrollins commented 4 years ago

@lijenn I am a little concerned about the ability to standardize non-standard processes. In other words, if we organize functionally, how will the flows work when we engage in functions with perhaps different information needs and process steps? In my proposal for a separate Sim UI repo above, its purpose would be to update and maintain the Sim UI as it is expressed in Epicenter through a CI/CD process. How do you envision us working asynchronously with everyone in a Master Branch?

lijenn commented 4 years ago

@jamesmrollins Functions & Standardization: Hmm, I think the concern about functions should first be thought through in #1423 (TeamPSD 2.0 Organizational Flow map; I'm also setting #1423 as a potential "blocked by" for this issue), as it should represent the structural flow of our team. Regarding functions with a non-standard process or requiring different needs and steps, I believe that would still be alright, as it could be classified as a specific niche within the workgroup(s) it's in?

This is a really good point to keep in mind for the Organizational Flow map...do you have a specific example in mind for a function with non-standard processes? I feel that as long as there is some sort of approach, it is already standardized in a sort of way.

Master branch: This was mainly addressing most of the current existing branches in the teampsd repo right now. Specifically, branches that were used for organization, like the videos, hexagon_icons branch, etc. contain both some sort of work & some form of organization.

In most cases, branches should only contain work that can eventually be reviewed, integrated, and pushed to the Master Branch. Similar to your branch proposal for the SIM UI repo, except the teampsd repo would have non-protected branches and/or contain branches required in the automation process (e.g. GH Actions Bookdown stores the documents in the gh-pages branch).

The recommendation for our current branches would be to:

jamesmrollins commented 4 years ago

Hi @lijenn, Thanks for responding to my feedback.

Regarding Functions and Standardization:

I agree that we should understand what our functions are, how they are dependent and what standard processes are used to update them. I also hear (read) you saying that your scope is limited to the Team PSD repository. I am perhaps overly sensitive to my own needs and am concerned with being constrained underneath the Team PSD repository for the Sim UI repository needs. Also, I don't know if you have seen the image below. I thought you might be interested in how different repositories can be pinned to create a directory.

image

> Master branch: This was mainly addressing most of the current existing branches in the teampsd repo right now. Specifically, branches that were used for organization, like the videos, hexagon_icons branch, etc. contain both some sort of work & some form of organization.

Got it. Agree.

> In most cases, branches should only contain work that can eventually be reviewed, integrated, and pushed to the Master Branch. Similar to your branch proposal for the SIM UI repo, except the teampsd repo would have non-protected branches and/or contain branches required in the automation process (e.g. GH Actions Bookdown stores the documents in the gh-pages branch).

Got it. Agree.

lijenn commented 4 years ago

@jamesmrollins You got it! I'm a bit limited in seeing what other repo structures could look like. Thanks for the reminder of the Dept. of Veterans Affairs GitHub Organization! This definitely helped, as I was playing around with making a test organization but had some trouble seeing how it would look with more members and repos.

@lzim @staceypark @anthonycpichardo @jamesmrollins @dlkibbe @branscombj @dkngenda @anazariz Also brings back the question of how people would feel if we had a TeamPSD GitHub Organization. From what Stacey and I gathered, pros would include:

However, we can technically already do some of these without the need of GH Organization. How do other folks feel about us shifting to learn this feature of GitHub? If we're trying to shift to GitHub becoming the Hub Platform, it probably makes sense for us to at least try it and see how we feel about it?

I think the Functions and Standardization concern with repos is that it could be confusing if every repo's structure were different, but I hear that being constrained by the TeamPSD repo could keep the Sim UI repo from meeting its needs. We should continue to check things out as we're thinking about the repo design. It would probably also be a good idea to shift back to Anthony's proposal of first deciding what the needs of each repository are before moving too fast with the structure.

anthonycpichardo commented 4 years ago

> We will put everything on the master branch (meaning any previous use of branches for organization instead of version control will be fixed and completely phased out).

@lijenn I think you're aware, but to make it clear: the exception to this rule would be the gh-pages branch.

anthonycpichardo commented 4 years ago

I share a similar concern that if we force a standard folder structure for the repos, it could cause limitations depending on how one-size-fits-all we make it.

As you mentioned I think if we can decide on what repos we need and how they're broken up then we can determine a structure that makes sense and is consistent across repos.

jamesmrollins commented 4 years ago

@lijenn @anthonycpichardo
I will have to dig into "GitHub Organization." I'm not familiar with that yet. Do you have an info link?

Pursuant to Anthony's comment, the Sim UI WG folder has nowhere near the information it would have if used for actual development activities. Take a look at https://github.com/jamesmrollins/psychtools for an example of how we have organized a similar web project.

anthonycpichardo commented 4 years ago

Here's a link, @jamesmrollins: GitHub Organizations

jamesmrollins commented 4 years ago

Oh! 🙄 I can be such a silly boy! That's what I was sharing with @lijenn above with the VA Organization repositories.

lijenn commented 4 years ago

Haha! @jamesmrollins I also found these videos to be helpful in setting up an organization or just understanding GH Organizations a bit better: 1) Setting up an Organization (slightly outdated, but still carries some of the steps needed) 2) Quick GH Overview on Organizations & Teams (Not MS Teams)

dlkibbe commented 4 years ago

@jamesmrollins just curious as I read through all this exchange, @lijenn had asked above, "do you have a specific example in mind for a function with non-standard processes?" For my education do you have an example from past experience with MTL work?

jamesmrollins commented 4 years ago

Hi @dlkibbe - I mistakenly got the impression that we were going to standardize the processes that support functions so that they all operated the same way across all of the functions. That would be difficult, if not impossible, given the differences between the functions and the task steps that their processes support.

anthonycpichardo commented 4 years ago

Branching design

Linking this proposal to this issue card.

@lzim @jamesmrollins @staceypark @lijenn @dlkibbe

User-centered MVP Hypothesis: Does this branching structure support the workflows that exist for our various workgroups?

Across our different repositories does this methodology hold? If not, what is the breaking point?

What learning needs are there for people to understand what this process would look like?

Appreciate any feedback you all can give for your respective workgroups.

anthonycpichardo commented 4 years ago

From Lindsey: Sharing a link is a great way to "fail fast and cheap" with your User-Centered Hypothesis. I'm guessing that this is yours... User-centered MVP Hypothesis: This existing online training will be effective for users to understand and set up branching (i.e., from the existing online community of trainings available for free for many open standards, in other words trainings Team PSD doesn't have to create or keep up with)?

All: Please keep in mind for Monday & week3 MVPs: Make it an orienting MVP across the users whose pain it solves. Keep users' learning needs in mind.

For this, please consider the "How Might We" (HMW) Exercises from our Design Thinking Workshop manual for our monthly sprints:

This exercise produces a prompt like this: "How might we make sure that all Team PSD users who will need to use the branching strategy will learn it as it is defined during july_epic?"

We all have learning needs. Keeping an empathic synthesized understanding of all your users' needs clarifies, not just a) how this should be set up, but even more importantly, b) how we should go about setting this up to support users' learning needs?

This will be VERY cross-cutting for the second half of july_epic

a) your own workgroups' needs, and

b) your understanding of how those needs synthesize with other workgroups' needs you've heard during this sprint.

jamesmrollins commented 4 years ago

@anthonycpichardo Thanks for sharing this with me.

This article was a good find Anthony - thanks for sharing with me.

anazariz commented 4 years ago

@anthonycpichardo I'm aligned with the "feature branch" model (including the one with the dev lane) proposed in the link you cited. In fact, at Blue Shield we used a similar model using SVN (Apache Subversion), with a few more "Feature" lanes. A couple of items that could also be included in these models are the database rollback (which Sairam Krish has touched on) and the hotfix lane (which Vincent Driessen talks about). However, currently, I'm not sure if a hotfix lane would be useful to us, since we don't have multiple production incidents competing with development items for a lane. In general, I think that the overall architecture precedes the number of lanes, which can be scaled up as needed.

dkngenda commented 4 years ago

@anthonycpichardo this was a great read, thanks for sharing!

anthonycpichardo commented 4 years ago

@dkngenda You can definitely do all of this through R Studio in terms of checking in code if you'd like to. There are also tools like Git (using the command-line) and Github Desktop that will also facilitate the workflow.

R Studio Github Processes

dkngenda commented 4 years ago

@anthonycpichardo Thanks for sharing! I have been using this workflow to implement branching. I think the workflow proposed in what you shared is simpler. The main difference I think is that I usually create branches from within R studio instead of GitHub and set what my upstream git repository is through command-line.
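
For anyone following along, the command-line steps mentioned above (create a branch, then set its upstream) can be sketched as below. This only builds and prints the Git commands rather than running them; the helper name `feature_branch_commands`, the branch name, and the `origin` remote are illustrative assumptions.

```python
import shlex


def feature_branch_commands(branch: str, remote: str = "origin") -> list:
    """Return the Git commands that create a local feature branch and
    publish it with an upstream (tracking) branch on the remote."""
    return [
        ["git", "checkout", "-b", branch],                  # create and switch to the branch
        ["git", "push", "--set-upstream", remote, branch],  # publish and set upstream
    ]


# Hypothetical branch name, printed as copy-pasteable shell commands.
for argv in feature_branch_commands("feature/update-vignette"):
    print(shlex.join(argv))
```

Keeping the commands as argument lists means they could later be passed to `subprocess.run` unchanged if the team wanted to script the workflow.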

anthonycpichardo commented 4 years ago

Building on this discussion @lzim @jamesmrollins @staceypark @lijenn @anazariz @dkngenda.

SAFe (the Agile framework used at the VA) CI/CD Quality describes how we can flow from environment to environment (branch to branch) as we're developing and the kinds of work and tests that would take place in each stage.

Take a look, let me know what you think, and bring up any questions you may have.

image

Acronyms:
TDD - Test Driven Development (Functionality testing)
BDD - Behavior Driven Development (Usability testing)

image

anthonycpichardo commented 4 years ago

@jamesmrollins created these maps which describe the workflow:
https://app.lucidchart.com/documents/edit/17ed93aa-32e9-4a80-a0b6-de154e9978d4/rnyBtdhHAlzYp
https://app.lucidchart.com/documents/edit/47d3b943-e731-4520-8715-a115a3a51d54/FyfyXcmCcLBTD

Closing