lzim / teampsd

Team PSD is using GitHub, R and RMarkdown as part of our free and open science workflow.
GNU General Public License v3.0

Task: Automatic Model Update Process #1220

Closed jamesmrollins closed 3 years ago

jamesmrollins commented 4 years ago

Problem: The current model crosswalk table is a spreadsheet that lists the standard label for each variable and attempts to link that standard name to the actual labels used in other resources (e.g., Team Data Table, Vensim model). Standard labels can be difficult to implement because the Sim UI often needs unique labels for the same function in order to execute code. Furthermore, most of the models were developed before the code was established and are therefore not standardized. Finally, there is no documented procedure for editing, updating, or quality-checking label crosswalks, which creates a single point of failure.

Mission: Identify and document the current standing operating procedure for documenting resource labels. Determine whether any of these functions can be automated in order to reduce errors and ease demand.

lzim commented 4 years ago

@jamesmrollins @anazariz

Note: We say in our grants that we will document our Vensim DSS models using SDM-DOC.

Goals: 1) Error-checking and standardization 2) Model documentation for review by scientists and others

1) https://www.systemdynamics.org/SDM-doc 2) http://wayback.archive-it.org/10432/20181121203235/http://lm.systemdynamics.org/tools/sdm/Handbook%20Model-A.html#a105

Cross-ref: #784

Cross-ref: #886 - How can we get to a more streamlined file upload for Team Data Tables? This came up during our 4/1 Support Workgroup Leads meeting.

BROADER GOAL: Get rid of anything that has to be done manually with the Epicenter platform.

jamesmrollins commented 4 years ago

Requirements Testing and Validation Checklist

Problem Statement: Use of labels between Team Data files, Vensim model files and Sim UI labels is very difficult to standardize and often results in errors. The standardization of labels, wherever possible, aids in mapping variables across the various platforms. In the model-Sim UI validation process, values from the simulation are compared, by hand, against values returned from a client session of the model.
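
The by-hand comparison described above could in principle be scripted. The following is a minimal sketch, not the team's actual validation routine: it assumes both sides can be exported as label-to-value dictionaries, and the labels, values, and tolerance shown are hypothetical.

```python
import math

def compare_runs(model_values, ui_values, rel_tol=1e-6):
    """Return (label, model_value, ui_value) tuples for values that disagree.

    A label missing from either side is reported with None for that side.
    """
    mismatches = []
    for label in sorted(set(model_values) | set(ui_values)):
        m = model_values.get(label)
        u = ui_values.get(label)
        if m is None or u is None or not math.isclose(m, u, rel_tol=rel_tol):
            mismatches.append((label, m, u))
    return mismatches

# Hypothetical values; real ones would come from a model run and a Sim UI session.
model_run = {"patients_waiting": 120.0, "appointment_supply": 45.5}
ui_session = {"patients_waiting": 120.0, "appointment_supply": 45.0}
print(compare_runs(model_run, ui_session))  # [('appointment_supply', 45.5, 45.0)]
```

An empty list would mean every shared label agrees within the tolerance.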

Requirements

1.0 Develop a database routine that will identify exceptions between two model versions and the Sim UI label map.
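
As an illustration of requirement 1.0, here is a minimal sketch that uses Python sets rather than a database routine. It assumes the variable names of each model version are already available as lists and that the Sim UI label map is a dictionary keyed by model variable name; all names are hypothetical.

```python
def find_exceptions(old_vars, new_vars, ui_label_map):
    """Report differences between two model versions and the Sim UI label map."""
    old_vars, new_vars = set(old_vars), set(new_vars)
    mapped = set(ui_label_map)
    return {
        "added": sorted(new_vars - old_vars),            # new in this version
        "removed": sorted(old_vars - new_vars),          # dropped since last version
        "unmapped": sorted(new_vars - mapped),           # no Sim UI label yet
        "stale_map_entries": sorted(mapped - new_vars),  # map points at a missing variable
    }

# Hypothetical variable names for illustration only.
old_version = ["Referral Rate", "Wait Time"]
new_version = ["Referral Rate", "Median Wait Time"]
ui_map = {"Referral Rate": "referral_rate", "Wait Time": "wait_time"}
print(find_exceptions(old_version, new_version, ui_map))
```

The same four set differences could be expressed as SQL anti-joins if the labels live in database tables.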

2.0 Develop an automated disposition process that will record inputs and provide a record of accountability for actions in the record.

3.0 Develop a process to host an SDM-DOC.html file as a webpage, under a "technical specifications" heading on mtl.how/demo (see the edit below from the Support Workgroup Meeting on 4/8/2020).
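
One way to picture the disposition record in requirement 2.0 is an append-only log with one JSON entry per action. This is only a sketch: the field names, the action vocabulary, and the in-memory list standing in for real storage are all assumptions.

```python
import json
from datetime import datetime, timezone

def record_disposition(log, label, action, user, note=""):
    """Append one JSON-encoded disposition entry to `log` and return the entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "label": label,    # the label the action applies to
        "action": action,  # hypothetical vocabulary, e.g. "renamed", "accepted"
        "user": user,      # who took the action
        "note": note,
    }
    log.append(json.dumps(entry))
    return entry

# The list stands in for a file or table; the values are hypothetical.
disposition_log = []
record_disposition(disposition_log, "Wait Time", "renamed", "reviewer1",
                   note="now Median Wait Time")
print(len(disposition_log))  # 1
```

Because entries are only ever appended, the log keeps a record of who did what and when.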

Algorithm

[Image: algorithm diagram]

Wire Frames

  1. MTL/DEMO
  2. html table formatted with VA standards (Myriad Pro, colors)
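
A rough sketch of how such an HTML table might be generated follows. The Myriad Pro font comes from the wireframe note above; the column headings and the header color are placeholders, not actual VA style values.

```python
from html import escape

def crosswalk_table(rows, header_color="#003f72"):
    """Render crosswalk rows as an HTML table with an embedded style block.

    `header_color` is a placeholder, not an official VA color value.
    """
    columns = ("Standard label", "Vensim label", "Sim UI label")  # hypothetical
    head = "".join(f"<th>{escape(h)}</th>" for h in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    style = (
        "table{font-family:'Myriad Pro',sans-serif;border-collapse:collapse}"
        f"th{{background:{header_color};color:#fff}}"
        "th,td{padding:4px 8px;border:1px solid #ccc}"
    )
    return f"<style>{style}</style><table><tr>{head}</tr>{body}</table>"

# Hypothetical row for illustration.
print(crosswalk_table([("Wait Time", "Wait Time", "wait_time")]))
```

Escaping each cell keeps labels that contain characters such as `<` or `&` from breaking the page.
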
lzim commented 4 years ago

@jamesmrollins

Decisions at 4/8/2020 Support Workgroups #1220 - Requirement 3 - Provide Model Documentation for Scientific Audiences

Completed Verification of SDM-DOC: James ran the SDM-DOC Java app and successfully generated an .html file from the MTL 2.0 .mdl Vensim DSS model.

  1. Decision: James will edit this Issue Task Card to add the CSS design file.
  2. We will develop a CSS file that specifies our VHA graphics/style requirements.
  3. Host the .html in our VA style on mtl.how/demo under "Model Documentation" in "Technical Specs."
staceypark commented 4 years ago

@jamesmrollins Thanks for moving this to the upcoming epic following the Monday WG Leads call. Please make sure to also update the milestone and epic on the right-hand side.

lzim commented 4 years ago

Discussed at #hqhuddle on 4/24/2020

  1. We are developing our improved "Master Crosswalk" process (Protocol).
  2. As we do, we need to coordinate with cross-ref #1367 (SQL Style Guide - Ash) and #1366 (R Style Guide - Anthony).

Master Crosswalk Goal: Standards are developed and applied to every instance of a Team PSD or MTL variable, data definition, and code base (e.g., interdependent .Rmd, data and .R; SQL joins, etc.) in a way that is consistent across Team PSD workgroups.
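
To make the "every instance" goal concrete, here is a minimal sketch of a scan that reports where non-standard labels appear across code files. It assumes the crosswalk can be expressed as a dictionary from each standard label to its known aliases; the file names, labels, and aliases are hypothetical.

```python
import re

def find_nonstandard_labels(sources, crosswalk):
    """Return (file, line_number, alias, standard_label) for each alias found.

    sources:   {filename: file text}
    crosswalk: {standard_label: [known non-standard aliases]}
    """
    hits = []
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for standard, aliases in crosswalk.items():
                for alias in aliases:
                    # Word boundaries keep "wait_tm" from matching "wait_tm2".
                    if re.search(rf"\b{re.escape(alias)}\b", line):
                        hits.append((name, lineno, alias, standard))
    return hits

# Hypothetical file contents and aliases.
sources = {
    "clean.R": "df$wait_tm <- 1\nx <- wait_time",
    "query.sql": "SELECT waittime FROM visits",
}
crosswalk = {"wait_time": ["wait_tm", "waittime"]}
for hit in find_nonstandard_labels(sources, crosswalk):
    print(hit)
```

In practice the `sources` dictionary would be filled by reading the .R, .Rmd, and .sql files in a repository.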

Next Steps:

  1. @anazariz and @anthonycpichardo will re-review @jamesmrollins' algo and requirements above, learn a bit more about the old master crosswalk process and the improvements we seek, and work to develop SQL and R standards that are consistent with a comprehensive Team PSD process.
  2. Additional recommended SQL and R style/standards will be presented for review at the Support Workgroups Meeting on Wed 4/29/20.
lzim commented 4 years ago

Team GitHub Repos

**Cross-ref #1192, #1220**

Lindsey will put this together with visual icons for Team PSD Values and Team PSD Principles.

It would be great to get a Visual Map that also shows how these go together.

Team PSD Scientific Values guide additional Participatory and Open Science principles:

Team PSD Project Management Principles

Team PSD integrates Waterfall principles because:

Team PSD integrates Agile principles because:

Team PSD integrates Waterfall and Agile principles because:

Team PSD integrates Lean principles because:

Team PSD integrates Scrum principles because:

Team PSD Pain points we are working to better address:

Continuous Collaborative Iteration Cycles (e.g., “DevOps”)

VA GitHub Repo

Ad Hoc vets.gov - benefits tracking

NICE Examples for #1220

https://github.com/department-of-veterans-affairs/vets-website

https://github.com/department-of-veterans-affairs/caseflow

va.gov https://github.com/department-of-veterans-affairs/va.gov-team

NICE Examples for Team PSD Manual: Plain English for #1192

https://github.com/department-of-veterans-affairs/va.gov-team/blob/master/platform/working-with-vsp/orientation/repo-guidelines.md

With Maps https://github.com/department-of-veterans-affairs/lighthouse-facilities

Git for Version Control

https://towardsdatascience.com/getting-started-with-git-and-github-6fcd0f2d4ac6

https://rogerdudler.github.io/git-guide/

DevOps for Automating Team Dependency Flows and Reducing Errors and Rework

lzim commented 4 years ago

Discussed at Support Workgroup Meetings on 4/29/2020

  1. Talked about the interdependencies of the requirements above and the 4/24/20 #1367 (SQL Style Guide - Ash) and #1366 (R Style Guide - Anthony).
  2. Lindsey added information about integrating with our GitHub Workflow and linked to #1192.
  3. We considered "refactoring" or downstream corrections of all instances, which seemed like a new requirement.
  4. Follow-up w/James next week.

@anazariz @anthonycpichardo @jamesmrollins @staceypark

lzim commented 4 years ago

BTW: Thanks to @anazariz for encouraging me to get the Team PSD Manual "Project Management" (Column A) information out there in terms of how/why we enlist the synthesized project management approaches we do. 🏆

lzim commented 4 years ago

@jamesmrollins Is the algorithm diagram in LucidCharts?

I was thinking we could review it further with our insights from yesterday. I also see that next steps from our 4/24/2020 post said that 1) Anthony & Ash would review the algo diagram, and 2) the current Master Crosswalk.

@anthonycpichardo @anazariz Have you seen the current master crosswalk?

lzim commented 4 years ago

Happy Friday Support Workgroups! It’s May! 🌳 🌻 ⛰️ 🏖️

@staceypark @jamesmrollins @anazariz @anthonycpichardo

Great work to keep this moving forward! 👏

BLUF:

  1. No changes to purpose, problem definition and current requirements (3/19 & 4/7).
  2. But requirements for automated integration and standardization of “... every instance of a Team PSD or MTL variable, data definition, code base...[to stay]...consistent” (4/24) require increased use of code and the GitHub platform for efficient version control.
  3. Setting this up requires training/team learning.

NO CHANGE TO:

ADDITIONAL REQUIREMENTS:

4.0 Comprehensive integration of Team PSD code at mtl_code (e.g., .md, .R, .sql, .html, Vensim) (4/24 Cross-Ref #1366, #1367). The requirements above are very sim focused.

5.0 Integration of “every instance” should align with the Team PSD GitHub Workflow (4/29 Cross-Ref Team PSD Manual #1192).

6.0 Automated testing should include re-factoring for automated Team PSD/MTL-wide version control.
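
The re-factoring step in requirement 6.0 could look something like the following. This is a sketch under stated assumptions: the old-to-new rename mapping is hypothetical, and a plain text substitution like this does not understand R or SQL syntax, so its output would still need review.

```python
import re

def refactor_labels(text, renames):
    """Replace whole-word occurrences of old labels with their standard labels.

    Returns (new_text, number_of_replacements).
    """
    if not renames:
        return text, 0
    # Longest keys first so a longer label wins over a shorter one it contains.
    keys = sorted(renames, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.subn(lambda m: renames[m.group(1)], text)

# Hypothetical rename mapping and source line.
new_text, count = refactor_labels(
    "total <- wait_tm + wait_tm2", {"wait_tm": "wait_time"}
)
print(new_text, count)  # total <- wait_time + wait_tm2 1
```

Returning the replacement count makes it easy for an automated test to assert that a clean code base produces zero changes.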

NEXT STEPS: GitHub TRAINING PLAN (6 mini courses to learn)...

GitHub: YAML Configuration, Continuous Integration, Delivery & Documentation

GitHub Training

TRAIN 1: GitHub Apps - Automate Repetitive Tasks

https://lab.github.com/githubtraining/getting-started-with-github-apps

We'll answer common questions like:

TRAIN 2: GitHub Actions - templated workflow with testing

https://lab.github.com/githubtraining/github-actions:-continuous-integration

We'll answer common questions like:

TRAIN 3: GitHub Packages

https://lab.github.com/githubtraining/github-actions:-publish-to-github-packages

We'll answer common questions like:

TRAIN 4: GitHub Actions - Docker

https://lab.github.com/githubtraining/github-actions:-write-docker-container-actions

After completing this course, you will be able to:

TRAIN 5: Release integration w/our Feature Tracker “SDLC”

https://lab.github.com/githubtraining/create-a-release-based-workflow

TRAIN 6: Alternatives: Continuous Integration with CircleCI - automatically test changes made to your project

https://lab.github.com/githubtraining/continuous-integration-with-circleci

Developers integrate code into a shared repository several times a day. With such frequent code changes, how do you ensure your code is bug free? Continuous Integration (CI) is an approach to software development in which tests run automatically anytime code is changed, saving you time and giving your team improved reliability. Continuous Deployment or Delivery (CD) refers to whatever happens after these tests run. If they pass, your new code can be automatically deployed to production.
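
Applied to this issue, a CI job could run a small check on the label map each time it changes. The sketch below assumes a hypothetical `ui_label_map` dictionary and tests one rule only: that no two variables share the same Sim UI label.

```python
from collections import Counter

def duplicate_ui_labels(ui_label_map):
    """Return Sim UI labels that are assigned to more than one variable."""
    counts = Counter(ui_label_map.values())
    return sorted(label for label, n in counts.items() if n > 1)

def run_check(ui_label_map):
    """Print any problems and return an exit status: 0 for pass, 1 for fail."""
    duplicates = duplicate_ui_labels(ui_label_map)
    for label in duplicates:
        print(f"duplicate Sim UI label: {label}")
    return 1 if duplicates else 0

# Hypothetical label map with one collision.
ui_label_map = {
    "Wait Time": "wait_time",
    "Median Wait Time": "wait_time",
    "Referral Rate": "referral_rate",
}
status = run_check(ui_label_map)
print(status)  # 1
```

A CI runner would pass this status to `sys.exit`, so a non-zero value marks the build as failed.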

Summary and Proposal to use YAML Config as our Process

https://yaml.org/spec/1.2/spec.html

Key Idea 1: Adoption of the configuration file

After developing our standards, when a user creates a new project or is on the settings page, we could suggest a functional configuration file with a minimal setup and make clear where to put global configurations.

Key Idea 2: Configuration file and database

The settings used in the build from the configuration file (and other metadata) are human and machine readable across all Team PSD code.

Key Idea 3: The build process

Benefits of expanding Team PSD use of YAML include consistency with existing Team PSD style, standards, SOP and team Skills:

A. Every JSON file is also a valid YAML file. The design goals for YAML are, in decreasing priority:

  1. YAML is easily readable by humans.
  2. YAML data is portable between programming languages.
  3. YAML matches the native data structures of agile languages.
  4. YAML has a consistent model to support generic tools.
  5. YAML supports one-pass processing.
  6. YAML is expressive and extensible.
  7. YAML is easy to implement and use.

B. YAML information is used in two ways: for machine processing, and for human consumption.

Unlike JSON, YAML supports comments.

C. Other considerations: Ansible Playbooks. Ansible, an open source software provisioning, configuration management, and application deployment tool now maintained by Red Hat, uses YAML for its playbooks.
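
The comment difference can be seen with the Python standard library alone. In this sketch the configuration keys are hypothetical. The first string is plain JSON, and so also YAML per point A. The second adds a `#` comment line, which YAML allows but a JSON parser rejects.

```python
import json

# Hypothetical minimal configuration, written as JSON.
config_text = (
    '{"model": "mtl", "version": "2.0", '
    '"crosswalk": {"Wait Time": "wait_time"}}'
)

def parses_as_json(text):
    """Return True if `text` is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# The same configuration with a leading comment line.
commented_text = "# label map for the Sim UI\n" + config_text

print(parses_as_json(config_text))     # True
print(parses_as_json(commented_text))  # False
```

Comments matter here because a label crosswalk usually needs short notes on why a given label deviates from the standard.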

We should work through these trainings on our own and update each other with recommendations in this thread.

NICE Examples for #1220

https://github.com/department-of-veterans-affairs/vets-website

https://github.com/department-of-veterans-affairs/caseflow

https://github.com/department-of-veterans-affairs/va.gov-team

NICE Examples for Team PSD Manual:

Plain English for #1192

https://github.com/department-of-veterans-affairs/va.gov-team/blob/master/platform/working-with-vsp/orientation/repo-guidelines.md

https://github.com/department-of-veterans-affairs/lighthouse-facilities

anthonycpichardo commented 4 years ago

> @jamesmrollins Is the algorithm diagram in LucidCharts?
>
> I was thinking we could review it further with our insights from yesterday. I also see that next steps from our 4/24/2020 post said that 1) Anthony & Ash would review the algo diagram, and 2) the current Master Crosswalk.
>
> @anthonycpichardo @anazariz Have you seen the current master crosswalk?

@lzim Ash and I reviewed the master crosswalk that is hosted on OSF when thinking through the parser. Side note: we currently aren't part of the OSF project that hosts the file, so we needed @jamesmrollins to send it to us. Can we get added to the project?

jamesmrollins commented 4 years ago

> @jamesmrollins Is the algorithm diagram in LucidCharts?
>
> I was thinking we could review it further with our insights from yesterday. I also see that next steps from our 4/24/2020 post said that 1) Anthony & Ash would review the algo diagram, and 2) the current Master Crosswalk.
>
> @anthonycpichardo @anazariz Have you seen the current master crosswalk?

@lzim @anazariz @anthonycpichardo It's in there now...