pha4ge / pipeline-resources

Bioinformatics Pipeline and Visualization resources
https://pha4ge.org/bioinformatics-pipelines-and-visualization/
Apache License 2.0

Observations about Proposed Standards for Public Health Bioinformatics Software #33

Open svarona opened 6 months ago

svarona commented 6 months ago

My group and I have been reviewing the Proposed Standards for Public Health Bioinformatics Software document from the perspective of a team dedicated to developing analysis pipelines. Based on our experience, we have some observations about the document that we hope will help in its development.

We believe it needs to be clearly defined whether these are minimum requirements, best practices, or guidelines, which we think is already under discussion in the meetings. We also think it should be clarified whether these are standards for pipelines or for software, as some points may apply to software but not to pipelines, and vice versa.

We also believe that, in addition to indicating how the standards will be evaluated, as Frank is doing in his PR (https://github.com/pha4ge/pipeline-resources/pull/32), it could be useful to add a section to each point with resources, such as links or documents, that can help developers meet that standard.

Next, I will describe our observations on some of the points:

Here is just a proposal for a reorganization that reduces the list to 10, which I believe was one of the next objectives:

  1. Publicly-Accessible Repository
  2. Version Control
  3. Pipeline Documentation
    • Open-Source License
    • Contribution, Authorship, and Verified Point of Contact
    • Maintenance Capability
    • Conflict of Interest Statement
  4. Pipeline Guidelines
    • Documentation for Local Installation and/or Remote Access
    • Software Functionality
    • Statement of Need with Respect to Public Health Pathogen Genomics
    • Example Usage
    • Container/Packaged Software
  5. Software Testing
  6. Community Guidelines for Contribution and Support
  7. Benchmark/Validation Datasets
  8. Common File Formats
  9. Reference Data Requirements