mozilla-services / dependency-observatory

Mozilla's Dependency Observatory
Mozilla Public License 2.0
2 stars 6 forks source link

Dependency Observatory

CircleCI Coverage Status

Dependency Observatory is a web service for collecting data about open source software dependencies (or package) and assessing how risky they are to use.

It is alpha software and subject to breaking changes.

We're still working on package scores, but aim to capture the direct and transitive impact of using a package on:

  1. amount of code
  2. number of maintainers and publishers trusted
  3. code quality and chance of using vulnerabile code

Architecture

Dependency Observatory consists of:

API and Routes

Endpoint                                Methods             Rule
--------------------------------------  ------------------  ---------------------------------------------------------------------------------
views_blueprint.index_page              GET, HEAD, OPTIONS  /
dockerflow.heartbeat                    GET, HEAD, OPTIONS  /__heartbeat__
dockerflow.lbheartbeat                  GET, HEAD, OPTIONS  /__lbheartbeat__
dockerflow.version                      GET, HEAD, OPTIONS  /__version__
views_blueprint.queue_scan              OPTIONS, POST       /api/v1/scans
views_blueprint.get_scan                GET, HEAD, OPTIONS  /api/v1/scans/<int:scan_id>
views_blueprint.read_scan_logs          GET, HEAD, OPTIONS  /api/v1/scans/<int:scan_id>/logs
views_blueprint.faq_page                GET, HEAD, OPTIONS  /faq
views_blueprint.show_package_changelog  GET, HEAD, OPTIONS  /package_changelog
views_blueprint.show_package_report     GET, HEAD, OPTIONS  /package_report
views_blueprint.render_scan_logs        GET, HEAD, OPTIONS  /scans/<int:scan_id>/logs
views_blueprint.get_condensate_graph    GET, HEAD, OPTIONS  /score_details/condensate_graphs/<int:graph_id>
views_blueprint.get_graph               GET, HEAD, OPTIONS  /score_details/graphs/<int:graph_id>
views_blueprint.get_scoring_graph       GET, HEAD, OPTIONS  /score_details/score_component_graphs/<int:graph_id>/<string:package_report_field>
static                                  GET, HEAD, OPTIONS  /static/<path:filename>
views_blueprint.get_statistics          GET, HEAD, OPTIONS  /statistics

Analysis UI

  1. user visits / and enters a package or dependency name to analysis
  2. the UI makes a HEAD request with query params to see if a report exists e.g. /package_report?package_manager=npm&package_name=%40hapi%2Fbounce&package_version=2.0.0
  3. if /package_report returns 200 UI redirects to the report page otherwise it makes a POST /api/v1/jobs to start an analysis job
  4. shows logs for the analysis e.g. /jobs/scan-score-npm-package-dffc9843/logs
  5. which redirects to /package_report when the analysis is successful

Deploying

  1. fetch the service container image:
docker pull mozilla/dependency-observatory:latest
  1. optionally, fetch images scan or analysis environment for the languages you're interseted in:
mozilla/dependency-observatory:rust-1
mozilla/dependency-observatory:node-10
mozilla/dependency-observatory:node-12
  1. create a postgres database

  2. modify kubernetes/deployment.yaml (e.g. remove DB point to your DB; update creds in DSN) then kubectl create -f kubernetes/

Developing

Running the service locally

See README.md in kubernetes/ to run the service on a local minikube cluster.

We also use docker-compose to format web resources and create database migrations.

Running tests

Developing worker jobs

Finding worker code

Adding an analysis job

Analyzing untrusted code

NB: try util/run_cwd_in_image.sh

  1. add a scanning / analysis environment in <langname>-<lang major version> format to docker-compose.yml or pick and existing one

  2. add analysis steps (e.g. static (SAST) or dynamic analysis (DAST)) to run in a scan_env to scan_envs/docker-entrypoint.sh. They should an individual command as a CLI arg with env vars as keywords and output jsonlines http://jsonlines.org/

  3. rebuild the image e.g. docker-compose build node-12

Saving analysis results
  1. update depobs/worker/tasks.py to start jobs to analyze untrusted code or fetch additional data and save the serialized results to the database (using depobs/worker/serializers.py or depobs/models/)

  2. register the function as CLI command in depobs/worker/main.py

  3. start a shell on the API worker kubectl exec -it api-<tab> -- /bin/bash (NB: if you don't need to run kubernetes jobs, docker-compose exec api will mount the CWD for faster development)

  4. test the command e.g. with python depobs/worker/main.py npm scan

  5. run to kubectl exec -it svc/db -- /bin/bash -c 'su postgres -c "psql dependency_observatory"' and inspect the database

Saving analysis results to the package report
  1. In depobs/database/models.py, add a method to PackageGraph to fetch data and return a dict indexed by package id

  2. In depobs/worker/scoring.py, add a class inheriting from ScoreComponent (see commends in ScoreComponent for details) to populate PackageReport fields froms the fetched data

  3. Preview graphs with the PackageReport fields at /score_details/score_component_graph/<int:graph_id>/<string:package_report_field>

  4. Add migrations to update the score_view, and report_score_view database views to add the data as a factor in scoring

  5. If necessary, update PackageScoreReport.report_json and the UI to display and explain the new fields

Developing the web server and API

Finding web code

Web server code can be found in depobs/website/. Views currently live in depobs/website/views.py and jinja2 templates in depobs/website/templates/. It uses models from depobs/database/models.py and calls out to code in other directories.

Scripts

Formatting web resources

  1. Run docker-compose build format-web then docker-compose run format-web to format. Edit the command docker-compose.yml to format resources other than static/do.js and static/do.css

Adding a database migration

Autogenerating from changes to SQLAlchemy declarative metadata

  1. run docker-compose up -d api if the api isn't already running with docker-compose

  2. run ./util/flask_db.sh upgrade to ensure your local database is up to date

  3. autogenerate a migration script e.g. ./util/flask_db.sh migrate -m "add foo column to bar table" to create a file with that message in /migrations/versions

  4. run sudo chown -vR $(whoami):$(whoami) migrations/versions so root doesn't own the script

  5. review and edit the script keeping in mind the limitations of alembic detecting changes https://alembic.sqlalchemy.org/en/latest/autogenerate.html#what-does-autogenerate-detect-and-what-does-it-not-detect

Adding a database view

  1. run docker-compose up -d api if the api isn't already running with docker-compose

  2. create an empty migration file e.g. ./util/flask_db.sh revision -m "add spam_view"

  3. run sudo chown -vR $(whoami):$(whoami) migrations/versions so root doesn't own the script

  4. add an execute operation with the raw SQL (refer to existing view migration scripts) https://alembic.sqlalchemy.org/en/latest/ops.html#alembic.operations.Operations.execute

  5. For programmatic access to the view, add a table with __table_args__ = {"info": {"is_view": True}} to the declarative table definition in database/models.py (to skip it in autogenerated migrations)

Review and test the migration

  1. run docker-compose up -d api if the api isn't already running with docker-compose

  2. run ./util/flask_db.sh upgrade --sql and ./util/flask_db.sh downgrade --sql and review the generated SQL

  3. run upgrade then downgrade without --sql to make sure the migration applies without errors

  4. commit the migration script to version control