Integrations CLI Design
Purpose
Provide the user a simple UI to work with integrations, including creation, packaging, and publishing.
Background
Users don’t have access to a simple workflow to initialize, deploy, and publish Integrations. Maintaining integrations is hard, as it requires a configuration file and multiple directories with correctly set-up resources. To make the process easier for users, we will develop a CLI tool that enables:
Creating a template project for a new integration from scratch.
Deploying a local integration template to an OpenSearch cluster.
The CLI needs to check that the cluster is healthy and has the integrations plugin installed.
Since we only need to push to the repository index, a cluster health status of YELLOW should suffice.
The integrations plugin needs its own healthcheck endpoint, unless there’s already a request that shows the installed plugins.
Packaging an integration to upload to their local cluster.
Packaged as a zip file, uploaded via a mechanism similar to plugins.
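The health check described above could be sketched as follows. This is a sketch, not the final implementation: the `_cluster/health` and `_cat/plugins` endpoints exist in OpenSearch today, but the doc leaves open whether a dedicated integrations healthcheck endpoint will be added, and the plugin name matched here is an assumption.

```python
import json
import urllib.request

def is_cluster_ready(health: dict, plugins: str, plugin_name: str = "integrations") -> bool:
    """Return True if the cluster can receive an integration.

    Since we only push to the repository index, a YELLOW (or GREEN)
    status suffices, but the integrations plugin must be installed.
    """
    status_ok = health.get("status") in ("green", "yellow")
    plugin_ok = any(plugin_name in line for line in plugins.splitlines())
    return status_ok and plugin_ok

def check_cluster(base_url: str) -> bool:
    """Fetch health and plugin info from a running cluster (hypothetical wiring)."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health") as resp:
        health = json.load(resp)
    with urllib.request.urlopen(f"{base_url}/_cat/plugins") as resp:
        plugins = resp.read().decode()
    return is_cluster_ready(health, plugins)
```

Keeping the decision logic in a pure function (`is_cluster_ready`) separate from the network calls makes it easy to unit-test without a running cluster.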
The CLI is primarily meant for ease-of-use with regard to the integrations ecosystem. There are 3 use cases being targeted.
A new Integrations user who wants to make an integration for their toolkit. They know roughly what they want, but not an exact dependency list or the components they need to supply.
An experienced user who knows precisely what they want, and they want to be able to make it happen quickly and accurately.
An integration developer, who will have an integration set up, but they want to quickly iterate on it and share it with confidence that it won't cause breakage for them or their users.
For now, the integrations CLI is a part of the Observability repository. When integrations are moved to their own repo, the CLI will move with it.
Requirements
As a user, I can create a new integration with minimal difficulty, in roughly 15 minutes or less.
integrations-cli create
The user is presented with an interactive configuration that lets them select: integration name, license (SPDX-compatible), data source, schema, catalog, categories, repository.
The schema will be selected from a closed list; the list of valid schemas must be maintained somewhere.
Once the basic options are selected, the user is presented a list of collections. They can select multiple collections to add to their integration.
A collection contains info, dataset, labels, schema.
The collection accepts an input_type that the user further selects. The input type defines how the data should be interpreted, for example, a logfile.
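As an illustration, a collection entry might look like the following sketch. Only the field names (info, dataset, labels, schema, input_type) come from this document; the values and the exact on-disk shape are hypothetical.

```json
{
  "info": "Access logs collected from an NGINX server",
  "dataset": "nginx.access",
  "labels": ["web", "logs"],
  "schema": "1.0",
  "input_type": "logfile"
}
```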
As a user, I can check if the integration I’ve created is correct, such that it can be readily uploaded to my cluster without issue.
integrations-cli check
As a user, I can upload a local integration template to a remote OpenSearch instance and see the integration template in the repository.
integrations-cli package
Will we handle pushing it to their cluster or is that their responsibility?
Yes, we will POST to the _integration/repository endpoint.
If something goes wrong, how do we rollback?
Server’s responsibility, we should just listen for an error response.
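The deploy step could then be sketched as below. The `_integration/repository` endpoint comes from this document; everything else (content type, error body shape) is an assumption. Since rollback is the server's responsibility, the client only needs to interpret the response and surface errors.

```python
import json
import urllib.error
import urllib.request

class DeployError(Exception):
    """Raised when the server rejects an uploaded integration."""

def interpret_response(status: int, body: str) -> str:
    """Map the server's response to a result; rollback is the server's
    job, so the client only listens for and reports errors."""
    if 200 <= status < 300:
        return "deployed"
    try:
        reason = json.loads(body).get("error", body)
    except json.JSONDecodeError:
        reason = body
    raise DeployError(f"server rejected integration ({status}): {reason}")

def deploy(base_url: str, zip_path: str) -> str:
    """POST a packaged integration to the repository endpoint (hypothetical wiring)."""
    with open(zip_path, "rb") as f:
        req = urllib.request.Request(
            f"{base_url}/_integration/repository",
            data=f.read(),
            headers={"Content-Type": "application/zip"},
            method="POST",
        )
    try:
        with urllib.request.urlopen(req) as resp:
            return interpret_response(resp.status, resp.read().decode())
    except urllib.error.HTTPError as e:
        return interpret_response(e.code, e.read().decode())
```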
As an integration developer, I will be informed if my integration is invalid before pushing to a remote repository.
Done via a commit hook. This should be added to a git repository as part of integration generation.
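Installing that hook during generation could look like the following sketch. The `integrations-cli check` command is from this document; the hook file layout and installer function are assumptions.

```python
import stat
from pathlib import Path

# Hypothetical pre-commit hook: refuse the commit if `check` fails.
HOOK_BODY = """#!/bin/sh
# Installed by integrations-cli: validate the integration before committing.
integrations-cli check || exit 1
"""

def install_check_hook(repo_root: str) -> Path:
    """Write a pre-commit hook into the git repository created during
    integration generation."""
    hook = Path(repo_root) / ".git" / "hooks" / "pre-commit"
    hook.parent.mkdir(parents=True, exist_ok=True)
    hook.write_text(HOOK_BODY)
    # Git only runs hooks that are marked executable.
    hook.chmod(hook.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook
```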
As a user, I should not need to have any language-specific tooling installed to run the CLI.
Decision still pending on how to distribute it.
One option is converting it to an executable with PyInstaller.
For now, we defer this requirement and require that the user have Python installed.
As a user, I can use the tool even if the OpenSearch cluster is running serverless.
Design Considerations
Consolidated Validation Logic
The project will have two components that need to work together. Consolidating the logic for the two components is important to avoid inconsistent results.
The integration consuming API
The CLI integration validator/template generator
To facilitate consistent logic for each component, we need to settle on a standard system for organizing that logic. After consideration, we've settled on JSON Schema, a mature declarative language for annotating and validating JSON documents.
It was selected for its wide cross-language support: JSON Schema is purpose-built for this task and requires little work to port. Swagger will still be used for API specification.
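To make the shared-validation idea concrete, here is a toy sketch. A real implementation would load the shared schema file and use a full JSON Schema library; the abridged schema below and the subset validator (handling only `type`, `required`, and `properties`) are illustrative assumptions.

```python
# Hypothetical (abridged) JSON Schema for an integration's config;
# in practice this would live in a shared file used by both the CLI
# and the server-side API.
INTEGRATION_SCHEMA = {
    "type": "object",
    "required": ["name", "license", "source", "schema"],
    "properties": {
        "name": {"type": "string"},
        "license": {"type": "string"},
        "source": {"type": "string"},
        "schema": {"type": "string"},
    },
}

_TYPES = {"object": dict, "string": str, "array": list}

def validate(instance, schema) -> list:
    """Check an instance against the 'type'/'required'/'properties'
    subset of JSON Schema; returns a list of errors (empty = valid)."""
    errors = []
    expected = _TYPES.get(schema.get("type"))
    if expected and not isinstance(instance, expected):
        return [f"expected {schema['type']}, got {type(instance).__name__}"]
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required field: {key}")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(f"{key}: {e}" for e in validate(instance[key], subschema))
    return errors
```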
In addition to JSON validation, there is more complicated version checking logic that we’ll have to write. How do we maintain consistency with this logic? Some options:
Interop between the two tools.
Shared library that both tools depend on.
Duplicate logic.
Internal CLI that can be piped to.
Commands
There has historically been a lot of disagreement on what the different verbs regarding this process mean. Please see the glossary at the end of this document. The exact usage of these verbs must be communicated to the users.
For processing CLI arguments, we will be using the click library, for consistency with existing OS CLI tools.
CLI Precise Description
integrations-cli --help
Usage: integrations-cli [--version] [--help] <command> [<args>]
integrations-cli create Create a new Integration from a specified template
integrations-cli check Analyze the current Integration and report errors
integrations-cli package Zip the current Integration so it can be uploaded to the Integration Plugin
integrations-cli create --help
Usage: integrations-cli create [--help] [--presets] [--preset <preset>] [--directory <dir>] <name>
Create a new Integration from a specified template
Arguments:
name The name of the integration
Options:
--help Show this help page
--presets List available presets
--preset <preset> Generate using the provided preset
--directory <dir> Specify the directory in which to create the integration (default: './<name>')
When create is run without a preset, the user is interactively shown the following prompts:
Creating new integration '<name>'
Integration description (default: ''):
License (default: 'Apache-2.0'):
Data Source Examples:
- kubernetes
- nginx
- otel-collector
Select a Data Source:
Data Source Version (default: 'latest'):
Schema Version Options:
- 0.1
- 0.2
- 1.0
- latest
Schema Version (default: 'latest'):
Integration labels (comma-separated list, default: none):
Catalog for the Integration (default: 'observability'):
Would you like to add collections interactively? (y/n):
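The interactive flow above could be sketched as follows. The prompts and defaults are taken from this document; injecting the `ask` callable (rather than calling `input()` directly, or `click.prompt` in the real tool) is an assumption made so the flow can also be driven non-interactively and unit-tested.

```python
def collect_config(name: str, ask=input) -> dict:
    """Walk the user through the create prompts; `ask` defaults to the
    built-in input() but can be any callable, keeping the flow testable."""
    def prompt(text: str, default: str) -> str:
        answer = ask(f"{text} (default: '{default}'): ").strip()
        return answer or default

    print(f"Creating new integration '{name}'")
    return {
        "name": name,
        "description": prompt("Integration description", ""),
        "license": prompt("License", "Apache-2.0"),
        "source": ask("Select a Data Source: ").strip(),
        "source_version": prompt("Data Source Version", "latest"),
        "schema_version": prompt("Schema Version", "latest"),
        "labels": [l.strip() for l in prompt("Integration labels", "").split(",") if l.strip()],
        "catalog": prompt("Catalog for the Integration", "observability"),
    }
```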
integrations-cli check --help
Usage: integrations-cli check [--help] [<dir>] [<args>]
Analyze the current Integration and report errors
Arguments:
dir The directory of the integration to check (default: .)
Options:
--help Show this help page
integrations-cli package --help
Usage: integrations-cli package [--help] [<dir>] [<args>]
Zip the current Integration so it can be uploaded to the Integration Plugin
Arguments:
dir The directory of the integration to package (default: .)
Options:
--help Show this help page
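The package command could be sketched with the standard-library zipfile module. The archive name and layout (a sibling `<name>.zip` with paths stored relative to the integration root) are assumptions, not settled design.

```python
import zipfile
from pathlib import Path

def package(integration_dir: str = ".") -> Path:
    """Zip the integration directory so it can be uploaded to the
    Integration Plugin; returns the path of the created archive."""
    root = Path(integration_dir).resolve()
    archive = root.with_suffix(".zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            # Store files relative to the integration root; skip the
            # archive itself in case it lands inside the tree.
            if path.is_file() and path != archive:
                zf.write(path, path.relative_to(root))
    return archive
```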
Tests
A difficult part of testing will be ensuring that the front-end validator does not certify an integration that the API will reject. Using a stable JSON Schema library for the task will be critical, but there should also be integration tests that check the CLI upload against a running cluster with Observability.
To ensure validation is functioning as intended, we should find and include a fuzzing framework, such as JSON Schema to Elm.
New endpoints will need Pen Testing.
Implementation Plan
Short-Term Deliverable
What is our first deliverable?
CLI Library that can generate/zip a preset integration.
We may want to consider using a templating engine instead of the standard json module.
To research: which templating engine?
Some options: Jade, Pug, Mustache, HandlebarsJS, Jinja2
For now, use the standard module, no need to over-complicate this.
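With flat config files, the standard json module is indeed enough. A minimal sketch of generation, assuming a hypothetical layout (a top-level config.json plus empty resource directories; neither name is settled in this document):

```python
import json
from pathlib import Path

def write_template(directory: str, config: dict) -> Path:
    """Materialize a new integration skeleton using only the standard
    json module; no templating engine needed for flat config files."""
    root = Path(directory)
    root.mkdir(parents=True, exist_ok=True)
    # Hypothetical layout: a top-level config plus empty resource dirs.
    for sub in ("schemas", "samples", "assets"):
        (root / sub).mkdir(exist_ok=True)
    config_path = root / "config.json"
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config_path
```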
Goals for 2.7
Initializing Integrations
Deploying Integrations
Goals for 2.8+
Publishing Integrations
Glossary
Build (= Package): Packaging a folder containing an integration into a zip file. The zip should be able to be deployed to the cluster as-is.
Check (= Validate): Ensuring that the integration is correct and will be accepted by the server.
Deploy (= Import): Moving a built integration from a local filesystem to a remote cluster. After this step, all references to the integration should be via a cluster index.
Install: Placing an integration on a local filesystem from which it will be loaded.
Install differs from Deploy by process: Deploy is through an API while Install is on the FS.
Install should not be used for adding new integrations at runtime; it primarily refers to pre-installed integrations.
Integration: a folder containing resources that define how to process and display information generated by a data source.
Creation: Creating a new integration in an empty directory, using a template and user input.
Publish: Committing and pushing a local integration as a PR.