Open YANG-DB opened 1 year ago
Some initial questions:

- `generate` — My understanding is that we want to retrieve an existing schema, not make a new one. Why a generate API?
- The `suffix` parameter may be a date or sequence — what does "sequence" mean here? Why these types?
- Is the `*` default backwards-compatible with existing integration behavior that uses a `ss4o_*-*-*` format?

Answers:

- `generate` stands for generating the physical mapping from the logical schema.
- Retrieval is called using the `GET /schema/Observability/` API. If multiple versions exist, the result will be a dictionary of version:schema. This retrieves the logical structure only (`catalog.json`).
- The actual instances created from the logical schema (using the mapping) are fetched using the `/schema/{id}/instances` API.
- Do you think a dedicated `GET /schema/Observability/_mapping` API is needed?
- As you correctly mention, we currently assume that all components within a type (logs/traces) are updated within a single version (http, communication, cloud, ...).
- Suffixes might be a sequence number, as used by Data Prepper, for example, to partition data based on time; a date is used by Jaeger to achieve the same.
- `ss4o_*-*-*` is the most generic pattern that will catch all index names starting with these characters. It may be too wide for some users, so we allow the flexibility of setting the different naming parts.
- The schema structure is well defined within `catalog.json`, which details the entire inner schema hierarchy.
- Once an index/component template has been created, we can distinguish its source and version using the `_meta` details embedded within the mapping file itself.
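The multi-version retrieval behavior described above can be sketched as follows; the response shape and the field names inside each logical schema are illustrative assumptions, not the plugin's actual output:

```python
# Illustrative sketch (NOT the actual plugin response) of the shape the
# GET /schema/Observability/ retrieval API is described as returning when
# multiple versions exist: a dictionary of version -> logical schema
# (the catalog.json content). Field names here are assumed for illustration.
response = {
    "1.0": {"catalog": "observability", "categories": ["logs", "traces", "metrics"]},
    "1.1": {"catalog": "observability", "categories": ["logs", "traces", "metrics", "services"]},
}

# A client can then select a specific logical schema by version key:
latest = response[max(response)]
```
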
**Is your feature request related to a problem?**
A user would like to generate the physical representation of the schema as it is manifested in the database.
**What solution would you like?**
As part of the user-flow the domain users would like to generate the physical mapping for their schema representing the templates from which the indices (schema instances) would be created and used.
**API**

```
GET _plugins/catalog/schema/{id}/generate?version=1.0&prefix=ss4o&suffix=*
```

This `generate` API takes all the schematic components belonging to the schema `id` defined in the catalog repository and generates the appropriate physical mapping representation of the schema. The naming convention described below is used to automatically create the names for the index mapping templates according to their type.
**Possible parameters**

- `version` — mandatory semantic version of the schema
- `type` — optional schematic category type (or list of types) to generate (default is `_all`)
- `components` — optional schematic category component (or list of components) to generate (default is `_all`)
- `dataset` — optional field that can contain anything that classifies the source of the data, such as nginx (default is `*`)
- `namespace` — optional user-defined namespace, useful for grouping data by production grade, geographic classification, and so on (default is `*`)
- `prefix` — optional prefix name that identifies the context of the entire name (default is `ss4o`)
- `suffix` — optional suffix name that identifies the context of the specific index pattern; may be a date or sequence (default is `*`)

**Example 1:**
The following call

```
GET _plugins/catalog/schema/observability/generate?version=1.0&prefix=ss4o
```

will generate the following templates:

**Component Templates**

- `ss4o_1.0_http_component`
- `ss4o_1.0_container_component`
- `ss4o_1.0_cloud_component`
- `ss4o_1.0_communication_component`

**Index Templates**

- `ss4o_1.0_logs-*-*`
- `ss4o_1.0_traces-*-*`
- `ss4o_1.0_metrics-*-*`
- `ss4o_1.0_services-*-*`

**Example 2:**
The following call

```
GET _plugins/catalog/schema/observability/generate?version=1.0&prefix=ss4o&type=[traces,services]&namespace=us
```

will generate the following templates:

**Component Templates**

All the components that are needed to generate the types listed in the `type` parameter:

- `ss4o_1.0_http_component`
- `ss4o_1.0_container_component`
- `ss4o_1.0_cloud_component`
- `ss4o_1.0_communication_component`

**Index Templates**

- `ss4o_1.0_traces-*-us`
- `ss4o_1.0_services-*-us`

**Example 3:**
The following call

```
GET _plugins/catalog/schema/observability/generate?version=1.0&prefix=ss4o&type=logs&components=[http,communication]
```

will generate the following templates:

**Component Templates**

Only the components listed in the `components` parameter:

- `ss4o_1.0_http_component`
- `ss4o_1.0_communication_component`

**Index Templates**

- `ss4o_1.0_logs-*-*`

**Metadata embedded**
As part of the creation of the mapping template (both index and component), the schema metadata that specifies the logical schema origin and version is embedded into the mapping template structure under the `_meta` field. See the examples: http meta info.
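To make the embedding concrete, here is a hypothetical sketch of what such a `_meta` section might look like inside a generated component template; the exact field names are assumptions for illustration, not the plugin's actual output:

```python
# Hypothetical sketch of the schema metadata embedded under _meta in a
# generated component template mapping. The key names (schema, component,
# version, ...) are assumed here; only the placement under
# template.mappings._meta follows the description above.
component_template = {
    "template": {
        "mappings": {
            "_meta": {
                "schema": "observability",  # logical schema origin
                "type": "logs",             # schematic category type
                "component": "http",        # schematic component
                "version": "1.0",           # semantic version of the schema
            },
            "properties": {
                # ... generated physical field mappings ...
            },
        }
    }
}
```
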
**Naming-convention**

Index patterns will follow the next naming structure:

```
prefix_{version}_{type}-{dataset}-{namespace}_suffix
```

- `version` — mandatory semantic version of the schema (1.0 / 1.2)
- `type` — indicates the observability high-level type: "logs", "metrics", "traces" (taken from the category within the catalog)
- `dataset` — a field that can contain anything that classifies the source of the data, such as nginx (if none is specified, `*` will be used)
- `namespace` — a user-defined namespace, useful for grouping data by production grade, geographic classification, and so on
- `prefix` — optional prefix name that identifies the context of the entire name
- `suffix` — optional suffix name that identifies the context of the specific index pattern (may be a date or sequence)
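The naming structure above can be sketched as a small helper; `index_pattern` is a hypothetical function written for illustration, with defaults taken from the parameter list (`ss4o` prefix, `*` for dataset and namespace):

```python
# Minimal sketch of the naming convention
#   prefix_{version}_{type}-{dataset}-{namespace}_suffix
# described above. index_pattern is NOT a plugin API; it only illustrates
# how the template/index names in the examples are assembled.
def index_pattern(version, type_, dataset="*", namespace="*",
                  prefix="ss4o", suffix=None):
    name = f"{prefix}_{version}_{type_}-{dataset}-{namespace}"
    # The optional suffix (date or sequence) is appended only when given.
    return f"{name}_{suffix}" if suffix else name
```

For instance, `index_pattern("1.0", "logs")` yields the generic `ss4o_1.0_logs-*-*` pattern from Example 1, while `index_pattern("1.0", "traces", namespace="us")` yields the namespaced `ss4o_1.0_traces-*-us`.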
The `ss4o_1.0_{type}-{dataset}-{namespace}` pattern addresses the need to route similar information structures to different indices according to the customer's strategy. For example, a customer may want to route the nginx logs from two geographical areas into two different indices. This type of distinction also allows for cross-cutting queries, either by setting the index query pattern `ss4o_1.0_logs-nginx-*`, or by using a geography-based cross-cutting query such as `ss4o_1.0_logs-*-eu`.
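The cross-cutting selection can be sketched with glob matching; the concrete index names below (using the `ss4o` prefix) are made up for illustration:

```python
from fnmatch import fnmatch

# Hypothetical index names following the naming convention; invented
# purely to illustrate the cross-cutting query patterns above.
indices = [
    "ss4o_1.0_logs-nginx-eu",
    "ss4o_1.0_logs-nginx-us",
    "ss4o_1.0_logs-apache-eu",
]

# All nginx logs, regardless of geography (dataset fixed, namespace wildcard):
nginx_logs = [i for i in indices if fnmatch(i, "ss4o_1.0_logs-nginx-*")]

# All EU logs, regardless of dataset (dataset wildcard, namespace fixed):
eu_logs = [i for i in indices if fnmatch(i, "ss4o_1.0_logs-*-eu")]
```
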
**Do you have any additional context?**