elastic / elastic-package

elastic-package - Command line tool for developing Elastic Integrations

[TSDB] [Documentation] Annotate fields that are dimensions in the table showing Exported Fields in Integration #1111

Open agithomas opened 1 year ago

agithomas commented 1 year ago

For a user, knowing which fields are dimensions is important, for example to write queries involving aggregations.

image

Reference : https://github.com/elastic/integrations/tree/main/packages/oracle/docs

Adding another column would further limit the space available for descriptions. Instead, having the "Type" column show values such as "keyword, dimension" would help.

The request is to enhance the elastic-package tool to capture the dimension fields and include them when it creates or updates the auto-generated documentation.
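
As a rough illustration of the "keyword, dimension" idea (not the actual elastic-package helper), the rendering could look something like the sketch below. It assumes each field definition has already been loaded into a dict with `name`, `description`, `type` and an optional boolean `dimension` key; the function name and the example fields are hypothetical.

```python
def render_fields_table(fields):
    """Render a markdown table, appending ", dimension" to the type of dimension fields."""
    rows = ["| Field | Description | Type |", "|---|---|---|"]
    for field in fields:
        type_cell = field["type"]
        # Assumption: dimension fields carry a boolean "dimension" flag.
        if field.get("dimension", False):
            type_cell += ", dimension"
        rows.append(
            f"| {field['name']} | {field.get('description', '')} | {type_cell} |"
        )
    return "\n".join(rows)


# Hypothetical field definitions, for illustration only.
example_fields = [
    {
        "name": "example.instance.id",
        "description": "Instance identifier.",
        "type": "keyword",
        "dimension": True,
    },
    {
        "name": "example.sessions.active",
        "description": "Active sessions.",
        "type": "long",
    },
]

print(render_fields_table(example_fields))
```

This keeps the table at three columns, so the Description column does not lose any width.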

elasticmachine commented 1 year ago

Pinging @elastic/fleet (Team:Fleet)

mlunadia commented 1 year ago

@elastic/fleet can we prioritize this issue to help us create documentation that lets users easily identify TSDB dimensions, should they want to create new queries or migrate custom ones?

felixbarny commented 1 year ago

Related:

For a user, knowing which fields are dimensions is important, for example to write queries involving aggregations.

Could you elaborate on why users need to know which fields are dimensions when they want to do aggregations?

lalit-satapathy commented 1 year ago

For completeness, can we add both metric type and dimension columns? The field documentation of our packages is auto-generated, so we would prefer it to be complete and accurate from a TSDB perspective.

Screenshot 2023-05-25 at 7 17 21 PM

juliaElastic commented 1 year ago

@jlind23 @jsoriano FYI this request is coming from the TSDB project to document dimension fields in elastic-package.

Though the screenshots look like the Integrations UI in Kibana, are we saying to change that with a new column?

jsoriano commented 1 year ago

Though the screenshots look like the Integrations UI in Kibana, are we saying to change that with a new column?

I think that the integrations UI is only rendering the packages' READMEs, so this column would need to be added to these READMEs, which are generated by a helper in elastic-package (around here). Changing this helper will require to re-render all readmes of packages with dimensions (or metric types and so on if added too).

That said, I agree with Felix's question in https://github.com/elastic/elastic-package/issues/1111#issuecomment-1562836278; I would like to know more about why this needs to be exposed to users.

lalit-satapathy commented 1 year ago

Changing this helper will require to re-render all readmes of packages with dimensions (or metric types and so on if added too).

Is it feasible to change the rendering only if _index.mode is set to timeseries for a data stream otherwise keep the old format?

jsoriano commented 1 year ago

Changing this helper will require to re-render all readmes of packages with dimensions (or metric types and so on if added too).

Is it feasible to change the rendering only if _index.mode is set to timeseries for a data stream otherwise keep the old format?

Yes, this would be possible.
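
A minimal sketch of that conditional, assuming the data stream manifest has been parsed into a dict and that time series mode shows up as `elasticsearch.index_mode: time_series` (both the manifest shape and the function names here are assumptions for illustration, not the real helper):

```python
def is_time_series(manifest):
    """Return True when the parsed data stream manifest enables time series mode."""
    # Assumption: the setting lives under elasticsearch.index_mode.
    elasticsearch = manifest.get("elasticsearch") or {}
    return elasticsearch.get("index_mode") == "time_series"


def table_columns(manifest):
    """Pick the table columns: the extended set only for time series data streams."""
    columns = ["Field", "Description", "Type"]
    if is_time_series(manifest):
        columns += ["Metric Type", "Dimension"]
    return columns


print(table_columns({"elasticsearch": {"index_mode": "time_series"}}))
print(table_columns({"title": "a data stream without TSDB"}))
```

With this approach only the READMEs of packages that have time series data streams would change when they are re-rendered.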

agithomas commented 1 year ago

If two columns are added (metric_type and dimension indicators), would there be adequate width left for the "Description" column in the integrations UI? Especially if the plan is to write the values in plain text (gauge, counter, dimension field, etc.).

Currently, the description column displays around 5 words per line. Adding two more plain-text columns would further reduce the width of that column. The side effect is not only reduced readability, but also an increase in the amount of scrolling a user has to do to reach the end of the page.

Should a different rendering of the table or of the description column be considered? An alternative is to use an icon to indicate the metric_type and whether the field is a time_series_dimension field.

nimarezainia commented 1 year ago

@jsoriano what is your opinion on the final outcome? any chance you could attach an estimate to this. (thank you)

jsoriano commented 1 year ago

what is your opinion on the final outcome?

I still think that we need an answer for the question in https://github.com/elastic/elastic-package/issues/1111#issuecomment-1562836278. This is a very good point, because dimensions should be transparent for most users. Once this is better understood, we can decide if this actually needs to go to the documentation, and/or if we should somehow highlight dimensions in other UIs. It is the same with metric types, units and so on, they give information to the stack about how to use the data, so the user doesn't need to care about this.

I also agree with Agi on his comment https://github.com/elastic/elastic-package/issues/1111#issuecomment-1564120758, adding more columns to these tables would worsen their readability. I wouldn't use icons there because this is markdown intended to be rendered anywhere, and we would need to somehow include these icons. Maybe an option to save space is to list, under the table, the dimension fields.
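
As a sketch of that last option (listing the dimension fields under the table instead of adding a column), something along these lines could work. The function name and the example fields are hypothetical, and it assumes field definitions are dicts with a `name` and an optional boolean `dimension` key:

```python
def dimension_note(fields):
    """Build one line listing the dimension fields, or an empty string if there are none."""
    names = [field["name"] for field in fields if field.get("dimension")]
    if not names:
        return ""
    return "Dimension fields: " + ", ".join(f"`{name}`" for name in names)


# Hypothetical field definitions, for illustration only.
example_fields = [
    {"name": "example.instance.id", "type": "keyword", "dimension": True},
    {"name": "example.host.port", "type": "long", "dimension": True},
    {"name": "example.sessions.active", "type": "long"},
]

print(dimension_note(example_fields))
```

The existing table stays untouched, and the extra line is plain markdown, so it renders anywhere.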

any chance you could attach an estimate to this.

Estimated as medium. The change in elastic-package is probably small, but it will change the readmes of some packages in the integrations repository, and coordinating these kinds of changes usually takes some time.

nimarezainia commented 11 months ago

I still think that we need an answer for the question in #1111 (comment). This is a very good point, because dimensions should be transparent for most users. Once this is better understood, we can decide if this actually needs to go to the documentation, and/or if we should somehow highlight dimensions in other UIs. It is the same with metric types, units and so on, they give information to the stack about how to use the data, so the user doesn't need to care about this.

@agithomas could you please help us understand this a bit better? As Jaime and Felix have mentioned, dimensions should be transparent. We are trying to see how to prioritize this request and whether it is indeed high priority.

cc: @mlunadia @ruflin @andresrc for visibility.

agithomas commented 11 months ago

In the context of Observability metrics, dimensions have far more value, I think. Please refer here to AWS documentation on Dimensions. "Dimensions are categories that describe the characteristics of metrics. You can use dimensions to filter the results that CloudWatch returns."

AWS has great documentation explaining each field and its related dimensions. Example : Dynamo DB. Not all cloud providers have documentation as good as AWS's, but they do organise dimensions separately. Azure Example

During the TSDB enablement effort, we identified dimensions across all product metrics (cloud products, managed services, on-prem) and recorded them at the data stream / dataset level. When dimension information is easily accessible, users of Elastic Observability can determine metric characteristics and perform data filtering using the dimension information, similar to AWS metrics.

I believe there is a line of thinking, or an effort, to consider every keyword field a dimension field. I believe this comes from the observation that almost all keyword fields (except the common ECS dimension fields) became dimension fields. However, there is another category: non-keyword fields that have no metric_type mapping associated with them (for example process IDs, port numbers, IP addresses) also become dimension fields.

So, when it comes to organising the data in the UI, an approach that can be considered is to have separate tables representing

  1. Fields that have a metric_type mapping. This table does not need a dimension column.
  2. Fields that do not have a metric_type mapping. These will be keyword fields and fields representing IDs. This table has a dimension column, and does not need unit and metric_type columns.

By the end of TSDB enablement, it is expected that metric_type mapping of all datastreams is complete.

This assists users in having a separation of value fields from metadata fields (dimension fields) similar to how AWS and Azure documentation organises the data.
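
To make the two-table idea concrete, here is a small sketch under stated assumptions: field definitions are dicts, value fields are recognised by the presence of a `metric_type` key, and everything else goes to the metadata table. The function names, column titles and example fields are hypothetical.

```python
def render_table(columns, rows):
    """Render a list of rows as a markdown table with the given column titles."""
    lines = ["| " + " | ".join(columns) + " |", "|" + "---|" * len(columns)]
    lines += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(lines)


def render_split_tables(fields):
    """Render value fields and metadata fields as two separate tables."""
    # Assumption: a field is a value field if and only if it has a metric_type.
    metrics = [f for f in fields if "metric_type" in f]
    metadata = [f for f in fields if "metric_type" not in f]
    metrics_table = render_table(
        ["Field", "Type", "Unit", "Metric Type"],
        [[f["name"], f["type"], f.get("unit", ""), f["metric_type"]] for f in metrics],
    )
    metadata_table = render_table(
        ["Field", "Type", "Dimension"],
        [[f["name"], f["type"], "yes" if f.get("dimension") else ""] for f in metadata],
    )
    return metrics_table, metadata_table


# Hypothetical field definitions, for illustration only.
example_fields = [
    {"name": "example.sessions.active", "type": "long", "metric_type": "gauge"},
    {"name": "example.io.reads", "type": "long", "unit": "byte", "metric_type": "counter"},
    {"name": "example.instance.id", "type": "keyword", "dimension": True},
]

for table in render_split_tables(example_fields):
    print(table)
    print()
```

The Description column is left out only to keep the sketch short.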

agithomas commented 11 months ago

Adding @tommyers-elastic for visibility.

jsoriano commented 11 months ago

@agithomas yes, the great value of dimensions is clear.

What is still not so clear to me (following up on https://github.com/elastic/elastic-package/issues/1111#issuecomment-1562836278 and https://github.com/elastic/elastic-package/issues/1111#issuecomment-1571834644) is whether users need to know which fields are dimensions to leverage them. In principle this should be transparent for users, at least for most use cases.

Regarding the UI, your proposal of having different tables can be a good option. This way the table with metrics and dimensions information can be hidden by default, so users don't need to care about this except for advanced use cases.

jsoriano commented 11 months ago

I discussed with Agi on Slack why users need to know which fields are dimensions. The main point is that it would help users pick the proper fields when querying their data: fields that are known to uniquely identify resources, avoiding mistakes with fields that don't.

@felixbarny would this answer your question in https://github.com/elastic/elastic-package/issues/1111#issuecomment-1562836278?

felixbarny commented 11 months ago

Partially yes. I think that in the future, we won't be explicitly hand-picking which fields are dimensions and which aren't but instead rely on https://github.com/elastic/elasticsearch/issues/98384 so that everything that's not a metric will be considered a dimension by default.

Still, being able to tell dimensions apart from metrics will be important in the future. I don't think that the integration documentation will be the most critical place users will rely on to look that up. Instead, the UI should be able to highlight the fields that are dimensions, which it already does as of 8.10: https://github.com/elastic/elasticsearch/issues/98384.

It doesn't hurt to document dimension fields in the integrations, though. However, I don't think this is a high priority task.

agithomas commented 11 months ago

If the UI helps users identify dimension fields, that gives them the best experience. With the UI indicating dimension fields, this is not a high priority task.

Thanks @felixbarny, @jsoriano for sharing your inputs.