Open Cobraeti opened 1 year ago
Hello, I also reproduced this behavior with the following chart versions (+ adding the corresponding appVersion for each):
Hello, here are some new elements... I also checked through the transform job creation web UI (OpenSearch Dashboards > Index Management > Transform Jobs) for all the above versions, and here are my results:
`@timestamp` field and only aggregation is by count
I'm really surprised this feature is not supported, as it is supposed to be available according to the documentation of all those versions:
They all state the same:
| Option | Data Type | Description | Required |
| --- | --- | --- | --- |
| groups | Array | Specifies the grouping(s) to use in the transform job. Supported groups are `terms`, `histogram`, and `date_histogram`. For more information, see Bucket Aggregations. | Yes if not using aggregations. |
Hello, it seems the issue is more about how the parameters are documented and/or how errors are reported, since removing all parameters produces a more useful error:
PUT _plugins/_transform/agregate_level_1
{
  "transform": {
    "enabled": true,
    "continuous": false,
    "schedule": {
      "interval": {
        "period": 1,
        "unit": "Minutes",
        "start_time": 1602100553
      }
    },
    "description": "Agregate data on 5min buckets",
    "source_index": "opensearch_dashboards_sample_data_logs",
    "target_index": "transform_level_1",
    "data_selection_query": {
      "match_all": {}
    },
    "page_size": 1,
    "groups": [
      {
        "date_histogram": {}
      }
    ],
    "aggregations": {
      "data_transfer": {
        "sum": {
          "field": "bytes"
        }
      }
    }
  }
}
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Source field must not be null"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Source field must not be null"
  },
  "status" : 400
}
It seems there was an attempt to document this, but since it sits at the same level as `groups` and `aggregations`, this line never read (at least to me) as the expected replacement for the `field` parameter of `terms`, `histogram`, or `date_histogram`:
| Option | Data Type | Description | Required |
| --- | --- | --- | --- |
| source_field | String | The field(s) to transform. | Yes |
Maybe it would be more useful and obvious if it were described as an extra line in the `groups` description...
The `target_field` and its behavior (if not specified, it equals `source_field`, from what I was able to test) should also be documented in the same place...
That said, the `date_histogram` grouping is still not really behaving as expected, as `@timestamp` is not allowed as `source_field`:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "status_exception",
        "reason" : "Cannot find field [@timestamp] that can be grouped as [date_histogram] in [opensearch_dashboards_sample_data_logs]."
      }
    ],
    "type" : "status_exception",
    "reason" : "Cannot find field [@timestamp] that can be grouped as [date_histogram] in [opensearch_dashboards_sample_data_logs]."
  },
  "status" : 400
}
And using the other date fields found in the sample web logs data (`timestamp` or `utc_time`) only leads to target fields recognized as `number`, so they are not available as a timestamp field when creating an index pattern for visualization in OpenSearch Dashboards, which is not expected...
The official Elasticsearch implementation is not only far more user-friendly (fewer alterations required to build the transform job creation request), but also produces the expected `target_field` type, which is obviously `date` when you aggregate dates... maybe a date range would be another acceptable type, but not a number...
Hello, any news about `date_histogram` not allowing `@timestamp` as a source field? To me this remains a bug in `date_histogram` when used in transform jobs...
The documentation issue was the first wall I hit when trying to use it, but the key issue here is `date_histogram` not being able to use a date field as its source... Kind of sad given the name :sweat_smile:
This is the supported format for date_histogram in transform job:
"date_histogram": { "source_field": "timestamp", "calendar_interval": "minute" }
Hello @kvitali, thanks for pointing out the correct format, though I won't be able to confirm it works, as we have had to move to Elasticsearch since then... I guess this should be either:
`field` becomes `source_field` when used within the Transform API
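For anyone landing here later, this is a sketch of what the request body from earlier in this thread might look like with the format quoted above applied, sent as `PUT _plugins/_transform/agregate_level_1`. It has not been verified against a cluster. Everything is copied from the original request except the `date_histogram` group, which now uses the `source_field` / `calendar_interval` form, and the description string, which is adjusted here to match the 1-minute interval:

```json
{
  "transform": {
    "enabled": true,
    "continuous": false,
    "schedule": {
      "interval": {
        "period": 1,
        "unit": "Minutes",
        "start_time": 1602100553
      }
    },
    "description": "Aggregate data on 1-minute buckets",
    "source_index": "opensearch_dashboards_sample_data_logs",
    "target_index": "transform_level_1",
    "data_selection_query": {
      "match_all": {}
    },
    "page_size": 1,
    "groups": [
      {
        "date_histogram": {
          "source_field": "timestamp",
          "calendar_interval": "minute"
        }
      }
    ],
    "aggregations": {
      "data_transfer": {
        "sum": {
          "field": "bytes"
        }
      }
    }
  }
}
```

Whether 5-minute buckets can be expressed this way is not confirmed anywhere in this thread, and the earlier comments suggest the resulting target field may still end up recognized as a number rather than a date.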
Describe the bug
When trying to create a transform job on the "Sample web logs" index to aggregate into 5-minute buckets, I get the following response, even though the documentation states that `date_histogram` is available and has a "field" field:
To Reproduce
Steps to reproduce the behavior (on OpenSearch Dashboards):
Note: the result is the same with `"field": "timestamp"` instead of `"field": "@timestamp"` (both objects exist and have the same type/value)
Expected behavior
The transform job is created and produces a new index with 5-minute buckets
Plugins
Screenshots NC
Host/Environment:
Additional context
In the end I would even like to aggregate on more fields to get the following groups, if that is allowed:
I can't upgrade much beyond 1.3.10 for now; I'm bound to it by my service provider... :disappointed: