Closed karthikeyan21 closed 3 days ago
Related to https://github.com/opensearch-project/index-management/pull/1040
@bowenlan-amzn @ikibo Could you please check?
Yes, I think this is a miss and causes a breaking change. Regarding this https://github.com/opensearch-project/index-management/pull/1040#discussion_r1401982311 , if the user doesn't pass in start_time, schedule.startTime will be null, which causes the exception when instantiating the IntervalSchedule.
The solution is to add schedule.startTime ?: Instant.now() back.
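A minimal Java sketch of the null-default described above. The class and method names are hypothetical and only illustrate the idea; this is not the plugin's actual code. The ternary in startTimeOrNow plays the role of the Kotlin expression schedule.startTime ?: Instant.now().

```java
import java.time.Instant;

class StartTimeDefault {
    // Hypothetical stand-in for the schedule arithmetic: calling plusMillis on a
    // null Instant is what produces the NullPointerException reported in this issue.
    static Instant nextExecution(Instant startTime, long intervalMillis) {
        return startTime.plusMillis(intervalMillis);
    }

    // Fall back to the current time when start_time was omitted from the request.
    static Instant startTimeOrNow(Instant startTime) {
        return startTime != null ? startTime : Instant.now();
    }

    public static void main(String[] args) {
        Instant missing = null; // simulates a request without start_time

        try {
            nextExecution(missing, 60_000L);
        } catch (NullPointerException e) {
            System.out.println("without default: NullPointerException");
        }

        Instant next = nextExecution(startTimeOrNow(missing), 60_000L);
        System.out.println("with default: next run at " + next);
    }
}
```

With the fallback in place, a request that omits start_time behaves as if the job had been scheduled starting at creation time.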
@bowenlan-amzn so it looks like schedule.startTime is not a required field. What do you think, should it be a required field?
@mgodwan, thank you for this finding.
Good point, @bowenlan-amzn: the case when start_time is not defined in the request should have been handled.
But the question is how?
According to the official rollup-api-doc, schedule.interval.start_time is a required field (@sarthakaggarwal97 FYI).
@bowenlan-amzn please help me understand what would be the best way to handle this issue when start_time is not defined (according to the doc).
@bowenlan-amzn the same issue exists for the Transform job (the fix should be pretty much the same as for the rollup). I think we can handle both under this ticket. Please assign this issue to me.
@ikibo Thanks! The goal here is to not introduce a breaking change. I think the documentation is wrong: start_time is clearly not a required field. As the example provided in this issue shows, with a schedule like this
"schedule": {
  "interval": {
    "period": 60,
    "unit": "Minutes"
  }
},
a rollup could be created before, and the start time defaulted to the current time. So please go with the second path
handling the null check as you suggest (in this case, I would suggest changing the doc to state that start_time is set to the current time if not set explicitly in the request, making it 'kind-of' not mandatory)
also link the transform change #1040
The workaround I used was replacing schedule.interval with schedule.cron. But I miss schedule.interval a lot.
@louzadod this has been fixed in 2.14
Hi, @bowenlan-amzn. Right after migrating from 2.11 to 2.14, my rollup jobs configured with schedule.interval stopped running. After replacing schedule.interval with schedule.cron, they started running again.
@louzadod it's probably not the same issue. Do you want to report a bug with the error you saw and some steps to reproduce it?
@bowenlan-amzn I'm getting the same error as reported in this bug and I'm running version 2.14.0.
GET /
{
  "name": "logs-corporativos-client-2",
  "cluster_name": "logs-corporativos",
  "cluster_uuid": "rgFOp61cTRKts3oqa4dAwA",
  "version": {
    "distribution": "opensearch",
    "number": "2.14.0",
    "build_type": "tar",
    "build_hash": "aaa555453f4713d652b52436874e11ba258d8f03",
    "build_date": "2024-05-09T18:51:00.973564994Z",
    "build_snapshot": false,
    "lucene_version": "9.10.0",
    "minimum_wire_compatibility_version": "7.10.0",
    "minimum_index_compatibility_version": "7.0.0"
  },
  "tagline": "The OpenSearch Project: https://opensearch.org/"
}
Here is my rollup definition:
{
  "rollup": {
    "rollup_id": "vulner-history-job",
    "enabled": true,
    "schedule": {
      "interval": {
        "period": 1,
        "unit": "Minutes"
      }
    },
    "enabled_time": null,
    "description": "Rollup job para sumarizar diariamente as vulnerabilidades",
    "schema_version": 16,
    "source_index": "vulnerabilities",
    "target_index": "vulner-history",
    "page_size": 1000,
    "delay": 0,
    "continuous": false,
    "dimensions": [
      {
        "date_histogram": {
          "fixed_interval": "1d",
          "source_field": "timestamp",
          "target_field": "timestamp",
          "timezone": "America/Sao_Paulo"
        }
      },
      {
        "terms": {
          "source_field": "severity",
          "target_field": "severity"
        }
      },
      {
        "terms": {
          "source_field": "stack_prefix",
          "target_field": "stack_prefix"
        }
      },
      {
        "terms": {
          "source_field": "stack",
          "target_field": "stack"
        }
      },
      {
        "terms": {
          "source_field": "service",
          "target_field": "service"
        }
      }
    ],
    "metrics": [
      {
        "source_field": "event_count",
        "metrics": [
          {
            "sum": {}
          }
        ]
      }
    ]
  }
}
After invoking the API for creating the rollup, here is the message:
{"error":{"root_cause":[{"type":"null_pointer_exception","reason":"Cannot invoke \"java.time.Instant.plusMillis(long)\" because \"startTime\" is null"}],"type":"null_pointer_exception","reason":"Cannot invoke \"java.time.Instant.plusMillis(long)\" because \"startTime\" is null"},"status":500}
@louzadod just did a quick check. 2.14 didn't pick up this fix, it's in 2.15
ok. thanks for the confirmation, @bowenlan-amzn .
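For versions that do not include the fix, the diagnosis in this thread (the failure occurs only when start_time is omitted) suggests that sending start_time explicitly may avoid the error. A minimal Java sketch that builds just the schedule fragment of the request follows. The class and method names are hypothetical, and it is an assumption here that start_time is expressed in epoch milliseconds; check the rollup API documentation for your version before relying on it.

```java
import java.time.Instant;
import java.util.Locale;

class IntervalScheduleBody {
    // Builds only the "schedule" fragment of a rollup request, with start_time
    // set explicitly instead of being left out.
    // Assumption: start_time is given in epoch milliseconds.
    static String scheduleJson(int period, String unit, Instant startTime) {
        return String.format(
            Locale.ROOT,
            "\"schedule\":{\"interval\":{\"start_time\":%d,\"period\":%d,\"unit\":\"%s\"}}",
            startTime.toEpochMilli(), period, unit);
    }

    public static void main(String[] args) {
        // Example: a one-minute interval starting now.
        System.out.println(scheduleJson(1, "Minutes", Instant.now()));
    }
}
```

The printed fragment would replace the "schedule" entry in a rollup definition like the one above.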
What is the bug? RollUp Job creation fails with a 500 error code in OpenSearch 2.12
Error Message :
{"error":{"root_cause":[{"type":"null_pointer_exception","reason":"Cannot invoke \"java.time.Instant.plusMillis(long)\" because \"startTime\" is null"}],"type":"null_pointer_exception","reason":"Cannot invoke \"java.time.Instant.plusMillis(long)\" because \"startTime\" is null"},"status":500}
How can one reproduce the bug? Steps to reproduce the behavior:
Use the below API to create a new RollUp Job - Create RollUp Job Sample -
curl -X PUT localhost:9200/_plugins/_rollup/jobs/test -H 'Content-Type:application/json' -d '{"rollup":{"target_index":"rollup_hourly_fmstats_test","description":"Hourly Stats Rollup","source_index":"test_*","enabled":true,"schedule":{"interval":{"period":60,"unit":"Minutes"}},"delay":0,"continuous":"true","metrics":[{"source_field":"abc.accepted","metrics":[{"max":{}}]},{"source_field":"abc.rejected","metrics":[{"max":{}}]},{"source_field":"abc.matched","metrics":[{"max":{}}]}],"page_size":5000,"dimensions":[{"date_histogram":{"fixed_interval":"60m","source_field":"timestamp"}},{"terms":{"source_field":"name"}}]}}'
RollUp Job creation fails with error (500)
What is the expected behavior? RollUp Job to be created and data to be rolled up
What is your host/environment?
Do you have any screenshots? NA
Do you have any additional context? I was debugging the code and noticed that we have not initialised the Schedule start time. Modifying the code to use Instant.now() instead of schedule.startTime fixed the issue.
Update - This doesn't affect existing RollUp Jobs. Any job created using an earlier version (2.10) seems to be working, as the time is initialised.