Open e-emoto opened 2 months ago
Here are some more example use cases of the APIs:
GET /_all/_tiering?state=active
{
  "test1": {
    "source": "hot",
    "target": "warm",
    "state": "active",
    "duration": "10:00:00"
  },
  "test2": {
    "source": "warm",
    "target": "hot",
    "state": "active",
    "duration": "01:00:00"
  },
  ...
}
/_cat/tiering?state=active
test1 | hot | warm | active | 10:00:00
test2 | warm | hot | active | 01:00:00
/_cat/tiering?state=active&v=true
index | source | target | state | duration
test1 | hot | warm | active | 10:00:00
test2 | warm | hot | active | 01:00:00
GET /_all/_tiering?source=hot&target=warm&state=failed
{
  "test3": {
    "source": "hot",
    "target": "warm",
    "state": "failed",
    "duration": "11:00:00"
  },
  "test4": {
    "source": "hot",
    "target": "warm",
    "state": "failed",
    "duration": "20:00:00"
  },
  ...
}
/_cat/tiering?source=hot&target=warm&state=failed
test3 | hot | warm | failed | 11:00:00
test4 | hot | warm | failed | 20:00:00
/_cat/tiering?source=hot&target=warm&state=failed&v=true
index | source | target | state | duration
test3 | hot | warm | failed | 11:00:00
test4 | hot | warm | failed | 20:00:00
GET /target_index/_tiering?detailed=true
{
  "target_index": {
    "source": "hot",
    "target": "warm",
    "state": "active",
    "start_time": "2024-06-27T00:00:00Z",
    "duration": "10:00:00",
    "shards": {
      "total": 10,
      "successful": 4,
      "failed": 0,
      "active": 6
    }
  }
}
/_cat/tiering?index=target_index&h=index,source,target,state,start_time,duration,shards_total,shards_successful,shards_active,shards_failed
target_index | hot | warm | active | 2024-06-27T00:00:00Z | 10:00:00 | 10 | 4 | 6 | 0
/_cat/tiering?index=target_index&h=index,source,target,state,start_time,duration,shards_total,shards_successful,shards_active,shards_failed&v=true
index | source | target | state | start_time | duration | shards_total | shards_successful | shards_active | shards_failed
target_index | hot | warm | active | 2024-06-27T00:00:00Z | 10:00:00 | 10 | 4 | 6 | 0
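For anyone scripting against the tabular output, here is a minimal sketch of turning it into records. It assumes the ` | ` separator used in the examples above (the real `_cat` output may be whitespace-aligned instead), and `parse_cat_tiering` is a hypothetical helper name, not part of the proposal.

```python
def parse_cat_tiering(text, columns=None):
    """Parse pipe-separated _cat/tiering output into a list of dicts.

    If `columns` is None, the first non-empty line is treated as the
    header row, i.e. the request is assumed to have used v=true.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    if columns is None:
        header, lines = lines[0], lines[1:]
        columns = [cell.strip() for cell in header.split("|")]
    rows = []
    for line in lines:
        cells = [cell.strip() for cell in line.split("|")]
        rows.append(dict(zip(columns, cells)))
    return rows


sample = """index | source | target | state | duration
test1 | hot | warm | active | 10:00:00
test2 | warm | hot | active | 01:00:00"""

rows = parse_cat_tiering(sample)
```

Without `v=true` the caller has to pass the column names explicitly, since the response then carries no header row.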
Thanks @e-emoto for sharing the proposal. It looks good overall. Just a few minor comments:
- For `state = failed / active`, does `ongoing` or `in_progress` make more sense than `active`?
- Also, can we use `verbose` as query parameter instead of `detailed`?
- Also, can we provide `start_time` without detailed/verbose flag? This way verbose/detailed would mean shard level details of tiering.
- Why do we need `_cat` api to get ongoing tierings? Looks like both APIs are gonna provide same information.
- Also should it be `Get Tiering Metadata from Cluster State` as we will be using the metadata stored by tiering service?

- We have an inconsistent use of `detailed` vs. `explain`, https://github.com/search?q=repo%3Aopensearch-project%2Fopensearch-api-specification%20detailed&type=code, I am not sure what's right. Please don't add a `verbose` unless it's in the `cat` api which has `verbose`, https://github.com/search?q=repo%3Aopensearch-project%2Fopensearch-api-specification+verbose&type=code

Thanks for the comments @harishbhakuni
- `state = failed / active`, does `ongoing` or `in_progress` make more sense than `active`?

We discussed it and decided that `active` is clearer than `ongoing` because it also covers pending states.

- Also, can we use `verbose` as query parameter instead of `detailed`?

We decided to use `detailed` to make it consistent with other APIs.

- Also, can we provide `start_time` without detailed/verbose flag? this way verbose/detailed would mean shard level details of tiering.

We're trying to keep the non-detailed response simple, so I don't know if we need to include the start time, since that can be gauged from the duration.

- Why do we need `_cat` api to get ongoing tierings? looks like both APIs are gonna provide same information.

The `_cat` API provides the information in a tabular format, which could be easier to read in some cases.

- Also should it be `Get Tiering Metadata from Cluster State` as we will be using the metadata stored by tiering service?

I think this is a good suggestion, I'll update the name.
Thanks for your response @dblock

- Check whether states should be lowercase or uppercase, we're all over the place, https://github.com/search?q=repo%3Aopensearch-project%2Fopensearch-api-specification%20enum&type=code.

I'll check what we're doing for tiering states and try to make it consistent with that.

- We have times and durations with units, please be consistent, https://github.com/search?q=repo%3Aopensearch-project%2Fopensearch-api-specification%20_time&type=code

I think this is a good point, we can change it to `duration_in_millis` to make it consistent with other time measurements.

- We have an inconsistent use of `detailed` vs. `explain`, https://github.com/search?q=repo%3Aopensearch-project%2Fopensearch-api-specification%20detailed&type=code, I am not sure what's right. Please don't add a `verbose` unless it's in the `cat` api which has `verbose`, https://github.com/search?q=repo%3Aopensearch-project%2Fopensearch-api-specification+verbose&type=code

We're using `detailed` for the GET API, and `v` as the parameter for the `_cat` API.

- Please review the other fields and attempt some consistency :)

I'll review the other fields too.
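To make the `duration_in_millis` change concrete, here is a small sketch of how the `HH:MM:SS` strings in the earlier examples would map to milliseconds. `duration_to_millis` is a hypothetical helper for illustration only; the actual field would presumably be computed from timestamps on the server side.

```python
def duration_to_millis(duration):
    """Convert an 'HH:MM:SS' duration string to integer milliseconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000


# Durations taken from the example responses earlier in this thread.
ten_hours = duration_to_millis("10:00:00")
one_hour = duration_to_millis("01:00:00")
```

With this mapping, `"duration": "10:00:00"` would become `"duration_in_millis": 36000000`.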
We're using detailed for the GET API, and v as the parameter for the _cat API
This sounds confusing to me. Let's see if I am on the same page here.
According to this proposal, the `detailed` parameter in the REST API will include more fields (and/or nested objects) in the returned JSON response. This is similar to the REST API for cluster health, which accepts a `level` parameter. For example, http://localhost:9200/_cluster/health?level=indices will include an `indices` object with a more detailed breakdown of individual indices and their health status.
However, in the `_cat` API the `v` parameter has a completely different role. It does not change the number of columns included in the response. It adds a header row.
Perhaps the short sentence is just missing some context, which is why I am confused :-)
One more detail: the proposal discusses two cases of a node receiving the REST API request: a) the node is a cluster manager, or b) the node is a data node. I think it is just a detail, but if the receiving node is neither of these, for example if it is just a `search`-ing node AND the `_local` option is used, then the response should not include any indices, right?
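To illustrate the distinction I mean, here is a rough sketch under my own assumptions (the function names are made up, and the data is copied from the `target_index` example above): `detailed` changes which fields the JSON response contains, while `v` only decides whether a header row is printed.

```python
def get_response(tiering, detailed=False):
    """JSON-style body: `detailed` adds fields to the response."""
    body = {key: tiering[key] for key in ("source", "target", "state", "duration")}
    if detailed:
        body["start_time"] = tiering["start_time"]
        body["shards"] = tiering["shards"]
    return body


def cat_response(rows, columns, v=False):
    """Tabular body: `v` adds a header row, the columns stay the same."""
    lines = []
    if v:
        lines.append(" | ".join(columns))
    for row in rows:
        lines.append(" | ".join(str(row[column]) for column in columns))
    return "\n".join(lines)


tiering = {
    "index": "target_index",
    "source": "hot",
    "target": "warm",
    "state": "active",
    "start_time": "2024-06-27T00:00:00Z",
    "duration": "10:00:00",
    "shards": {"total": 10, "successful": 4, "failed": 0, "active": 6},
}
columns = ["index", "source", "target", "state", "duration"]
plain = cat_response([tiering], columns)
with_header = cat_response([tiering], columns, v=True)
```

In the `_cat` case, the parameter that plays a role comparable to `detailed` is `h`, since that is what changes the set of columns.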
Is your feature request related to a problem? Please describe
The Status API in Tiering will be for listing the in-progress and failed index tierings. Since the Tiering project is still developing, the API should be extensible to cover new cases such as dedicated and non-dedicated warm node clusters. The design explanations here focus on the hot to warm case, but take the future use of the API into consideration.
Describe the solution you'd like
API Models:
The API will use a source and target as input to filter which tierings are shown. It will validate that both inputs are valid tiers, and then use them to find any tierings that match the described type. The API should still work if only one of the source or target is given, and will find any tierings with that input, allowing for more flexible queries. In the default case, if no source or target is given as input, the status API should return all in-progress or failed tierings for the specified indices, regardless of the tiering change. There will be two APIs for status: a `GET` API and a `_cat` API.
GET API:
API Request:
GET /<indexNameOrPattern>/_tiering?source=hot&target=warm
GET /<indexNameOrPattern>/_tiering?state=active
GET /<indexNameOrPattern>/_tiering?detailed=true/false
The `GET` API would have a few parameters. The index name in the path will be required, but can support using `_all` or `*` to get migrations from all indices that match the parameters. The API will also support comma separated index names.
API Parameters:
- `source = hot / warm` (optional, no default)
- `target = hot / warm` (optional, no default)
  The values for the `source` and `target` parameters are the tiers, with `source` being the tier the index started in and `target` being the tier it is moving to.
- `state = failed / active` (optional, no default)
  The values of the `state` parameter represent the state of the tiering. `failed` indicates that the tiering has failed and `active` means the tiering process is in progress.
- `detailed = true / false` (default false)
  The `detailed` parameter determines whether the `GET` API response should include details like the shard relocation status and tiering start time.
- `local = true / false` (optional, default false)
  The `local` parameter determines where the request retrieves information from. If true, the information comes from a data node; if false, it comes from the master node.
API Response:
Success:
Failure:
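The validation and filtering behaviour described for the `source`, `target`, and `state` parameters can be sketched roughly as follows. This is an illustration in Python rather than the actual implementation; `VALID_TIERS`, `VALID_STATES`, `validate_params`, and `filter_tierings` are hypothetical names, and the in-memory dictionary stands in for whatever the tiering service actually stores.

```python
VALID_TIERS = {"hot", "warm"}
VALID_STATES = {"active", "failed"}


def validate_params(source=None, target=None, state=None):
    """Reject values that are not known tiers or states; None means 'not given'."""
    for name, value in (("source", source), ("target", target)):
        if value is not None and value not in VALID_TIERS:
            raise ValueError(f"invalid {name} tier: {value!r}")
    if state is not None and state not in VALID_STATES:
        raise ValueError(f"invalid state: {state!r}")


def filter_tierings(tierings, source=None, target=None, state=None):
    """Return the tierings matching every parameter that was supplied."""
    validate_params(source, target, state)
    wanted = {"source": source, "target": target, "state": state}
    return {
        index: info
        for index, info in tierings.items()
        if all(value is None or info[key] == value for key, value in wanted.items())
    }


tierings = {
    "test1": {"source": "hot", "target": "warm", "state": "active"},
    "test2": {"source": "warm", "target": "hot", "state": "active"},
    "test3": {"source": "hot", "target": "warm", "state": "failed"},
}
only_failed_hot_to_warm = filter_tierings(tierings, source="hot", target="warm", state="failed")
```

Leaving every parameter unset returns all in-progress and failed tierings, which matches the default case described above.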
_cat API:
API Request:
/_cat/tiering?source=hot&target=warm
/_cat/tiering?state=active
The `_cat` API would have some of the same parameters as `GET`, but would also have additional parameters for formatting and filtering the response columns.
API Parameters:
- `source = hot / warm` (optional, no default)
- `target = hot / warm` (optional, no default)
  The values for the `source` and `target` parameters are the tiers, with `source` being the tier the index started in and `target` being the tier it is moving to.
- `state = failed / active` (optional, no default)
  The values of the `state` parameter represent the state of the tiering. `failed` indicates that the tiering has failed and `active` means the tiering process is in progress.
- `index = index1,index2,...` (optional, default _all)
  The `index` parameter is a comma separated list of index names used to filter the responses.
- `h = index,source,target,state,start_time,failure_time,duration,shards_total,shards_successful,shards_active,shards_failed` (optional, no default)
  The `h` parameter is only for the `_cat` API, and it would be used to filter which columns are shown in the response. If this parameter is not passed to the API call, then it will show just the `index`, `source`, `target`, `state`, and `duration` columns by default.
  - `index` - The index name
  - `source` - The tier the index starts in
  - `target` - The tier the index is moving to
  - `state` - The current state of the tiering
  - `start_time` - The timestamp of when the tiering started
  - `duration` - The duration of the tiering, if it failed
  - `shards_total` - The total number of shards
  - `shards_successful` - The number of shards that succeeded tiering
  - `shards_active` - The number of shards where tiering is still ongoing
  - `shards_failed` - The number of shards that failed tiering
- `v = true / false` (optional, default false)
  If the `v` parameter is true, the response will include the column labels as the first row of the response.
- `s = index,source,target,state,start_time,...` (optional, no default)
  The `s` parameter is a comma separated list of column names used to sort the rows in the response.
API Response:
Success:
Failure:
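As a rough illustration of how the `h` and `s` parameters described above could shape the `_cat` response, here is a minimal Python sketch. `DEFAULT_COLUMNS` mirrors the default columns listed for `h`; `cat_rows` is a hypothetical helper, and sorting values as plain strings is an assumption made for simplicity.

```python
DEFAULT_COLUMNS = ["index", "source", "target", "state", "duration"]


def cat_rows(tierings, h=None, s=None):
    """Select columns (h) and sort rows (s) for a _cat-style table.

    Returns the column names and the rows as lists of strings.
    """
    columns = h.split(",") if h else DEFAULT_COLUMNS
    sort_keys = s.split(",") if s else []
    # With no sort keys every row gets the same key, so the stable sort
    # leaves the input order unchanged.
    ordered = sorted(
        tierings,
        key=lambda tiering: tuple(str(tiering.get(key, "")) for key in sort_keys),
    )
    rows = [[str(tiering.get(column, "")) for column in columns] for tiering in ordered]
    return columns, rows


tierings = [
    {"index": "test4", "source": "hot", "target": "warm", "state": "failed", "duration": "20:00:00"},
    {"index": "test3", "source": "hot", "target": "warm", "state": "failed", "duration": "11:00:00"},
]
default_columns, unsorted_rows = cat_rows(tierings)
picked_columns, sorted_rows = cat_rows(tierings, h="index,state", s="index")
```

The `v` parameter would then only decide whether the returned column names are printed as a first row.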
Design: Get Tiering Metadata from Cluster State
Since both the status `GET` and `_cat` APIs contain mostly the same information, and differ only in how they present it and how the customer interacts with them, they can both evaluate the status and retrieve the information using the same design.
In this design, the tiering service would store some tiering metadata in the cluster state, and then when the status API is called it would use the tiering metadata to create its response. The migration status is stored in the index settings by the tiering service, while other information like the tiering start time is stored in the index metadata. The status API can use this information from the index settings and metadata to evaluate the tiering status when it is called. Since this information is in the cluster state, it would be relatively fast for the status API to access it. Also, because the cluster state is available from the master node and data nodes, the status API would be able to be called on either type of node.
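A minimal sketch of that lookup, written in Python for illustration only. The shape of `index_entry` and the key names (`tiering.source`, `tiering.target`, `tiering.state`, `tiering_start_time`) are assumptions made up for this example and are not the actual setting or metadata names used by the tiering service.

```python
from datetime import datetime, timedelta, timezone


def tiering_status(index_entry, now):
    """Build a status entry from (hypothetical) index settings and metadata."""
    settings = index_entry["settings"]
    metadata = index_entry["metadata"]
    start = datetime.fromisoformat(metadata["tiering_start_time"])
    return {
        "source": settings["tiering.source"],
        "target": settings["tiering.target"],
        "state": settings["tiering.state"],
        "start_time": metadata["tiering_start_time"],
        # Integer division by a 1 ms timedelta yields whole milliseconds.
        "duration_in_millis": (now - start) // timedelta(milliseconds=1),
    }


# Stand-in for the part of the cluster state that describes one index.
index_entry = {
    "settings": {"tiering.source": "hot", "tiering.target": "warm", "tiering.state": "active"},
    "metadata": {"tiering_start_time": "2024-06-27T00:00:00+00:00"},
}
status = tiering_status(index_entry, now=datetime(2024, 6, 27, 10, 0, 0, tzinfo=timezone.utc))
```

Because everything needed is read from one in-memory structure, no calls to other nodes are required for the non-detailed response.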
In the dedicated warm node setup, we could also use the cluster state to check the shard status and determine the tiering progress. However, for the non-dedicated warm node setup, we would need to find another way to check the tiering progress. We could do so by communicating with other nodes through the transport layer to use a service on the data nodes that checks if shards are complete, in-progress, or failed when the status API is called. Then we could use that shard information to fill out the shard details in the `detailed` response.
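Once the per-shard results are collected, turning them into the `shards` object of the response is a simple aggregation. The sketch below assumes each shard reports one of the strings `complete`, `in-progress`, or `failed`; those labels and the `summarize_shards` name are illustrative, not the actual transport response format.

```python
from collections import Counter


def summarize_shards(shard_states):
    """Aggregate a list of per-shard states into the response's shard counts."""
    counts = Counter(shard_states)
    return {
        "total": len(shard_states),
        "successful": counts["complete"],
        "failed": counts["failed"],
        "active": counts["in-progress"],
    }


# Ten shards: four finished tiering, six still relocating.
shards = summarize_shards(["complete"] * 4 + ["in-progress"] * 6)
```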
Another option that was considered for shard relocation status in the non-dedicated setup was storing the shard level data locality in the tiering metadata. However, this would require frequent cluster state updates to refresh the values of these fields. This would be very costly when accounting for all the shards across all indices that have ongoing tiering.
Order of Operations:
`TIERED_REMOTE_INDEX` is enabled
`detailed`:
Design: Get Tiering Metadata from Cluster State
Pros:
Cons:
Related component
Search:Remote Search
Describe alternatives you've considered
No response
Additional context
Related issues: https://github.com/opensearch-project/OpenSearch/issues/14640 https://github.com/opensearch-project/OpenSearch/issues/14679 https://github.com/opensearch-project/OpenSearch/issues/13294