opensearch-project / ml-commons

ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related features within OpenSearch.
Apache License 2.0
99 stars, 137 forks

[FEATURE] Stats REST API to get connector invocation metrics #2476

Open chishui opened 6 months ago

chishui commented 6 months ago

Is your feature request related to a problem? Today, the ML stats API only reports a total connector count; it doesn't report detailed connector invocation metrics. The algorithm stats can't represent connector usage either, because for a single action, e.g. _predict, we may split the docs into chunks, invoke the connector multiple times, and send multiple requests to the remote server.

These connector metrics would give users a view of the count and success rate of connector invocations. They would also give us a way to write integration or end-to-end tests that verify we invoke the connector correctly.

What solution would you like? We can support new REST APIs:

  1. /_plugins/_ml/connectors/stats
  2. /_plugins/_ml/connectors/{connectorId}/stats

Whenever a connector is invoked, record its invocation count and failure count.
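
A minimal in-memory sketch of the recording step described above, assuming per-connector counters held on a single node. The names `ConnectorInvocationStats`, `recordInvocation`, `invocationCount`, `failureCount`, and `successRate` are hypothetical and are not existing ml-commons classes or methods. Persistence and cross-node aggregation are left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper for illustration only; not part of ml-commons.
final class ConnectorInvocationStats {

    // One pair of counters per connector ID.
    private static final class Counters {
        final LongAdder total = new LongAdder();
        final LongAdder failed = new LongAdder();
    }

    private final Map<String, Counters> byConnector = new ConcurrentHashMap<>();

    // Intended to be called once per remote request (once per chunk),
    // not once per _predict action.
    void recordInvocation(String connectorId, boolean success) {
        Counters c = byConnector.computeIfAbsent(connectorId, id -> new Counters());
        c.total.increment();
        if (!success) {
            c.failed.increment();
        }
    }

    long invocationCount(String connectorId) {
        Counters c = byConnector.get(connectorId);
        return c == null ? 0L : c.total.sum();
    }

    long failureCount(String connectorId) {
        Counters c = byConnector.get(connectorId);
        return c == null ? 0L : c.failed.sum();
    }

    // Fraction of successful invocations; NaN if the connector was never invoked.
    double successRate(String connectorId) {
        long total = invocationCount(connectorId);
        if (total == 0L) {
            return Double.NaN;
        }
        return (double) (total - failureCount(connectorId)) / total;
    }
}
```

Under this sketch, the two proposed endpoints would read from the same map: the first would return every entry, the second a single connector's counters.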

What alternatives have you considered? N/A

Do you have any additional context? N/A

dhrubo-os commented 6 months ago

Could you please check whether the profile API satisfies your requirement? It provides a model-level prediction count.

chishui commented 6 months ago

The count field in the profile API is the same as ml_action_request_count in _ml/stats; both measure at the model level, not the connector level.

dblock commented 5 months ago

Catch All Triage - 1 2 3 4 5 6

chishui commented 4 months ago

@ylwu-amzn @dhrubo-os I can pick this up if nobody else is going to.

dhrubo-os commented 4 months ago

> Whenever a connector is invoked, record its invocation count and failure count.

How are you planning to persist this information? In memory or in index?

Is there a customer ask for this feature? I'm just wondering whether this could be redundant information.

We have a model prediction count in the stats API. Does that satisfy this need?

chishui commented 4 months ago

After we measure the model prediction count here, documents can be chunked, and multiple requests can be sent to remote ML servers for a single prediction action. With the _ml/stats and profile APIs, we can only get the number of predictions; the actual number of connector invocations is missing.
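
To illustrate the gap with a small sketch: assuming, purely for illustration, that chunking sends a fixed number of documents per remote request, one prediction action over `docCount` documents results in a ceiling-division number of connector invocations, while the prediction count goes up by only one. The class `ChunkingExample` and the fixed-size chunking rule are assumptions, not how ml-commons necessarily chunks input.

```java
// Hypothetical illustration; the fixed docs-per-request chunking rule is an assumption.
final class ChunkingExample {

    // Number of remote requests needed for one prediction action.
    static int connectorInvocations(int docCount, int docsPerRequest) {
        if (docCount < 0 || docsPerRequest <= 0) {
            throw new IllegalArgumentException("docCount must be >= 0 and docsPerRequest > 0");
        }
        // Ceiling division: a partially filled last chunk still needs its own request.
        return (docCount + docsPerRequest - 1) / docsPerRequest;
    }
}
```

For example, 10 documents at 4 per request would count as one prediction but three connector invocations, which is the number the existing stats do not expose.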

> Is there a customer ask for this feature?

I haven't seen a customer ask for this feature. Personally, I feel this is important information that isn't provided yet. It would give users a sense of how many times connectors are invoked and how often they fail.

> How are you planning to persist this information? In memory or in index?

This needs a design. It may be more appropriate to make it part of the current _ml/stats API or the profile API.

pyek-bot commented 1 month ago

Can you assign this to me please?