elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[APM] Cannot create ML jobs for services with spaces in their name #62370

Closed Jaraxal closed 4 years ago

Jaraxal commented 4 years ago

Kibana version: 7.6.1

Elasticsearch version: 7.6.1

Server OS version: Elastic Cloud (on GCP)

Browser version: Microsoft Edge Version 80.0.361.109 (Official build) (64-bit)

Browser OS version: Mac OS X 10.14.6

Original install method (e.g. download page, yum, from source, etc.): Installed via Elastic Cloud

Describe the bug: When I try to create an ML job through the APM UI "Integrations" dropdown, no ML job is created. The button labeled "Create new job" turns from blue to gray and the popout window appears to get stuck.

Steps to reproduce:

  1. Navigate to the APM UI.
  2. Select a service. I have tried multiple instrumented services; they all fail.
  3. Select the "Integrations" dropdown.
  4. Select "Enable ML anomaly detection" option.
  5. Select "Create new job".

Expected behavior: An ML job should be created and the popout window should close.

Screenshots (if relevant):

Screen Shot 2020-04-02 at 12 49 41 PM

Errors in browser console (if relevant):

Screen Shot 2020-04-02 at 1 57 07 PM

Provide logs and/or server output (if relevant): Logs not available from Elastic Cloud

Any additional context: Machine Learning is enabled with existing ML jobs working fine. A platinum license is present.

elasticmachine commented 4 years ago

Pinging @elastic/apm-ui (Team:apm)

dgieselaar commented 4 years ago

@Jaraxal I've created a 7.6.1 deployment on cloud and indexed some APM data, but I'm not able to reproduce this. The ML jobs are created successfully. Are you still running into this?

Jaraxal commented 4 years ago

@dgieselaar Yes, I'm still seeing this behavior. Doesn't matter which APM service I try to create an ML job for, it has the same behavior.

dgieselaar commented 4 years ago

There should be a request to a Kibana ML endpoint in your network tab. Can you check the response? (both status and content).

Jaraxal commented 4 years ago

@dgieselaar The request under the network tab is:

https://94256e2d96d74f64868c76c0183db747.us-central1.gcp.cloud.es.io:9243/api/ml/anomaly_detectors/nodejs%20rest%20api-request-high_mean_response_time

The response was {"statusCode":404,"error":"Not Found","message":"[resource_not_found_exception] No known job with id 'nodejs rest api-request-high_mean_response_time'"}

My service names have spaces in them. Is that a potential issue?

dgieselaar commented 4 years ago

Could be! That is the request to search for any existing jobs. There should be another request, a POST, that attempts to create one. Is it not there?

Jaraxal commented 4 years ago

The 404 above occurs when I first click Integrations -> Enable ML anomaly detection. When I then click the "Create new job" button, I see the following:

https://94256e2d96d74f64868c76c0183db747.us-central1.gcp.cloud.es.io:9243/api/apm/settings/apm-indices

{"apm_oss.sourcemapIndices":"apm-*","apm_oss.errorIndices":"apm-*","apm_oss.onboardingIndices":"apm-*","apm_oss.spanIndices":"apm-*","apm_oss.transactionIndices":"apm-*","apm_oss.metricsIndices":"apm-*","apmAgentConfigurationIndex":".apm-agent-configuration"}

Followed by:

https://94256e2d96d74f64868c76c0183db747.us-central1.gcp.cloud.es.io:9243/api/ml/modules/setup/apm_transaction

{"statusCode":404,"error":"Not Found","message":"Not Found"}

Jaraxal commented 4 years ago

Here is more detailed info from the second link above (apm_transaction):

{jobs: [{id: "python wmata app-bus-incidents-high_mean_response_time", success: false, error: {,…}}],…}
  datafeeds: [,…]
    0: {id: "datafeed-python wmata app-bus-incidents-high_mean_response_time", success: false, started: false,…}
      error: {,…}
        body: "{"job_id":"python wmata app-bus-incidents-high_mean_response_time","indices":["apm-*"],"query":{"bool":{"filter":[{"term":{"service.name":"Python WMATA App"}},{"term":{"processor.event":"transaction"}},{"term":{"transaction.type":"bus-incidents"}}]}}}"
        msg: "[status_exception] Invalid datafeed_id; 'datafeed-python wmata app-bus-incidents-high_mean_response_time' can contain lowercase alphanumeric (a-z and 0-9), hyphens or underscores; must start and end with alphanumeric"
        path: "/_ml/datafeeds/datafeed-python%20wmata%20app-bus-incidents-high_mean_response_time"
        query: {}
        response: "{"error":{"root_cause":[{"type":"status_exception","reason":"Invalid datafeed_id; 'datafeed-python wmata app-bus-incidents-high_mean_response_time' can contain lowercase alphanumeric (a-z and 0-9), hyphens or underscores; must start and end with alphanumeric"}],"type":"status_exception","reason":"Invalid datafeed_id; 'datafeed-python wmata app-bus-incidents-high_mean_response_time' can contain lowercase alphanumeric (a-z and 0-9), hyphens or underscores; must start and end with alphanumeric"},"status":400}"
        statusCode: 400
      id: "datafeed-python wmata app-bus-incidents-high_mean_response_time"
      started: false
      success: false
  jobs: [{id: "python wmata app-bus-incidents-high_mean_response_time", success: false, error: {,…}}]
    0: {id: "python wmata app-bus-incidents-high_mean_response_time", success: false, error: {,…}}
      error: {,…}
        body: "{"job_type":"anomaly_detector","groups":["apm","python wmata app","bus-incidents"],"description":"Detect anomalies in high mean of transaction duration","analysis_config":{"bucket_span":"15m","detectors":[{"detector_description":"high_mean(\"transaction.duration.us\")","function":"high_mean","field_name":"transaction.duration.us"}],"influencers":[]},"analysis_limits":{"model_memory_limit":"10mb"},"data_description":{"time_field":"@timestamp"},"model_plot_config":{"enabled":true},"custom_settings":{"created_by":"ml-module-apm-transaction"}}"
        msg: "[illegal_argument_exception] Invalid job_id; 'python wmata app-bus-incidents-high_mean_response_time' can contain lowercase alphanumeric (a-z and 0-9), hyphens or underscores; must start and end with alphanumeric"
        path: "/_ml/anomaly_detectors/python%20wmata%20app-bus-incidents-high_mean_response_time"
        query: {}
        response: "{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Invalid job_id; 'python wmata app-bus-incidents-high_mean_response_time' can contain lowercase alphanumeric (a-z and 0-9), hyphens or underscores; must start and end with alphanumeric"}],"type":"illegal_argument_exception","reason":"Invalid job_id; 'python wmata app-bus-incidents-high_mean_response_time' can contain lowercase alphanumeric (a-z and 0-9), hyphens or underscores; must start and end with alphanumeric"},"status":400}"
        statusCode: 400
      id: "python wmata app-bus-incidents-high_mean_response_time"
      success: false
  kibana: {}

Looks like the spaces are a problem in the name of my services.
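
The rule quoted in the error messages above can be checked locally before anything is sent to the ML API. Here is a minimal sketch in TypeScript. `ML_ID_PATTERN` and `isValidMlId` are hypothetical names, not part of Kibana or Elasticsearch, and the pattern only encodes what the error text states; Elasticsearch may enforce further limits that the message does not mention.

```typescript
// Hypothetical helper, not a Kibana or Elasticsearch API.
// Encodes only the rule from the error message: lowercase a-z, 0-9,
// hyphens or underscores; must start and end with an alphanumeric.
const ML_ID_PATTERN = /^[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?$/;

function isValidMlId(id: string): boolean {
  return ML_ID_PATTERN.test(id);
}

// The ID from this report contains spaces, so it is rejected.
console.log(isValidMlId("python wmata app-bus-incidents-high_mean_response_time"));
// The same ID with underscores instead of spaces satisfies the rule.
console.log(isValidMlId("python_wmata_app-bus-incidents-high_mean_response_time"));
```

A check like this would only surface the problem earlier; it does not by itself produce a usable ID for a service whose name contains spaces.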

dgieselaar commented 4 years ago

Right, so my best guess right now is that we are not properly encoding the service name before sending it to the ML API. I'll try to confirm. If that is the case, I will file a bug.
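
One observation from the output in this thread: the `path` values are already percent-encoded (`%20`), and the IDs are still rejected, so whatever the fix turns out to be, the ID itself would presumably need to avoid the invalid characters. As a minimal sketch of one possible approach, the service name could be normalized before it is used as an ID prefix. `toMlIdPrefix` is a hypothetical helper written for illustration; it is not taken from Kibana and is not necessarily how the actual fix works.

```typescript
// Hypothetical helper for illustration only, not Kibana code.
// Maps a service name to a string that satisfies the rule in the error
// message: lowercase a-z, 0-9, hyphens or underscores, starting and
// ending with an alphanumeric.
function toMlIdPrefix(serviceName: string): string {
  return serviceName
    .toLowerCase()
    // Replace each run of disallowed characters with a single underscore.
    .replace(/[^a-z0-9_-]+/g, "_")
    // Trim hyphens and underscores from both ends.
    .replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, "");
}

// URL encoding alone leaves the space in the decoded ID:
console.log(encodeURIComponent("python wmata app")); // python%20wmata%20app
// Normalizing the name removes it:
console.log(toMlIdPrefix("Python WMATA App")); // python_wmata_app
```

Two caveats with this sketch: distinct service names can map to the same prefix (for example "my app" and "my_app"), and a name made up entirely of disallowed characters yields an empty string, so a real implementation would need to handle both cases.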

dgieselaar commented 4 years ago

@Jaraxal This should be fixed when https://github.com/elastic/kibana/pull/63683 lands. Again, thanks for reporting this :)