splunk / kafka-connect-splunk

Kafka connector for Splunk
Apache License 2.0
94 stars 102 forks source link

REST API validate endpoint fails with unexpected error 500 #421

Closed vdesabou closed 8 months ago

vdesabou commented 11 months ago

I noticed a difference of behaviour when REST API validate endpoint is used between 2.0.9 and 2.0.10+ versions

When missing mandatory is not set, validate in 2.0.9 returns expected results, i.e 200 OK with error_count set, etc.. With 2.0.10+ it now returns 500 error The change of behaviour between 2.0.9 and 2.1.0 is caused by this change made in https://github.com/splunk/kafka-connect-splunk/pull/365

My understanding is that validate is not supposed to return error 500 but rather 200 with error_count set.

rishabhbits038 commented 8 months ago

@VihasMakwana Could you please prioritize this? This is a breaking change

elekanne commented 8 months ago

I do have a customer where they used this connector already, but due to this error they removed it from the available connectors and now application teams are waiting before they can start sending to Splunk. For this customer it is urgent as we have some Splunk use casus with this connector that has to be finished before 1st April.

VihasMakwana commented 8 months ago

@elekanne @rishabhbits038 you can set splunk.validation.disable to true and the connector will behave like previous versions.

VihasMakwana commented 8 months ago

Refer https://github.com/splunk/kafka-connect-splunk/tree/develop?tab=readme-ov-file#general-optional-parameters

vdesabou commented 8 months ago

@VihasMakwana I've tested your suggestion but unfortunately it does not work for me

[2024-03-06 17:05:42,540] ERROR Uncaught exception in REST call to /connector-plugins/com.splunk.kafka.connect.SplunkSinkConnector/config/validate (org.apache.kafka.connect.runtime.rest.errors.ConnectExceptionMapper:64)
java.util.concurrent.ExecutionException: org.apache.kafka.common.config.ConfigException: Missing required configuration "splunk.hec.uri" which has no default value.
    at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:123)
    at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:115)
    at org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource.validateConfigs(ConnectorPluginsResource.java:109)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:134)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:177)
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:81)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:475)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:397)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
    at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:358)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:311)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:554)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:234)
    at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:181)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.Server.handle(Server.java:516)
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
    at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:137)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.kafka.common.config.ConfigException: Missing required configuration "splunk.hec.uri" which has no default value.
    at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:518)
    at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:508)
    at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:112)
    at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:132)
    at com.splunk.kafka.connect.SplunkSinkConnectorConfig.<init>(SplunkSinkConnectorConfig.java:274)
    at com.splunk.kafka.connect.SplunkSinkConnector.validateSplunkConfigurations(SplunkSinkConnector.java:150)
    at com.splunk.kafka.connect.SplunkSinkConnector.validate(SplunkSinkConnector.java:112)
    at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:592)
    at org.apache.kafka.connect.runtime.AbstractHerder.lambda$validateConnectorConfig$6(AbstractHerder.java:470)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    ... 1 more

As you can see from stack trace, it does not even reach the if (connectorConfig.disableValidation) check:

    at com.splunk.kafka.connect.SplunkSinkConnectorConfig.<init>(SplunkSinkConnectorConfig.java:274)
    at com.splunk.kafka.connect.SplunkSinkConnector.validateSplunkConfigurations(SplunkSinkConnector.java:150)

I guess the check should be done earlier? Before calling validateSplunkConfigurations

What do you think ?

VihasMakwana commented 8 months ago

@vdesabou @ludovic-boutros @rishabhbits038 The proposed fix bypasses mandatory field validation, which will likely cause issues later during runtime. Let's take an example here.

Please take a look at the required parameters here. If we don't set URI or token or index and bypass the check, this would cause an issue in runtime and events will not be sent to Splunk.

Let me know what you think.

vdesabou commented 8 months ago

The problem is that validate endpoint is not supposed to return anything apart 200 with error_count being set, this was the case with earlier versions (here 2.0.9) and users want to get same behaviour as before:

CleanShot 2024-03-07 at 15 06 07

rishabhbits038 commented 8 months ago

@VihasMakwana There are 2 issues here:

  1. Like @vdesabou said, in case a mandatory field, the return code should've been 200 instead of 500 with the error count being set. I think this is what the root cause is. Connector's validate shouldn't return a RuntimeException If 1 is fixed, we shouldn't need to use splunk.validation.disable.
  2. With splunk.validation.disable, validations are still kicking in, for which change by @ludovic-boutros should help. Its a different issue that the error would be thrown later during runtime
rishabhbits038 commented 8 months ago

The issue is that a unhandledConfig exception is being thrown. The error message should have been set in the ConfigDef corresponding to hec.uri like this

It wasn't happening with an older version because we were returning return new Config(validations); inside the public Config validate(final Map<String, String> connectorConfigs) { method.

But with the new change, we are ignoring the result of validations = config.configValues(); and proceeding ahead with connectivity check even when the field is not present.

I would suggest we only proceed with validity checks if there are no validation errors from Config config = super.validate(connectorConfigs);. When there are no configuration errors, we should proceed with validateSplunkConfigurations and handle the connectivity exception by updating the configDef fields URI_CONF and TOKEN_CONF with the appropriate error message.

elekanne commented 8 months ago

I know there was weekend in between, but this is a blocking issue for the customer that Vincent @vdesabou is also (indirectly) working form. So any update on this would be appreciated.

VihasMakwana commented 8 months ago

@elekanne @vdesabou I've merged the PR. I will make a patch release soon.

ludovic-boutros commented 8 months ago

@VihasMakwana, thank you for the merged PR. BTW, this PR just addressed the second issue explained by @rishabhbits038: With splunk.validation.disable, validations are still kicking in, for which change by @ludovic-boutros should help. Its a different issue that the error would be thrown later during runtime. The first one still needs additional work, as explained here.

vdesabou commented 8 months ago

I've tested merged PR with "splunk.validation.disable": "true" and get back to previous behaviour:

curl  -X PUT -H "Content-Type: application/json" --data @/var/folders/kk/sv72bs_s7xn3kz_ggxfmm4yh0000gp/T/pg-XXXXXXXXXX.5ROI4KivWE/connector_new.json http://localhost:8083/connector-plugins/com.splunk.kafka.connect.SplunkSinkConnector/config/validate
{
  "name": "com.splunk.kafka.connect.SplunkSinkConnector",
  "error_count": 1,
  "groups": [
    "Common",
    "Transforms",
    "Predicates",
    "Error Handling"
  ],
  "configs": [
    {

...

    {
      "definition": {
        "name": "splunk.hec.uri",
        "type": "STRING",
        "required": true,
        "default_value": null,
        "importance": "HIGH",
        "documentation": "Splunk HEC URIs. Either a list of FQDNs or IPs of all Splunk indexers, separated with a \",\", or a load balancer. The connector will load balance to indexers using round robin. Splunk Connector will round robin to this list of indexers. https://hec1.splunk.com:8088,https://hec2.splunk.com:8088,https://hec3.splunk.com:8088",
        "group": null,
        "width": "NONE",
        "display_name": "splunk.hec.uri",
        "dependents": [],
        "order": -1
      },
      "value": {
        "name": "splunk.hec.uri",
        "value": null,
        "recommended_values": [],
        "errors": [
          "Missing required configuration \"splunk.hec.uri\" which has no default value."
        ],
        "visible": true
      }
    },
}
elekanne commented 8 months ago

@vdesabou Do we now have a temporary setup that works when validation.disable is set or do we need to wait for the the real issue to be solved. If so @VihasMakwana what is needed and when do you think we can get that?

VihasMakwana commented 8 months ago

@elekanne I think @vdesabou can work with a workaround for now. I can't commit to a specific date but I will raise a PR shortly.

vdesabou commented 8 months ago

I've compiled and tested with "splunk.validation.disable": "true" but the end user would still need to have an official release.

@VihasMakwana are you planning to do a release that includes the current merged PR ?

VihasMakwana commented 8 months ago

@rishabhbits038 @vdesabou I've raised a PR https://github.com/splunk/kafka-connect-splunk/pull/426 to resolve the remaining issue. Please take a look

VihasMakwana commented 8 months ago

@vdesabou

@VihasMakwana are you planning to do a release that includes the current merged PR ?

Yes.

vdesabou commented 8 months ago

@vdesabou

@VihasMakwana are you planning to do a release that includes the current merged PR ?

Yes.

Any idea when the release will be available ?

VihasMakwana commented 8 months ago

@vdesabou maybe by today itself. I merged the other PR as well.

VihasMakwana commented 8 months ago

@vdesabou FYI new release with above PRs https://github.com/splunk/kafka-connect-splunk/releases/tag/v2.2.1

vdesabou commented 8 months ago

Thanks a lot. We are planning to upload it to confluent hub as well

andyfinlay commented 4 months ago

Hello @VihasMakwana I am following up to ask if there will be a permanent solution for this issue without the use of "splunk.validation.disable": "true"? I understand that the property "splunk.validation.disable": "true" will prevent the validation error being thrown and a 500 being wrongfully returned, but wondered if a future version (and sorry if I have missed one) will be released where that will be addressed. Is it fixed in 2.2.2?