strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

Providing a schema registry implementation #29

Closed by ppatierno 5 years ago

msample commented 6 years ago

This would be appreciated. I tried setting up a Kubernetes Service and Deployment of the Schema Registry using the confluentinc/cp-schema-registry:4.1.1-2 image with bootstrap servers (rather than ZooKeeper).

I got past complaints about PLAINTEXT not matching broker setup by adding:

I'm currently having a problem with repeated versions of this:

[kafka-admin-client-thread | adminclient-1] WARN org.apache.kafka.clients.NetworkClient - [AdminClient clientId=adminclient-1] Connection to node -1 terminated during authentication. This may indicate that authentication failed due to invalid credentials.
[kafka-admin-client-thread | adminclient-1] WARN org.apache.kafka.common.network.SslTransportLayer - Failed to send SSL Close message

scholzj commented 6 years ago

As of now we don't use SSL, only PLAINTEXT. So I think this is where the problem is coming from. There is a PR for SSL support (#487) that should land in master soon; that might help once it's merged.

msample commented 6 years ago

Thanks. Seems to work now with the following env vars on the Deployment spec:

      containers:
      - name: my-cluster-schema-registry
        image: confluentinc/cp-schema-registry:4.1.1-2
        ports:
        - containerPort: 8081
        env:
        - name: SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS
          value: PLAINTEXT://my-cluster-kafka:9092
        - name: SCHEMA_REGISTRY_HOST_NAME
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: SCHEMA_REGISTRY_LISTENERS
          value: http://0.0.0.0:8081
        - name: SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL
          value: PLAINTEXT

jslusher commented 5 years ago

This doesn't seem to be working for me. I am also trying to get a schema-registry instance to run in my Kafka cluster. I have a Connect image built from the Strimzi source that has the Debezium MySQL plugin and the Confluent Avro converter installed. I have a Deployment set up for the schema-registry. If I don't specify the SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL env variable, it complains about SCHEMA_REGISTRY_PORT not being set, and if I set the PORT it tells me to use the URL env variable. None of that is the concern of the good people developing the strimzi-kafka-operator.

I saw in this issue that a network policy is necessary to connect an outside deployment. I added this network policy and after that I was at least getting through to the ZooKeeper service, but now I see that the tls-sidecar is complaining about an SSL version?

2019.03.15 22:13:51 LOG3[1:140278096041728]: SSL_accept: 1408F10B: error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number
2019.03.15 22:13:51 LOG5[1:140278096041728]: Connection reset: 0 byte(s) sent to SSL, 0 byte(s) sent to socket

I'm not sure if that message is a red herring or if there's a setting I'm missing.

jslusher commented 5 years ago

I ended up solving the above by changing the name of my Deployment and Service in Kubernetes. This is related to the following issue: https://github.com/confluentinc/schema-registry/issues/689
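
A likely explanation, assuming the linked issue describes the usual cause: for each Service with a cluster IP in the namespace, Kubernetes injects environment variables of the form <SERVICE_NAME>_PORT into pods, so a Service named schema-registry produces SCHEMA_REGISTRY_PORT=tcp://..., which the Confluent image can mistake for one of its own settings. Renaming the Service avoids the clash. An alternative that was not tried in this thread is to switch off service links in the pod spec. The fragment below is only a sketch; the container name is illustrative, and the enableServiceLinks field requires a reasonably recent Kubernetes version.

```yaml
# Fragment of a Deployment; names are illustrative, not taken from this thread.
spec:
  template:
    spec:
      # Stops Kubernetes from injecting <SERVICE_NAME>_PORT-style variables,
      # e.g. SCHEMA_REGISTRY_PORT for a Service named "schema-registry".
      enableServiceLinks: false
      containers:
        - name: registry
          image: confluentinc/cp-schema-registry:4.1.1-2
```

With service links disabled, other Services are still reachable through cluster DNS; only the injected environment variables go away.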

ppatierno commented 5 years ago

@jslusher glad to see that now it's working for you. I am going to close this issue, feel free to reopen if you have additional questions/problems.

alexwennerberg commented 5 years ago

I'm encountering the same issue as jslusher, and changing the names did not help. Here is my deployment:

---
apiVersion: v1
kind: Service
metadata:
  name: registry-client
  namespace: kafka
spec:
  ports:
  - port: 8081
  clusterIP: None
  selector:
    app: my-registry
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: registry
  namespace: kafka
spec:
  selector:
    matchLabels:
      app: "my-registry"
  replicas: 1
  template:
    metadata:
      labels:
        app: my-registry
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: cp-registry-container
          image: confluentinc/cp-schema-registry 
          env:  # see https://docs.confluent.io/current/schema-registry/installation/config.html
            - name: SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL
              value: my-cluster-zookeeper-client:2181
            - name: SCHEMA_REGISTRY_HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: SCHEMA_REGISTRY_AVRO_COMPATIBILITY_LEVEL
              value: BACKWARD 
          ports:
            - containerPort: 8081 

The schema_registry pod is receiving the following error:

java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)

And the zookeeper pod is receiving this error:

SSL_accept: 1408F10B: error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number

I was looking through other issues and noticed SSL was recently added to Strimzi, so I wasn't sure whether I need to set the SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL variable to PLAINTEXT, as mentioned above. I tried this anyway and it didn't seem to resolve the issue.

Thank you!

scholzj commented 5 years ago

I have no experience with Confluent Schema Registry. But you point the variable SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL to Zookeeper. Is that correct? Or should it point to Kafka? I think that in the initial question in this issue it was pointing to Kafka.

In general, our Zookeeper is locked down for security reasons using TLS. If you need to access it directly, you should be able to use this (at your own risk).

alexwennerberg commented 5 years ago

Thanks. I've been working through this issue and still haven't found a solution. I noticed that there is a to-do item for this feature on the project board. Do you plan on implementing it in the foreseeable future?

Thanks!

scholzj commented 5 years ago

We do plan it, but I'm afraid I do not have any exact timeline.

alexwennerberg commented 5 years ago

Changing the configuration to these settings fixed my issues: https://github.com/strimzi/strimzi-kafka-operator/issues/29#issuecomment-399148274
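
For anyone landing here, a sketch of what that change looks like when applied to the Deployment posted above, assuming the linked comment is the env list shown earlier in this thread: the ZooKeeper connection URL is replaced by the Kafka bootstrap settings. The values are copied from the two manifests in this thread and assume a cluster named my-cluster with a plain listener on port 9092; newer Strimzi versions expose the bootstrap Service as my-cluster-kafka-bootstrap instead, so adjust the host accordingly.

```yaml
# Goes under the schema-registry container in the Deployment spec.
env:
  # Replaces SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL (ZooKeeper).
  - name: SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS
    value: PLAINTEXT://my-cluster-kafka:9092
  - name: SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL
    value: PLAINTEXT
  - name: SCHEMA_REGISTRY_LISTENERS
    value: http://0.0.0.0:8081
  - name: SCHEMA_REGISTRY_HOST_NAME
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: SCHEMA_REGISTRY_AVRO_COMPATIBILITY_LEVEL
    value: BACKWARD
```

With these settings the registry connects to the brokers as an ordinary Kafka client and never touches the TLS-protected ZooKeeper, which is where the "wrong version number" error came from.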

marcospassos commented 5 years ago

@scholzj any way to sponsor adding support for Kafka registry?

scholzj commented 5 years ago

I do not think there is any other way than by opening PRs. We are looking into the Schema Registry, but it is not trivial, also given that we cannot use the Confluent one due to the license.

That said, I think some people got the Confluent Schema Registry working like any other client connecting to Kafka. IIRC the setup is not completely straightforward, but it can be configured to not use Zookeeper and connect to Kafka only. (In which case, if you deploy it as your own application, the license is really only about your use cases, whereas we as a project want something that everyone can use.)

marcospassos commented 5 years ago

@scholzj thank you for replying.

> I do not think there is any other way than by opening PRs. We are looking into the Schema Registry, but it is not trivial, also given that we cannot use the Confluent one due to the license.

Unfortunately, my DevOps skills are limited, that's why I asked if there is any way of supporting it.

> we cannot use the Confluent one due to the license.

What's the issue with the Confluent's Community License?

Under the Confluent Community License, you can access the source code and modify or redistribute it; there is only one thing you cannot do, and that is use it to make a competing SaaS offering. Here is the exact language:

“Excluded Purpose” is making available any software-as a-service, platform-as-a-service, infrastructure-as-a-service or other similar online service that competes with Confluent products or services that provide the Software.

For example, it does not allow hosting of Confluent KSQL, Confluent Schema Registry, Confluent REST Proxy, or other software licensed under the Confluent Community License as online service offerings that compete with Confluent SaaS products or services that provide the same software. If you are not doing what is excluded, this license change will not affect you.

I'm not a lawyer, but since Strimzi is an open-source initiative, I can't see how it violates the license.

scholzj commented 5 years ago

My view, which is not necessarily the view of the other maintainers:

marcospassos commented 5 years ago

I completely understand your point. However, using a schema registry is almost mandatory when using Avro, so maintaining it under the Strimzi stack is a significant benefit from a dev point of view.

scholzj commented 5 years ago

I agree. But we need to find a solution which works for everyone.

alexwennerberg commented 5 years ago

Is there any open source alternative to Schema Registry with a more permissive license, or anything currently in development?

marcospassos commented 5 years ago

Confluent’s schema registry is the most popular implementation and in active development. The other option is https://github.com/schema-repo/schema-repo, but it looks abandoned.

marcospassos commented 5 years ago

@scholzj and @alexwennerberg just found this: https://github.com/hortonworks/registry

duanshiqiang commented 4 years ago

Red Hat developed a schema registry that is compatible with the Confluent Schema Registry. Maybe it can be integrated?

dklesev commented 4 years ago

Could someone provide an example of how to run cp-schema-registry-server with Strimzi? Especially when authentication is plain, and for the part that connects to ZooKeeper. @scholzj mentioned this here; is this the way to go for the ZooKeeper part?

scholzj commented 4 years ago

@duanshiqiang The Apicurio Schema Registry should work without any problems with Strimzi. I would love to have a blog post or a better demo about how to integrate it, but I haven't had time for it yet. If anyone with more time is interested in doing it, we would of course be more than happy to publish it.

However, I do not think we plan any deeper integration into the operator at this point. The Schema Registry is more than happy to run as a separate deployment; it is basically just another application connecting to Kafka as a client. So I do not think there is much value in just having the operator create the deployment. Do you have any ideas for how it could be integrated into the operator more deeply than just creating the deployment?
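
To illustrate the "separate deployment" point, here is a minimal sketch of a registry running as a plain Kafka client next to a Strimzi cluster. Everything specific in it is an assumption rather than something confirmed in this thread: the Deployment name and labels are made up, the image name and the KAFKA_BOOTSTRAP_SERVERS variable are taken from my recollection of the Apicurio Registry documentation for its Kafka-backed storage variant, and the bootstrap address assumes a cluster named my-cluster with a plain listener on port 9092. Check the Apicurio documentation for the version you use before relying on it.

```yaml
# Sketch only: names, image and env var are assumptions, see the lead-in.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: apicurio-registry        # hypothetical name
  namespace: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: apicurio-registry
  template:
    metadata:
      labels:
        app: apicurio-registry
    spec:
      containers:
        - name: registry
          # Assumed image (Kafka-backed storage variant); pin a tag in practice.
          image: apicurio/apicurio-registry-kafkasql
          ports:
            - containerPort: 8080
          env:
            # The registry talks to Kafka like any other client.
            - name: KAFKA_BOOTSTRAP_SERVERS
              value: my-cluster-kafka-bootstrap:9092
```

Nothing here involves the operator: the registry only needs a reachable bootstrap address, plus credentials if the listener requires authentication.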


@dklesev As for the Confluent Registry, TBH I'm not sure how much it needs the Zookeeper access - I think someone once told me that you can really configure it without the Zookeeper. But I personally haven't done it, so I'm not sure. Our Zookeeper is by default locked down using the TLS sidecars with TLS Client Authentication to make sure it is secure. The Gist you linked will open it to anyone without any authentication. That should make it possible to use it from any other applications, but it also means increased security risk and any badly written application using the Zookeeper can affect your Kafka cluster. So use at your own risk.

dklesev commented 4 years ago

@scholzj what do you think about this? Some parts are hardcoded, however. If I find time I will look into this and into integration with Strimzi. I think it's quite important for the schema registry (cp/rh) to be easy to set up with Strimzi, as it's a common component used with Kafka.

scholzj commented 4 years ago

TBH, I actually haven't seen this before. So I will need to have a bit closer look.

cdmikechen commented 3 years ago

I've opened another Java project to support a schema registry operator: https://github.com/shangyuantech/strimzi-registry-ksql-operator . For now it only provides some basic non-SSL functions. I would like to integrate the two services, and then adapt it to the current Kafka Operator.

iceman91176 commented 3 years ago

@scholzj Actually Apicurio works pretty well with Strimzi. As you said, it is just another Kafka client. Out of the box it does not support OAuth2, so I added that functionality. Basically it just means adding some dependencies.

I put together a docker-build script that does this. https://github.com/iceman91176/witcom-apicurio-registry This is pretty much undocumented right now ;-)

I also have a helm-chart available that deploys the registry and configures oauth2. This one is not online yet, if anyone is interested i will share it.

Currently apicurio does not support any authentication/authorization. They will have it in Version 2.0 - based on keycloak. Until then there is a "security-gateway" available that provides role-based access to the schema-registry.

https://github.com/witcom-gmbh/apicurio-security-gateway

This (plus securing the Registry-UI) is also integrated in the helm-chart i mentioned above. I should be able to provide that in the next days. I also have a avro-producer/consumer demo available that uses apicurio natively.

dicolasi commented 3 years ago

@iceman91176 please do share your helm chart

kmandal-volvo commented 3 years ago

Please share the helm-chart I will deploy in our Strimzi Kafka cluster.

eshepelyuk commented 3 years ago

If someone is still interested, I have a supported Helm chart for Apicurio and Apicurio content sync.

https://github.com/eshepelyuk/apicurio-registry-helm

Not tried with Strimzi, but feel free to open issues and pull requests.

mrodal commented 1 year ago

I think this issue should be reopened

scholzj commented 1 year ago

I don't think we are planning anything around service registry. Sorry.

hongbo-miao commented 1 year ago

I can understand why Strimzi does not have its own registry now: there are already popular solutions, especially Apicurio Registry and Confluent Schema Registry, and they work directly with a Kafka cluster created by Strimzi.

I listed all the info I found here: https://github.com/Hongbo-Miao/hongbomiao.com/issues/8348

I have successfully run both Apicurio Registry and Confluent Schema Registry against a Kafka cluster created by Strimzi. Hopefully that ticket helps more people. 😃