Rewrite client to support high volume of topics

oleksiyk / kafka

Apache Kafka 0.9 client for Node

MIT License

297 stars 85 forks source link

Rewrite client to support high volume of topics #237

Closed mnorkin closed 5 years ago

mnorkin commented 5 years ago

Implement high performant metadata read from kafka cluster.

Previous implementation looped over all the topics, which it received from kafka cluster. Current implementation fetches only metadata by topic, so no loop over all topics is required.

Rewrote couple of tests to reduce dependencies between tests
Added more test to prove single instance multi topic publish/ consume/ group consume
Wrote a kafka testkit to dynamically create topics. Now there is no need to download, run kafka and create topics for tests to work - just npm run test
No API changes to the client

mnorkin commented 5 years ago

New PR, because old one was merged by mistake https://github.com/oleksiyk/kafka/pull/234

mnorkin commented 5 years ago

https://github.com/oleksiyk/kafka/pull/234#issuecomment-441986061

And can you please revert the ability to run tests locally, without Doker.

@oleksiyk this means what tests will be not isolated from each other and for failed tests will have influence for not failing tests.

oleksiyk commented 5 years ago

and for failed tests will have influence for not failing tests.

Sorry, I'm not sure I understand, what do you mean?

mnorkin commented 5 years ago

Sorry, I'm not sure I understand, what do you mean?

I can give you an example. Please correct me if I'm wrong.

Test A and B is in the same group. Both of each publishes one message.

Test run 1: Test A publishes one message to consumer and asserts that consumer handler was called only once. Test passed. Test B publishes one message to the consumer and fails.

Test run 2: Test A publishes one message to consumer and asserts that consumer handler was called only once. Test failed, because consumer in test A also received a message from failed test B.

In dockerized kafka testkit, testkit spins a new instance of kafka every time, so every test run starts from clean offset. Also, docker gives you nice ability to dynamically create topics. Now, every test group creates the topics it needs. No need to create any topics by hand. Which is nice.

Can you please explain why you don't want dockerized kafka? Tests do run locally, on development machine.

oleksiyk commented 5 years ago

Test run 1: Test A publishes one message to consumer and asserts that consumer handler was called only once. Test passed. Test B publishes one message to the consumer and fails. Test run 2: Test A publishes one message to consumer and asserts that consumer handler was called only once. Test failed, because consumer in test A also received a message from failed test B.

Well I don't see any problem here. Consumers in all such tests are initialised to the end offset of the topic on each startup.

Can you please explain why you don't want dockerized kafka? Tests do run locally, on development machine.

Simple. I don't have enough power to spawn docker/kafka instances for each test. My mbp is already quite busy.

mnorkin commented 5 years ago

Simple. I don't have enough power to spawn docker/kafka instances for each test. My mbp is already quite busy.

It's for every test run, not for every test.

oleksiyk commented 5 years ago

Anyway. Thats my average runtime with current project, docker doesn't fit here:

Also, this change is not within PR topic.

mnorkin commented 5 years ago

Ok, I'm closing this, bye