Open ryan-zheng-teki opened 4 years ago
Once messages are sent to Kafka, the only ways to remove them are retention and compaction. Retention configures a time-to-live, after which Kafka deletes old messages. Compaction keeps only the latest message for each key, so several messages with the same id are reduced to one. In our case, we want to index updated Questions into the Jina Service in real time.

(1) Can we reindex when indexing fails? Yes. The Kafka consumer keeps track of its offset, so if it commits the offset only after indexing succeeds, it can pick up the failed message and try again.
(2) How do we remove messages after indexing? A consumer cannot delete an individual message; consumers only read from the log.
(3) We can use compaction instead, with the question id as the message key, so that messages for the same question are compacted down to the latest one.
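To make point (1) concrete, here is a minimal sketch of the "commit only after indexing succeeds" pattern. It uses only the standard library; `Record`, `consume_with_retry`, `index_fn` and `commit_fn` are hypothetical names for illustration, not part of any Kafka client. With a real client, `commit_fn` would presumably wrap the client's own offset commit call with auto-commit disabled.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class Record(NamedTuple):
    """Hypothetical stand-in for a consumed Kafka message."""
    offset: int
    key: str
    value: str


def consume_with_retry(
    records: Iterable[Record],
    index_fn: Callable[[Record], None],
    commit_fn: Callable[[int], None],
    max_attempts: int = 3,
) -> Optional[int]:
    """Index each record and commit its offset only after indexing succeeds.

    Returns the offset of the first record that could not be indexed after
    max_attempts tries, or None if every record was indexed. Because the
    failed offset is never committed, a restarted consumer would resume
    from that record.
    """
    for record in records:
        for attempt in range(1, max_attempts + 1):
            try:
                index_fn(record)  # e.g. send the question to the Jina Service
                break
            except Exception:
                if attempt == max_attempts:
                    return record.offset  # stop without committing
        # Kafka convention: the committed offset is the NEXT offset to read.
        commit_fn(record.offset + 1)
    return None
```

If indexing keeps failing, the function stops at that record instead of skipping it, so no question is silently left unindexed. The trade-off is that one bad message blocks the ones behind it.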
User Story

When a user asks a question, we persist it to MongoDB. JinaAI also needs to index the question and its answers so that the question is searchable. To do this, we will connect MongoDB to Kafka: questions and answers will be streamed to Kafka, and the Jina Service will pull them from Kafka in real time and index them.
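As a rough illustration of what could be streamed, the sketch below builds a message keyed by question id and simulates what compaction would leave behind for the Jina Service to read. The field names (`id`, `title`, `answers`) and the functions `question_event` and `compact` are assumptions made for this example, not the actual schema. The simulation is also simplified: real Kafka compacts in the background and keeps delete markers (null values) for a while before dropping them.

```python
import json
from typing import Dict, Iterable, List, Optional, Tuple

Message = Tuple[bytes, Optional[bytes]]  # (key, value); value None = delete marker


def question_event(question_id: str, title: str, answers: List[str]) -> Message:
    """Build a message keyed by question id (hypothetical schema)."""
    key = question_id.encode("utf-8")
    value = json.dumps(
        {"id": question_id, "title": title, "answers": answers}
    ).encode("utf-8")
    return key, value


def compact(log: Iterable[Message]) -> Dict[bytes, bytes]:
    """Simulate the end state of log compaction: latest value per key.

    A message with a None value acts as a delete marker and removes the key.
    """
    latest: Dict[bytes, bytes] = {}
    for key, value in log:
        if value is None:
            latest.pop(key, None)
        else:
            latest[key] = value
    return latest
```

Keying by question id is what makes compaction useful here: every update to the same question replaces the previous one, so the topic converges to one message per question.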
How To Do

Create the QiuSuo-EventCenter project. Add Kafka to our docker compose file.