manticoresoftware / manticoresearch-buddy

Manticore Buddy is a Manticore Search's sidecar which helps it with various tasks
GNU General Public License v3.0
20 stars 2 forks source link

Buddy loose records in log (debugv mode) #342

Open djklim87 opened 2 months ago

djklim87 commented 2 months ago

Bug Description:

#!/usr/bin/env bash

set -x

docker stop kafka && docker rm kafka
docker stop manticore && docker rm manticore

wget -O /tmp/import.sh https://raw.githubusercontent.com/manticoresoftware/manticoresearch-buddy/main/test/Kafka/import.sh
wget -O /tmp/dump.json https://raw.githubusercontent.com/manticoresoftware/manticoresearch-buddy/main/test/Kafka/dump.json

docker network create app-network --driver bridge
docker run -it -e EXTRA=1 --network=app-network \
--platform linux/amd64 \
--name manticore -d ghcr.io/manticoresoftware/manticoresearch:test-kit-latest bash
docker exec manticore sed -i '/data_dir = \/var\/lib\/manticore/a\    buddy_path = manticore-executor -n /usr/share/manticore/modules/manticore-buddy/src/main.php --debugv\n' /etc/manticoresearch/manticore.conf
docker run -it -d --network=app-network \
--name kafka \
-v /tmp/import.sh:/import.sh \
-v /tmp/dump.json:/tmp/dump.json \
-e KAFKA_CFG_NODE_ID=0 \
-e KAFKA_CFG_PROCESS_ROLES=controller,broker \
-e KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093 \
-e KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093 \
-e KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://:9092 \
-e KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT \
-e KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER \
-e KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT bitnami/kafka:3.7.0
docker exec kafka /opt/bitnami/kafka/bin/kafka-topics.sh --create --topic my-data --partitions 4 --bootstrap-server localhost:9092
docker exec manticore searchd
#Wait until buddy started
sleep 10
docker exec manticore mysql -h0 -P9306 -e "CREATE SOURCE kafka_stop (id bigint, term text, abbrev text, GlossDef json, location json, metadata json, tags json, timestamp_unix timestamp) type='kafka' broker_list='kafka:9092' topic_list='my-data' consumer_group='manticore_combined' num_consumers='1' batch=50;CREATE TABLE destination_kafka_stop (id bigint, name text, short_name text, received_at text, size multi, lat float, lon float, views int, info text, tags json, timestamp_combined timestamp);CREATE MATERIALIZED VIEW view_table_stop TO destination_kafka_stop AS SELECT id, term AS name, abbrev AS short_name, UTC_TIMESTAMP() AS received_at, GlossDef.size AS size, location.lat AS lat, location.lon AS lon, metadata.views AS views, metadata.info AS info, tags, timestamp_unix AS timestamp_combined FROM kafka_stop;"
docker exec kafka chmod +x ./import.sh
docker exec kafka ./import.sh
timeout 60 bash -c 'docker exec manticore bash -c "tail -f /var/log/manticore/searchd.log" | grep -m1 "REPLACE+INTO+destination_kafka_stop" > /dev/null && echo "Download completed."' || echo "Download failed."
docker exec manticore mysql -h0 -P9306 -e "SELECT COUNT(*) FROM destination_kafka_stop;"

# ---------------------- Should be one row ---------------- #
docker exec manticore cat /var/log/manticore/searchd.log | grep "REPLACE+INTO+destination_kafka_stop"

sleep 1
docker exec manticore searchd --stopwait
docker exec manticore searchd
timeout 60 bash -c 'docker exec manticore bash -c "tail -f /var/log/manticore/searchd.log" | grep -m2 "REPLACE+INTO+destination_kafka_stop" > /dev/null && echo "Download completed."' || echo "Download failed."
docker exec manticore mysql -h0 -P9306 -e "SELECT COUNT(*) FROM destination_kafka_stop;"

# ---------------------- Should be two rows ---------------- #
docker exec manticore cat /var/log/manticore/searchd.log | grep "REPLACE+INTO+destination_kafka_stop"

Sometimes after the last command we got 1 record, sometimes 2

Should be like this

[Tue Aug 20 11:48:47.395 2024] [34] [BUDDY] [3997µs] manticore request: query=REPLACE+INTO+destination_kafka_stop+%28id%2C+name%2C+short_name%2C+received_at%2C+size%2C+lat%2C+lon%2C+views%2C+info%2C+tags%2C+timestamp_combined%29+VALUES+%281%2C%27The+Quick+Brown+Fox%27%2C%27ISO+12345%3A6789%27%2C%272024-08-20+11%3A48%3A47%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%282%2C%27Jumped+Over+The+Lazy+Dog%27%2C%27ISO+98765%3A4321%27%2C%272024-08-20+11%3A48%3A47%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%283%2C%27A+Stitch+In+Time+Saves+Nine%27%2C%27ISO+24680%3A1357%27%2C%272024-08-20+11%3A48%3A47%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%284%2C%27An+Apple+A+Day+Keeps+The+Doctor+Away%27%2C%27ISO+97531%3A86420%27%2C%272024-08-20+11%3A48%3A47%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%285%2C%27Actions+Speak+Louder+Than+Words%27%2C%27ISO+56789%3A12345%27%2C%272024-08-20+11%3A48%3A47%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%2
[Tue Aug 20 11:49:03.356 2024] [97] [BUDDY] [3735µs] manticore request: query=REPLACE+INTO+destination_kafka_stop+%28id%2C+name%2C+short_name%2C+received_at%2C+size%2C+lat%2C+lon%2C+views%2C+info%2C+tags%2C+timestamp_combined%29+VALUES+%2851%2C%27Cutting+Corners%27%2C%27ISO+97531%3A24680%27%2C%272024-08-20+11%3A49%3A03%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%2852%2C%27Barking+Up+The+Wrong+Tree%27%2C%27ISO+56789%3A86420%27%2C%272024-08-20+11%3A49%3A03%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%2853%2C%27Rome+Wasn%5C%27t+Built+In+A+Day%27%2C%27ISO+86420%3A97531%27%2C%272024-08-20+11%3A49%3A03%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%2854%2C%27Don%5C%27t+Put+All+Your+Eggs+In+One+Basket%27%2C%27ISO+97531%3A86420%27%2C%272024-08-20+11%3A49%3A03%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%2855%2C%27A+Bird+In+The+Hand+Is+Worth+Two+In+The+Bush%27%2C%27ISO+24680%3A1357%27%2C%272024-08-20+11%3A49%3A03%27%2C%2820%2C2

Sometimes only 1 query

[Tue Aug 20 11:44:19.701 2024] [28] [BUDDY] [3894µs] manticore request: query=REPLACE+INTO+destination_kafka_stop+%28id%2C+name%2C+short_name%2C+received_at%2C+size%2C+lat%2C+lon%2C+views%2C+info%2C+tags%2C+timestamp_combined%29+VALUES+%281%2C%27The+Quick+Brown+Fox%27%2C%27ISO+12345%3A6789%27%2C%272024-08-20+11%3A44%3A19%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%282%2C%27Jumped+Over+The+Lazy+Dog%27%2C%27ISO+98765%3A4321%27%2C%272024-08-20+11%3A44%3A19%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%283%2C%27A+Stitch+In+Time+Saves+Nine%27%2C%27ISO+24680%3A1357%27%2C%272024-08-20+11%3A44%3A19%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%284%2C%27An+Apple+A+Day+Keeps+The+Doctor+Away%27%2C%27ISO+97531%3A86420%27%2C%272024-08-20+11%3A44%3A19%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%27%27%2C%27%27%2C0%29%2C%285%2C%27Actions+Speak+Louder+Than+Words%27%2C%27ISO+56789%3A12345%27%2C%272024-08-20+11%3A44%3A19%27%2C%2820%2C25%2C30%29%2C0%2C0%2C0%2C%2

But when we check the records count in the table we see that the request was performed

+----------+
| count(*) |
+----------+
|       57 |
+----------+

Seems like Buddy or searchd doesn't write all records to log.

Manticore Search Version:

Manticore 6.3.7 9c915a9a6@24081909 dev (columnar 2.3.1 eb62283@24072718) (secondary 2.3.1 eb62283@24072718) (knn 2.3.1 eb62283@24072718)

Operating System Version:

Docker

Have you tried the latest development version?

Yes

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

- [ ] Implementation completed - [ ] Tests developed - [ ] Documentation updated - [ ] Documentation reviewed - [ ] Changelog updated
sanikolaev commented 2 months ago

As discussed, let's try to localize the issue by file_put_contents()ing to a temp file before echoing on the buddy's side.