Closed jakubgs closed 7 months ago
The gbd60bf
commit doesn't appear to exist in the repo: https://github.com/waku-org/nwaku/commit/gbd60bf
This fleet currently does not use the layout where a database is shared: https://github.com/status-im/infra-shards/blob/ae8ff33aa207241cdf57a3029e1ff022ae6548b8/ansible/group_vars/store.yml#L40
If we looks at ElasticSearch logs for shards.test
fleet we can see the issue started on 2023.12.11
:
> ./elasticsearch/esclean.py -i 'logstash-2023.12.*' -f shards.test -p 'docker/nim-waku-store-db' -m 'duplicate key value'
Cluster: status-logs-search
logstash-2023.12.01 count: 0
logstash-2023.12.02 count: 0
logstash-2023.12.03 count: 0
logstash-2023.12.04 count: 0
logstash-2023.12.05 count: 0
logstash-2023.12.06 count: 0
logstash-2023.12.07 count: 0
logstash-2023.12.08 count: 0
logstash-2023.12.09 count: 0
logstash-2023.12.10 count: 0
logstash-2023.12.11 count: 1427
logstash-2023.12.12 count: 5191
logstash-2023.12.13 count: 8664
logstash-2023.12.14 count: 9390
logstash-2023.12.15 count: 10977
logstash-2023.12.16 count: 8047
logstash-2023.12.17 count: 3624
logstash-2023.12.18 count: 27613
logstash-2023.12.19 count: 33225
logstash-2023.12.20 count: 34919
logstash-2023.12.21 count: 82543
logstash-2023.12.22 count: 73998
logstash-2023.12.23 count: 6123
logstash-2023.12.24 count: 8733
logstash-2023.12.25 count: 8880
logstash-2023.12.26 count: 41259
logstash-2023.12.27 count: 1229433
logstash-2023.12.28 count: 1329964
logstash-2023.12.29 count: 69862
logstash-2023.12.30 count: 9576
logstash-2023.12.31 count: 12972
And reached a peak on 2023.12.28
.
Oh, sorry, my mistake, the DB is actually used by two nodes per DC:
store-01.do-ams3.shards.test
store-02.do-ams3.shards.test
But that doesn't explain why the warnings started showing up on 2023.12.11
.
We could avoid these by first checking if a given message is already in the table: https://stackoverflow.com/a/12388511
But in theory that could have a performance impact, but not sure, since the DB still has to check the uniqueness of the record anyway, even if you just insert.
For now I have decided to filter out these logs on Logstash level to avoid unnecessary ElasticSearch spam:
infra-hq#153d0726
- logstash; filter out waku duplicate key value logsAnd opened an issue to possibly query before inserting:
Also took a look at waku node logs and found out most come from two files: pubsub.nim
and pubsubpeer.nim
Unrelated but interesting.
There is one thing we could use, not sure what the performance impact would be:
INSERT ... ON CONFLICT DO NOTHING;
The ON CONFLICT
clause could prevent these warnings:
https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT
Opened an issue:
No idea if it would work.
These logs have started appearing in database logs on
shards.test
nodes:And the numbers are worrying:
Node version used: