nsqio / nsq

A realtime distributed messaging platform
https://nsq.io
MIT License
24.72k stars, 2.89k forks

NSQd diskqueue not recovered on restart #1459

Closed TheArKaID closed 10 months ago

TheArKaID commented 10 months ago

Why didn't my nsqd recover the messages that were still in progress after it restarted? I run nsqd on Ubuntu with something like:

nsqd --lookupd-tcp-address=127.0.0.1:4160 --broadcast-address=192.168.190.22

When nsqd restarted, the previous messages were gone. I noticed there are diskqueue .dat files, which I think contain the messages that were removed.

My question is: why doesn't nsqd recover the messages automatically, and how can I make it do so? I can't find any information in the docs. Or should it recover automatically with the nsqd command above?


nsq v1.2.1, Ubuntu 20.04.3

jehiah commented 10 months ago

NSQ buffers messages to those diskqueues but they are append only and not truncated until it rolls over to the next file.

That means the presence of a file doesn't indicate a) that there were messages pending to be delivered at the time nsqd was stopped, or b) that any pending messages at startup were not delivered.
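Since the files themselves say nothing about what is pending, one way to check is to ask the running nsqd. The sketch below is an illustration rather than an official tool: it assumes the default HTTP port 4151 and the v1.x shape of the `/stats?format=json` response (a top-level `topics` list whose `channels` carry `depth`, `in_flight_count` and `deferred_count`). The function names `pending_by_channel` and `fetch_stats` are made up for this example.

```python
import json
from urllib.request import urlopen


def pending_by_channel(stats):
    """Map (topic, channel) to the number of messages not yet finished.

    `stats` is the parsed body of nsqd's /stats?format=json endpoint
    (assumed v1.x layout; missing fields are treated as 0).
    """
    pending = {}
    for topic in stats.get("topics", []):
        for channel in topic.get("channels", []):
            count = (
                channel.get("depth", 0)              # queued in memory or on disk
                + channel.get("in_flight_count", 0)  # sent to a consumer, not yet finished
                + channel.get("deferred_count", 0)   # requeued with a delay
            )
            pending[(topic["topic_name"], channel["channel_name"])] = count
    return pending


def fetch_stats(base_url="http://127.0.0.1:4151"):
    """Fetch and parse the stats document from a running nsqd (assumed address)."""
    with urlopen(f"{base_url}/stats?format=json", timeout=5) as resp:
        return json.load(resp)
```

Calling `pending_by_channel(fetch_stats())` just before stopping nsqd and again after it comes back up gives a direct comparison. Only channels are covered here; a topic with no channels keeps its backlog in the topic-level `depth` field instead.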

TheArKaID commented 10 months ago

Hmm... now I somehow got a file.bad. Maybe I misunderstood how it works, but:

1) If there are pending messages in a topic and nsqd suddenly stops, will they be recovered on restart?
2) For every message that has been sent, will it lose its statistics (such as requeued, timed-out, or message counts)?
3) I've read some issues stating that running nsq with nsqd & will stop nsqd some time after the user logs out, and that I need a process manager or nohup to make it ignore the SIGHUP signal. Is there an official way to run nsqd in this case? I've tried Docker for nsq, but there's a reason on my side not to use it.

Thanks.

jehiah commented 10 months ago

If there are pending messages in a topic and nsqd suddenly stops, will they be recovered on restart?

Yes, assuming nsqd is stopped gracefully. Otherwise see https://nsq.io/overview/features_and_guarantees.html#guarantees
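For anyone launching nsqd from their own script, a graceful stop could look roughly like the sketch below. It assumes that nsqd treats SIGTERM as a request for a clean shutdown, and the helper name `stop_gracefully` is invented for this example. It never escalates to SIGKILL, since a killed process gets no chance to flush anything.

```python
import signal
import subprocess


def stop_gracefully(proc, timeout=30.0):
    """Send SIGTERM to `proc` and wait for it to exit.

    Returns the exit code, or None if the process is still running after
    `timeout` seconds. SIGKILL is deliberately never sent.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
```

Here `proc` would be the `subprocess.Popen` object returned when the script started nsqd, for example with the command line from the first post. A `None` result means the process needed longer than the timeout, in which case waiting again is safer than killing it.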

For every message that has been sent, will it lose its statistics (such as requeued, timed-out, or message counts)?

Stats are runtime telemetry; they are not persistent. If you want persistent statistics, see the statsd integration: https://nsq.io/components/nsqd.html#get-stats https://nsq.io/components/nsqadmin.html#statsd--graphite-integration

running nsq with nsqd &

If you want a long-running service, yes, you should use a service manager (systemd, daemontools, etc.).

TheArKaID commented 10 months ago

Thanks for the straight-to-the-point answer.