Closed andbuitra closed 3 years ago
@andbuitra Thanks for reporting, can you share config and version? I'll try to reproduce. Sounds like something is missing.
The configuration is pretty straightforward
[Unit]
Description=DNSBL Exporter
StartLimitBurst=5
[Service]
User=root
ExecStart=/root/prometheus-monitoring/dnsbl_exporter/dnsbl_exporter --config.dns-resolver [REDACTED] --config.rbls /root/prometheus-monitoring/config-files/dnsbl_exporter/rbls.ini --config.targets /root/prometheus-monitoring/config-files/dnsbl_exporter/targets.ini
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=default.target
The version used is the latest release
./dnsbl_exporter --version
dnsbl-exporter version 0.4.3
@andbuitra Sorry, I meant rbls.ini and possibly targets.ini. I am assuming something is missing, and I don't handle input correctly.
Hello
The rbls.ini is as follows
[rbl]
server=cbl.abuseat.org
server=bl.deadbeef.com
server=spamtrap.drbl.drand.net
server=spamsources.fabel.dk
server=0spam.fusionzero.com
server=mail-abuse.blacklist.jippg.org
server=dyna.spamrats.com
server=noptr.spamrats.com
server=spam.spamrats.com
server=dnsbl.sorbs.net
server=spam.dnsbl.sorbs.net
server=bl.spamcop.net
server=pbl.spamhaus.org
server=sbl.spamhaus.org
server=xbl.spamhaus.org
server=ubl.unsubscore.com
server=dnsbl-1.uceprotect.net
server=dnsbl-2.uceprotect.net
server=dnsbl-3.uceprotect.net
server=db.wpbl.info
server=access.redhawk.org
server=sbl-xbl.spamhaus.org
server=b.barracudacentral.org
server=dul.dnsbl.sorbs.net
server=http.dnsbl.sorbs.net
server=l1.spews.dnsbl.sorbs.net
server=l2.spews.dnsbl.sorbs.net
server=misc.dnsbl.sorbs.net
server=postmaster.rfc-ignorant.org
server=rbl.spamlab.com
server=rbl.suresupport.com
server=relays.bl.kunden.de
server=smtp.dnsbl.sorbs.net
server=socks.dnsbl.sorbs.net
server=zen.spamhaus.org
server=zombie.dnsbl.sorbs.net
server=truncate.gbudb.net
Targets follow this pattern:
[targets]
server=smtp.example1.com
server=smtp.example2.com
@andbuitra I'll check it on the weekend 🙏🏼
@andbuitra I haven't made much progress. Can you add --log.debug to your systemd unit and see if it uncovers anything? It's a bit noisy, but it would help.
My guess is that it's something in the RBL request and response parsing. Or maybe even in a dependency.
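For context on the request side: a DNSBL lookup is conventionally a DNS query for the IPv4 octets in reverse order, followed by the list's zone. The following is a minimal sketch of that naming step only. The helper name `dnsblQueryName` is hypothetical and this is not taken from dnsbl_exporter's code.

```go
package main

import (
	"fmt"
	"net"
)

// dnsblQueryName builds the DNS name conventionally queried for an IPv4
// address against a DNSBL zone: the octets reversed, followed by the zone.
// Hypothetical helper for illustration; not dnsbl_exporter's implementation.
func dnsblQueryName(ip, zone string) (string, error) {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return "", fmt.Errorf("not an IP address: %q", ip)
	}
	v4 := parsed.To4()
	if v4 == nil {
		return "", fmt.Errorf("not an IPv4 address: %q", ip)
	}
	return fmt.Sprintf("%d.%d.%d.%d.%s", v4[3], v4[2], v4[1], v4[0], zone), nil
}
```

With this sketch, `dnsblQueryName("192.0.2.10", "zen.spamhaus.org")` yields `10.2.0.192.zen.spamhaus.org`, while a hostname such as `smtp.example1.com` is rejected and would have to be resolved to an address first.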
@andbuitra release is here: https://github.com/Luzilla/dnsbl_exporter/releases/tag/0.4.4
@till I completely forgot about this. I will test it in the next couple of days. Thank you!
Yeah, let me know how it goes. I think I'll wait a bit until I merge the updated dependency again. Trying to think what else can be done to track this.
Btw, if you happen to narrow it down to a host/RBL combo, I can write a test confirming it against the upstream dependency and see about fixing it there.
@andbuitra friendly ping. Did you have a chance to take a look?
@andbuitra Do you see this happening still? I am currently prepping for a 0.5.0 release.
Btw, I'd like to include service files. Do you feel like contributing yours? Including its location (a la man here) would be preferred.
@till Apologies, I was on vacation. I haven't been able to test the package yet, but I will now. My systemd unit is simple and loads the config files from a local git repo. The unit is located at /etc/systemd/system/dnsbl_exporter.service, but I have seen other apps like MariaDB put theirs under /usr/lib/... and then reference them. Maybe there's a freedesktop standard for it. The restart clause was added to mitigate the original issue.
I will test the release 0.4.4 and let you know if the issue happens again.
Here is a 0.4.4-next: dnsbl-exporter-linux-amd64-0.4.4-next.zip
If you want to build it yourself, you'll need goreleaser and a clone of this repo: make build.
I kinda just spotted something else.
Sometimes parsing IPs seems to fail. Why, I'm not sure, but if the result is nil, the code panics.
So, I can't figure out why this may happen to begin with, but now it should no longer panic and instead log a message with the "string" that it can't identify as an IP (v4 or v6).
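To illustrate the kind of guard described here: in Go, `net.ParseIP` returns nil for any string it cannot parse, so the result has to be checked before it is used. This is a minimal sketch of that check, assuming a hypothetical helper named `ipFamily`. It is not the actual fix that went into the exporter.

```go
package main

import (
	"fmt"
	"net"
)

// ipFamily reports whether s is an IPv4 or an IPv6 address.
// net.ParseIP returns nil when s is not a valid IP, so that case becomes
// an error carrying the offending string instead of a value used later.
// Hypothetical helper for illustration; not dnsbl_exporter's code.
func ipFamily(s string) (string, error) {
	ip := net.ParseIP(s)
	if ip == nil {
		return "", fmt.Errorf("cannot parse %q as an IP address", s)
	}
	if ip.To4() != nil {
		return "ipv4", nil
	}
	return "ipv6", nil
}
```

A caller can log the returned error and skip the entry, which keeps the process alive and records which input was rejected.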
Latest main branch:
dnsbl_exporter_0.4.4-next_Darwin_arm64.tar.gz dnsbl_exporter_0.4.4-next_Darwin_x86_64.tar.gz
I think this contains an actual fix. So it was not in a dependency, but my use of Go. If I don't hear back from you, I'll release 0.5.0 towards the weekend.
@till I was deploying this but I see this is the Darwin binary and we run Linux on the server. Could you build that latest version for linux x86? I will remove the restarts on my systemd unit so it won't fix itself automatically.
I have installed it now and so far so good. I will report back if the issue shows up again
@till No crashes as of now. I believe that panic was causing the binary to stop. It's been working normally for more than 12 hours without needing a restart.
@andbuitra Thanks for letting me know.
You catch anything in the logs? I am curious what kind of "ip" caused this.
Nothing special shows up. The only error is level=error msg="read udp 127.0.0.1:37474->:0: read: connection refused", which shows up multiple times every minute, but I don't really know what it's about since the monitor works fine (as in, metrics show up correctly).
Maybe you filter udp? DNS uses both (tcp and udp). If you have a local resolver it should respond to both.
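One way to check whether a resolver answers on both transports is to force each one in turn. This is a rough sketch using Go's standard `net.Resolver`. The helper names `resolverVia` and `lookupVia` are hypothetical, the resolver address is a parameter because the real one is redacted above, and none of this is part of dnsbl_exporter.

```go
package main

import (
	"context"
	"net"
	"time"
)

// resolverVia returns a resolver that sends every query to addr (host:port)
// over the given transport, "udp" or "tcp".
// Hypothetical helper for a manual check; not part of dnsbl_exporter.
func resolverVia(transport, addr string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // use Go's built-in resolver so the Dial hook is honored
		Dial: func(ctx context.Context, _, _ string) (net.Conn, error) {
			// Ignore the network and address Go suggests and force our own.
			d := net.Dialer{Timeout: 3 * time.Second}
			return d.DialContext(ctx, transport, addr)
		},
	}
}

// lookupVia resolves host through the resolver at addr over one transport.
func lookupVia(transport, addr, host string) ([]string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return resolverVia(transport, addr).LookupHost(ctx, host)
}
```

Calling `lookupVia("udp", addr, host)` and `lookupVia("tcp", addr, host)` against the same resolver and comparing the errors should show whether only one transport is being refused or filtered.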
It could be filtered by an upstream firewall. The resolver is on a public network and is used throughout the infrastructure. The exporter runs on a server behind a firewall, using it as a gateway, so that could be the reason. However, it's a non-issue since the exporter is working just fine.
The exporter has been running for more than three days now with no issues. I believe this issue can be closed
Ok, good to know! I'll close when I cut a release. I am trying to finish #84 first! :) Thanks again for your time and patience.
@andbuitra I finally released 0.5.0, thanks again for your help and patience. I put your unit into #86. If you have time to contrib a more general unit file, let me know.
Hello,
We deployed dnsbl_exporter on a CentOS 7 machine as a systemd service. It's currently going offline pretty often complaining about memory (either oom or sigsegv). This is the error:
There's plenty of memory available (more than 6 GB), so this shouldn't be an issue. So far I've resorted to configuring auto-restart for the systemd unit. If relevant, the log also shows plenty of these:
There's nothing too special about our config. The only thing is that we load the RBLs and targets (using the proper args with absolute paths) from a folder that is linked to a git repo.