Closed netstx closed 2 years ago
Hi @netstx,
This is definitely not an expected behavior:) According to the logs you posted, your BGPalerter process dies every 10 minutes and systemd restarts it. It dies silently, with no error logs. This is really weird.
Basic checks:
Please share your config file (be careful to remove passwords etc).
In any case, debug steps:
1) please, install this version (it's a pre-release). You just need to replace the binary file.
2) edit your config.yml
and be sure to have a line enableAdvancedRpkiStats: false
here. This is not a solution, it's just to understand if the silent crash is somehow related to this feature. If yes, I will need to understand why.
Thanks.
Thanks @massimocandela!
Below is some more additional info, on my 2nd instance (same exact installation method, OS and issue described above, but using it for testing this problem):
systemctl status bgpalerter - note process memory usage at about 10m uptime, host has 1GB ram
● bgpalerter.service - BGPalerter
Loaded: loaded (/etc/systemd/system/bgpalerter.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2022-08-16 05:38:29 CDT; 10min ago
Main PID: 153278 (bgpalerter-linu)
Tasks: 10 (limit: 1132)
Memory: 557.9M
CPU: 22.352s
CGroup: /system.slice/bgpalerter.service
└─153278 /home/bgpalerter/bgpalerter-linux-x64
Aug 16 05:38:29 vultr systemd[1]: Started BGPalerter.
Aug 16 05:38:29 vultr bgpalerter-linux-x64[153278]: Loaded config: /home/bgpalerter/config.yml
Aug 16 05:38:29 vultr bgpalerter-linux-x64[153278]: BGPalerter, version: 1.30.0 environment: production
Aug 16 05:38:41 vultr bgpalerter-linux-x64[153278]: Monitoring 192.154.3.0/24
Aug 16 05:38:41 vultr bgpalerter-linux-x64[153278]: Monitoring 2a06:a005:ae0::/44
Aug 16 05:38:41 vultr bgpalerter-linux-x64[153278]: Monitoring 2a06:1180:1::/48
Aug 16 05:38:41 vultr bgpalerter-linux-x64[153278]: Monitoring 2a06:c3c0:6::/48
Aug 16 05:38:41 vultr bgpalerter-linux-x64[153278]: Monitoring 2a06:a005:2d8::/48
Aug 16 05:38:41 vultr bgpalerter-linux-x64[153278]: Monitoring 2604:2980:f603::/48
Aug 16 05:38:41 vultr bgpalerter-linux-x64[153278]: Monitoring AS394414
directory and files
bgpalerter@vultr:~$ pwd
/home/bgpalerter
bgpalerter@vultr:~$ ls -lhas
total 62M
4.0K drwxr-xr-x 6 bgpalerter bgpalerter 4.0K Aug 15 18:31 .
4.0K drwxr-xr-x 3 root root 4.0K Aug 14 21:36 ..
4.0K -rw------- 1 bgpalerter bgpalerter 1.9K Aug 15 18:30 .bash_history
4.0K -rw-r--r-- 1 bgpalerter bgpalerter 220 Aug 14 21:36 .bash_logout
4.0K -rw-r--r-- 1 bgpalerter bgpalerter 3.5K Aug 14 21:36 .bashrc
62M -rwxr-xr-x 1 bgpalerter bgpalerter 62M Aug 9 04:38 bgpalerter-linux-x64
4.0K drwxr-xr-x 2 bgpalerter bgpalerter 4.0K Aug 15 18:35 .cache
8.0K -rw-r--r-- 1 bgpalerter bgpalerter 6.0K Aug 15 18:30 config.yml
4.0K drwxr-xr-x 3 bgpalerter bgpalerter 4.0K Aug 14 21:38 .local
4.0K drwxr-xr-x 2 bgpalerter bgpalerter 4.0K Aug 16 05:49 logs
4.0K -rw-r--r-- 1 bgpalerter bgpalerter 1.2K Aug 15 15:32 prefixes.yml
4.0K -rw-r--r-- 1 bgpalerter bgpalerter 807 Aug 14 21:36 .profile
4.0K drwxr-xr-x 3 bgpalerter bgpalerter 4.0K Aug 14 21:45 src
4.0K -rw-r--r-- 1 bgpalerter bgpalerter 165 Aug 14 21:36 .wget-hsts
bgpalerter@vultr:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 472M 0 472M 0% /dev
tmpfs 98M 668K 97M 1% /run
/dev/vda1 24G 5.0G 18G 23% /
tmpfs 489M 0 489M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 98M 0 98M 0% /run/user/0
config
bgpalerter@vultr:~$ cat config.yml
connectors:
- file: connectorRIS
name: ris
params:
carefulSubscription: true
url: ws://ris-live.ripe.net/v1/ws/
perMessageDeflate: true
authorizationHeader: null
subscription:
moreSpecific: true
type: UPDATE
host:
socketOptions:
includeRaw: false
# - file: connectorRISDump
# name: dmp
monitors:
- file: monitorHijack
channel: hijack
name: basic-hijack-detection
params:
thresholdMinPeers: 3
- file: monitorNewPrefix
channel: newprefix
name: prefix-detection
params:
thresholdMinPeers: 3
- file: monitorPath
channel: path
name: path-matching
params:
thresholdMinPeers: 1
- file: monitorVisibility
channel: visibility
name: withdrawal-detection
params:
thresholdMinPeers: 40
- file: monitorAS
channel: misconfiguration
name: asn-monitor
params:
thresholdMinPeers: 3
- file: monitorRPKI
channel: rpki
name: rpki-monitor
params:
thresholdMinPeers: 3
checkUncovered: false
checkDisappearing: false
- file: monitorROAS
channel: roa
name: rpki-diff
params:
enableDiffAlerts: true
enableExpirationAlerts: true
enableExpirationCheckTA: true
enableDeletedCheckTA: true
roaExpirationAlertHours: 2
checkOnlyASns: true
toleranceDeletedRoasTA: 20
toleranceExpiredRoasTA: 20
- file: monitorPathNeighbors
channel: path
name: path-neighbors
params:
thresholdMinPeers: 3
reports:
- file: reportFile
channels:
- hijack
- newprefix
- visibility
- path
- misconfiguration
- rpki
- roa
params:
persistAlertData: false
alertDataDirectory: alertdata/
- file: reportEmail
channels:
- hijack
- newprefix
- visibility
- path
- misconfiguration
- rpki
- roa
params:
showPaths: 10 # Amount of AS_PATHs to report in the alert
senderEmail: rafael@thinkpad.io
smtp:
host: smtp.server.org
port: 465
secure: true # If true the connection will use TLS when connecting to server. If false it will be still possible doing connection upgrade via STARTTLS
ignoreTLS: false # If true TLS will be completely disabled, including STARTTLS. Set this to true if you see certificate errors in the logs.
auth:
user: user@email.com
pass: password
type: login
tls:
rejectUnauthorized: true # Reject unauthorized certificates
notifiedEmails:
default:
- bgpalerts@email2.net
noc:
- bgpalerts@email2.net
############################
# Notification settings:
# - notificationIntervalSeconds
# Defines the amount of seconds after which an alert can be repeated. An alert is repeated only if the event that
# triggered it is not yet solved.
# - persistStatus
# Persist the status of BGPalerter. If the process is restarted, the list of alerts already sent is recovered
# and they are not repeated. The process must be able to write on disc, this option will create a file inside .cache/
notificationIntervalSeconds: 3600
persistStatus: true
############################
# REST API settings:
# - rest.host
# The IP address the server will listen. The default value is localhost, this means the API will not be reachable
# from another host. To make it public use null or 0.0.0.0.
# - rest.port
# The port the server will listen
rest:
host: localhost
port: 8011
logging:
directory: logs
logRotatePattern: MM-DD-YYYY
maxRetainedFiles: 10
maxFileSizeMB: 15
compressOnRotation: false
useUTC: false
checkForUpdatesAtBoot: false
generatePrefixListEveryDays: 1
############################
# Process monitoring settings:
# Uncomment or add classes under processMonitors if you want to monitor or send logs about the status of the BGPalerter process
processMonitors:
# - file: uptimeApi
# params:
# useStatusCodes: true
#
- file: uptimeHealthcheck
params:
url: https://hc-ping.com/9df2b8a9-d2bf-4894-ad68-a849ce8de3e0
intervalSeconds: 60
method: get
#
# - file: sentryModule
# params:
# dsn: https://<key>@sentry.io/<project>
############################
# The files containing the monitored prefixes. Please see prefixes.yml for an example.
# This is an array (use new lines and dashes!)
monitoredPrefixesFiles:
- prefixes.yml
############################
# The file containing the user groups.
# User groups can be specified
# 1) directly above, in each report module; or
# 2) inside an external file, as specified below (disabled by default).
# Using an external file allows BGPalerter to auto-reload the user group definitions
# when the external file is changed.
# groupsFile: groups.yml.example
############################
# HTTP proxy setting:
# Allow to run BGPalerter behind an HTTP/HTTPS proxy.
# You can also specify which module can bypass the proxy.
# More information here: https://github.com/nttgin/BGPalerter/blob/main/docs/http-proxy.md
# httpProxy: http://username:password@127.0.0.1:9000
############################
# RPKI settings:
# Global RPKI settings shared across all monitors requiring RPKI data
# More information here: https://github.com/nttgin/BGPalerter/blob/main/docs/rpki.md
#
# To enable ROA expiration alerts, you need VRPs including expiration timestamps.
# "rpkiclient" is the default vrpProvider since is the only one supporting ROAs expiration data.
rpki:
vrpProvider: rpkiclient
preCacheROAs: true
refreshVrpListMinutes: 15
markDataAsStaleAfterMinutes: 120
############################
# Advanced settings (Don't touch here!)
# Please, refer to the documentation to know the meaning of the following parameters.
alertOnlyOnce: false
fadeOffSeconds: 360
checkFadeOffGroupsSeconds: 30
pidFile: bgpalerter.pid
maxMessagesPerSecond: 6000
multiProcess: false
environment: production
configVersion: 2
prefixes.yml
bgpalerter@vultr:~$ cat prefixes.yml
192.154.3.0/24:
description: No description provided
asn:
- 394414
ignoreMorespecifics: false
ignore: false
group: noc
2604:2980:f603::/48:
description: No description provided (No ROA available)
asn:
- 394414
ignoreMorespecifics: false
ignore: false
group: noc
2a06:a005:2d8::/48:
description: No description provided
asn:
- 394414
ignoreMorespecifics: false
ignore: false
group: noc
2a06:a005:ae0::/44:
description: No description provided
asn:
- 394414
ignoreMorespecifics: false
ignore: false
group: noc
2a06:c3c0:6::/48:
description: No description provided
asn:
- 394414
ignoreMorespecifics: false
ignore: false
group: noc
2a06:1180:1::/48:
description: No description provided
asn:
- 394414
ignoreMorespecifics: false
ignore: false
group: noc
options:
monitorASns:
'394414':
group: noc
upstreams:
- 32621
- 44570
- 50058
- 62943
- 6939
downstreams:
- 968
generate:
exclude: []
excludeDelegated: true
monitoredASes:
- '394414'
historical: false
group: noc
asnList:
- '394414'
I don't see a directory 'alertdata' created when running the binary for the first time... Is it required? It's on the config.
There's nothing else installed on this VM (as well as on the container that's running my production alerter for AS62943). Essentially it's a fresh Debian 11 install with just some apt packages (wget, curl, whois, net-tools, dnsutils, mtr-tiny, etc).
I am going to perform the 2 debug steps you shared above, and reply back here in an hour or two with updates. Thanks again, really appreciate it!
EDIT: the pre-release version shows same version when running with flag -v:
bgpalerter@vultr:~$ ./bgpalerter-linux-x64 -v
1.30.0
bgpalerter@vultr:~$ ./bgpalerter-linux-x64-old -v
1.30.0
Hi @massimocandela :)
Here is an update, with the pre-release binary and the config change you suggested, looks like the process survives beyond 10 minutes:
Every 1.0s: systemctl status bgpalerter.service vultr: Tue Aug 16 06:36:35 2022
● bgpalerter.service - BGPalerter
Loaded: loaded (/etc/systemd/system/bgpalerter.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2022-08-16 06:06:52 CDT; 29min ago
Main PID: 155512 (bgpalerter-linu)
Tasks: 10 (limit: 1132)
Memory: 203.4M
CPU: 44.495s
CGroup: /system.slice/bgpalerter.service
└─155512 /home/bgpalerter/bgpalerter-linux-x64
Aug 16 06:06:52 vultr systemd[1]: Started BGPalerter.
Aug 16 06:06:53 vultr bgpalerter-linux-x64[155512]: Loaded config: /home/bgpalerter/config.yml
Aug 16 06:06:53 vultr bgpalerter-linux-x64[155512]: BGPalerter, version: 1.30.0 environment: production
Aug 16 06:07:05 vultr bgpalerter-linux-x64[155512]: Monitoring 192.154.3.0/24
Aug 16 06:07:05 vultr bgpalerter-linux-x64[155512]: Monitoring 2a06:a005:ae0::/44
Aug 16 06:07:05 vultr bgpalerter-linux-x64[155512]: Monitoring 2a06:1180:1::/48
Aug 16 06:07:05 vultr bgpalerter-linux-x64[155512]: Monitoring 2a06:c3c0:6::/48
Aug 16 06:07:05 vultr bgpalerter-linux-x64[155512]: Monitoring 2a06:a005:2d8::/48
Aug 16 06:07:05 vultr bgpalerter-linux-x64[155512]: Monitoring 2604:2980:f603::/48
Aug 16 06:07:05 vultr bgpalerter-linux-x64[155512]: Monitoring AS394414
I am going to test the stable version with the same config change, and see if there are any different results. I'll report back shortly.
EDIT: the pre-release version shows same version when running with flag -v:
bgpalerter@vultr:~$ ./bgpalerter-linux-x64 -v 1.30.0 bgpalerter@vultr:~$ ./bgpalerter-linux-x64-old -v 1.30.0
Opssss, download the binary again now, it should not change anything (I think only the release number bump was missing), but just in case...
Thanks for doing the test. Please, check also with the new binary if it crashes when you set enableAdvancedRpkiStats
to true.
No problem!
Leaving the config change in place, but using the current stable binary, I am still seeing the process restart every 10min.
I am going to download the pre-release 1.30.1 again, and test it with enableAdvancedRpkiStats
to true.
EDIT: the version is now fixed since you reissued the binary :)
bgpalerter@vultr:~$ ./bgpalerter-linux-x64 -v
1.30.1
I am back @massimocandela - Looks like setting enableAdvancedRpkiStats
to true with the pre-release 1.30.1 version causes the restarts at every 10m, where as with it set to false seems to get rid of this symptom.
Checking the error.log file while set to true, I don't see anything "new":
2022-08-16T06:59:54-05:00 error: RIPE RIS disconnected (error: process termination). Read more at https://github.com/nttgin/BGPalerter/blob/main/docs/ris-disconnections.md
2022-08-16T07:01:41-05:00 info: ris connector connected (instance:fe37007c-5254-4eb1-8ba2-5e481d609a05 connection:67aa60f7-2da6-436f-9982-bfebf4d4c9f9)
2022-08-16T07:01:41-05:00 info: Subscribed to monitored resources
2022-08-16T07:01:41-05:00 info: Subscribed to beacons
2022-08-16T07:12:27-05:00 info: ris connector connected (instance:26a806a2-0f72-4473-b340-74ab2aee06cf connection:0eed3552-7a62-4e72-ac57-1340c41a3e0c)
2022-08-16T07:12:27-05:00 info: Subscribed to monitored resources
2022-08-16T07:12:27-05:00 info: Subscribed to beacons
...but, when checking /var/log/daemon.log I do see more information:
Aug 16 11:39:11 vultr systemd[1]: Started BGPalerter.
Aug 16 11:39:12 vultr bgpalerter-linux-x64[160961]: Loaded config: /home/bgpalerter/config.yml
Aug 16 11:39:12 vultr bgpalerter-linux-x64[160961]: BGPalerter, version: 1.30.0 environment: production
Aug 16 11:39:22 vultr bgpalerter-linux-x64[160961]: Monitoring 192.154.3.0/24
Aug 16 11:39:22 vultr bgpalerter-linux-x64[160961]: Monitoring 2a06:a005:ae0::/44
Aug 16 11:39:22 vultr bgpalerter-linux-x64[160961]: Monitoring 2a06:1180:1::/48
Aug 16 11:39:22 vultr bgpalerter-linux-x64[160961]: Monitoring 2a06:c3c0:6::/48
Aug 16 11:39:22 vultr bgpalerter-linux-x64[160961]: Monitoring 2a06:a005:2d8::/48
Aug 16 11:39:22 vultr bgpalerter-linux-x64[160961]: Monitoring 2604:2980:f603::/48
Aug 16 11:39:22 vultr bgpalerter-linux-x64[160961]: Monitoring AS394414
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: <--- Last few GCs --->
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: [160961:0x3074490] 611643 ms: Mark-sweep (reduce) 486.5 (491.7) -> 485.1 (492.5) MB, 557.1 / 0.0 ms (+ 65.4 ms in 28 steps since start of marking, biggest step 17.2 ms, walltime since start of marking 735 ms) (average mu = 0.299, current mu = 0.245) al
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: <--- JS stacktrace --->
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 1: 0x8e08f0 node::Abort() [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 2: 0x8297f7 node::FatalError(char const*, char const*) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 3: 0x9f48d6 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 4: 0x9f4b6d v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 5: 0xb96c35 [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 6: 0xb97687 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 7: 0xba0208 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 8: 0xba3830 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 9: 0x16b634d [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 10: 0xb679d3 v8::internal::FactoryBase<v8::internal::Factory>::NewRawOneByteString(int, v8::internal::AllocationType) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 11: 0xc7f147 v8::internal::JsonParser<unsigned char>::MakeString(v8::internal::JsonString const&, v8::internal::Handle<v8::internal::String>) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 12: 0xc81a82 v8::internal::JsonParser<unsigned char>::ParseJsonValue() [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 13: 0xc862a9 v8::internal::JsonParser<unsigned char>::Parse(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::Object>) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 14: 0xa9ad7b v8::internal::Builtin_JsonParse(int, unsigned long*, v8::internal::Isolate*) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr bgpalerter-linux-x64[160961]: 15: 0x176dd19 [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 11:49:24 vultr systemd[1]: bgpalerter.service: Main process exited, code=killed, status=6/ABRT
Aug 16 11:49:24 vultr systemd[1]: bgpalerter.service: Failed with result 'signal'.
Aug 16 11:49:24 vultr systemd[1]: bgpalerter.service: Consumed 29.348s CPU time.
Aug 16 11:49:54 vultr systemd[1]: bgpalerter.service: Scheduled restart job, restart counter is at 1.
Aug 16 11:49:54 vultr systemd[1]: Stopped BGPalerter.
Aug 16 11:49:54 vultr systemd[1]: bgpalerter.service: Consumed 29.348s CPU time.
Aug 16 11:49:54 vultr systemd[1]: Started BGPalerter.
Aug 16 11:49:54 vultr bgpalerter-linux-x64[162872]: Loaded config: /home/bgpalerter/config.yml
Aug 16 11:49:55 vultr bgpalerter-linux-x64[162872]: BGPalerter, version: 1.30.0 environment: production
Aug 16 11:50:08 vultr bgpalerter-linux-x64[162872]: Monitoring 192.154.3.0/24
Aug 16 11:50:08 vultr bgpalerter-linux-x64[162872]: Monitoring 2a06:a005:ae0::/44
Aug 16 11:50:08 vultr bgpalerter-linux-x64[162872]: Monitoring 2a06:1180:1::/48
Aug 16 11:50:08 vultr bgpalerter-linux-x64[162872]: Monitoring 2a06:c3c0:6::/48
Aug 16 11:50:08 vultr bgpalerter-linux-x64[162872]: Monitoring 2a06:a005:2d8::/48
Aug 16 11:50:08 vultr bgpalerter-linux-x64[162872]: Monitoring 2604:2980:f603::/48
Aug 16 11:50:08 vultr bgpalerter-linux-x64[162872]: Monitoring AS394414
Aug 16 11:59:54 vultr systemd[1]: Stopping BGPalerter...
Aug 16 11:59:54 vultr systemd[1]: bgpalerter.service: Succeeded.
Aug 16 11:59:54 vultr systemd[1]: Stopped BGPalerter.
Aug 16 11:59:54 vultr systemd[1]: bgpalerter.service: Consumed 18.565s CPU time.
Aug 16 12:01:25 vultr systemd[1]: Started BGPalerter.
Aug 16 12:01:25 vultr bgpalerter-linux-x64[164664]: Loaded config: /home/bgpalerter/config.yml
Aug 16 12:01:25 vultr bgpalerter-linux-x64[164664]: BGPalerter, version: 1.30.1 environment: production
Aug 16 12:01:41 vultr bgpalerter-linux-x64[164664]: Monitoring 192.154.3.0/24
Aug 16 12:01:41 vultr bgpalerter-linux-x64[164664]: Monitoring 2a06:a005:ae0::/44
Aug 16 12:01:41 vultr bgpalerter-linux-x64[164664]: Monitoring 2a06:1180:1::/48
Aug 16 12:01:41 vultr bgpalerter-linux-x64[164664]: Monitoring 2a06:c3c0:6::/48
Aug 16 12:01:41 vultr bgpalerter-linux-x64[164664]: Monitoring 2a06:a005:2d8::/48
Aug 16 12:01:41 vultr bgpalerter-linux-x64[164664]: Monitoring 2604:2980:f603::/48
Aug 16 12:01:41 vultr bgpalerter-linux-x64[164664]: Monitoring AS394414
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: <--- Last few GCs --->
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: [164664:0x348b490] 616638 ms: Mark-sweep (reduce) 485.9 (491.4) -> 484.2 (491.9) MB, 674.4 / 0.0 ms (+ 0.0 ms in 4 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 687 ms) (average mu = 0.256, current mu = 0.140) alloc[164664:0x348b490] 617450 ms: Mark-sweep (reduce) 485.5 (491.9) -> 485.1 (492.7) MB, 779.1 / 0.0 ms (average mu = 0.153, current mu = 0.041) allocation failure scavenge might not succeed
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: <--- JS stacktrace --->
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 1: 0x8e08f0 node::Abort() [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 2: 0x8297f7 node::FatalError(char const*, char const*) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 3: 0x9f48d6 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 4: 0x9f4b6d v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 5: 0xb96c35 [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 6: 0xb97687 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 7: 0xba0208 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 8: 0xba3830 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 9: 0xb6d072 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 10: 0xe71fee v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr bgpalerter-linux-x64[164664]: 11: 0x176dc39 [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:11:43 vultr systemd[1]: bgpalerter.service: Main process exited, code=killed, status=6/ABRT
Aug 16 12:11:43 vultr systemd[1]: bgpalerter.service: Failed with result 'signal'.
Aug 16 12:11:43 vultr systemd[1]: bgpalerter.service: Consumed 32.830s CPU time.
Aug 16 12:12:13 vultr systemd[1]: bgpalerter.service: Scheduled restart job, restart counter is at 1.
Aug 16 12:12:13 vultr systemd[1]: Stopped BGPalerter.
Aug 16 12:12:13 vultr systemd[1]: bgpalerter.service: Consumed 32.830s CPU time.
Aug 16 12:12:13 vultr systemd[1]: Started BGPalerter.
Aug 16 12:12:14 vultr bgpalerter-linux-x64[166589]: Loaded config: /home/bgpalerter/config.yml
Aug 16 12:12:14 vultr bgpalerter-linux-x64[166589]: BGPalerter, version: 1.30.1 environment: production
Aug 16 12:12:27 vultr bgpalerter-linux-x64[166589]: Monitoring 192.154.3.0/24
Aug 16 12:12:27 vultr bgpalerter-linux-x64[166589]: Monitoring 2a06:a005:ae0::/44
Aug 16 12:12:27 vultr bgpalerter-linux-x64[166589]: Monitoring 2a06:1180:1::/48
Aug 16 12:12:27 vultr bgpalerter-linux-x64[166589]: Monitoring 2a06:c3c0:6::/48
Aug 16 12:12:27 vultr bgpalerter-linux-x64[166589]: Monitoring 2a06:a005:2d8::/48
Aug 16 12:12:27 vultr bgpalerter-linux-x64[166589]: Monitoring 2604:2980:f603::/48
Aug 16 12:12:27 vultr bgpalerter-linux-x64[166589]: Monitoring AS394414
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: <--- Last few GCs --->
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: [166589:0x4701490] 611807 ms: Scavenge 485.1 (490.4) -> 483.9 (490.7) MB, 2.0 / 0.0 ms (average mu = 0.322, current mu = 0.307) allocation failure
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: [166589:0x4701490] 612552 ms: Mark-sweep 485.6 (490.9) -> 483.5 (491.4) MB, 737.4 / 0.0 ms (average mu = 0.233, current mu = 0.135) allocation failure scavenge might not succeed
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: <--- JS stacktrace --->
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 1: 0x8e08f0 node::Abort() [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 2: 0x8297f7 node::FatalError(char const*, char const*) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 3: 0x9f48d6 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 4: 0x9f4b6d v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 5: 0xb96c35 [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 6: 0xb97687 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 7: 0xba0208 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 8: 0xba3830 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 9: 0xb6d072 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 10: 0xe71fee v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr bgpalerter-linux-x64[166589]: 11: 0x176dc39 [/home/bgpalerter/bgpalerter-linux-x64]
Aug 16 12:22:27 vultr systemd[1]: bgpalerter.service: Main process exited, code=killed, status=6/ABRT
Aug 16 12:22:27 vultr systemd[1]: bgpalerter.service: Failed with result 'signal'.
Aug 16 12:22:27 vultr systemd[1]: bgpalerter.service: Consumed 31.753s CPU time.
Aug 16 12:22:57 vultr systemd[1]: bgpalerter.service: Scheduled restart job, restart counter is at 2.
Aug 16 12:22:57 vultr systemd[1]: Stopped BGPalerter.
Aug 16 12:22:57 vultr systemd[1]: bgpalerter.service: Consumed 31.753s CPU time.
Aug 16 12:22:57 vultr systemd[1]: Started BGPalerter.
Aug 16 12:22:58 vultr bgpalerter-linux-x64[168077]: Loaded config: /home/bgpalerter/config.yml
Aug 16 12:22:58 vultr bgpalerter-linux-x64[168077]: BGPalerter, version: 1.30.1 environment: production
Aug 16 12:23:12 vultr bgpalerter-linux-x64[168077]: Monitoring 192.154.3.0/24
Aug 16 12:23:12 vultr bgpalerter-linux-x64[168077]: Monitoring 2a06:a005:ae0::/44
Aug 16 12:23:12 vultr bgpalerter-linux-x64[168077]: Monitoring 2a06:1180:1::/48
Aug 16 12:23:12 vultr bgpalerter-linux-x64[168077]: Monitoring 2a06:c3c0:6::/48
Aug 16 12:23:12 vultr bgpalerter-linux-x64[168077]: Monitoring 2a06:a005:2d8::/48
Aug 16 12:23:12 vultr bgpalerter-linux-x64[168077]: Monitoring 2604:2980:f603::/48
Aug 16 12:23:12 vultr bgpalerter-linux-x64[168077]: Monitoring AS394414
Since I noticed memory consumption spikes at the end of the 10m run, I wonder if there's a memory allocation issue of some sort? Not sure if these are correlated, just a wild guess.
The memory allocation is correlated. The enableAdvancedRpkiStats
is a memory intensive option. I had the suspect it was this feature because it triggers some stats every 10 minutes.
Does it happen only when started with systemd?
edit: does it happen only on vultr? or also locally/other cloud provider?
note process memory usage at about 10m uptime, host has 1GB ram
I may have skipped this! 1Gb ram is not enough. Possibly you have in the system 500Mb free, the old version of BGPalerter required 1~1.2Gb, the new one requires 1.5Gb when enableAdvancedRpkiStats
is true. So I would install it on a 4Gb vm. If you set enableAdvancedRpkiStats
to false, with some swap it will work with 1Gb.
It's a good topic tho, I need to write some requirements on the main page :)
The same issue was replicated on my LXC container (debian11) running on Proxmox, so I don't think it is just on Vultr.
Here are some stats of the container, showing some memory utilization spikes every so often (coinciding with CPU and bandwidth): https://i.imgur.com/KMI5Mvf.png
note process memory usage at about 10m uptime, host has 1GB ram
I may have skipped this! 1Gb ram is not enough. Possibly you have in the system 500Mb free, the old version of BGPalerter required 1~1.2Gb, the new one requires 1.5Gb when
enableAdvancedRpkiStats
is true. So I would install it on a 4Gb vm. If you setenableAdvancedRpkiStats
to false, with some swap it will work with 1Gb.It's a good topic tho, I need to write some requirements on the main page :)
Ah, I can also try with more RAM :) I will keep you updated!
Update: @massimocandela I've increased the production container RAM to 2GB (running stable 1.3.0) and set enableAdvancedRpkiStats
to false. It has been running for almost 3 hours now. This seems to have solved the issue :)
EDIT: perhaps the enableAdvancedRpkiStats
could be set to false by default if auto-generated, but with a comment suggesting additional RAM is needed? Also, VM resources seem more stable since then: https://i.imgur.com/R7ApGHU.png
Ah, I can also try with more RAM :) I will keep you updated!
In the meanwhile, I'm doing some additional tests. 2GB should be enough to run everything without features disabled. The numbers I gave you before refer to virtual memory (recorded values for physical memory are much smaller, ~800MB). These values are from our monitoring for the entire NTT network. I think is a reasonable amount.
Update: @massimocandela I've increased the production container RAM to 2GB (running stable 1.3.0) and set
enableAdvancedRpkiStats
to false. It has been running for almost 3 hours now. This seems to have solved the issue :)
Could you try with enableAdvancedRpkiStats
enabled? If you are dedicating the VM to BGPalerter, and there are no other processes using serious memory, you should be able to run it.
EDIT: perhaps the
enableAdvancedRpkiStats
could be set to false by default if auto-generated, but with a comment suggesting additional RAM is needed?
I thought about it, but better not. I prefer to default to the safest configuration possible, and provide the most accurate and actionable alerts possible. It's better if people disable what they do not want than to enable what they probably don't even know it exists (up to when something happens and they wonder why they didn't get an alert).
Also, VM resources seem more stable since then: https://i.imgur.com/R7ApGHU.png
Yes, because during boot, BGPalerter does a bunch of initial checks and indexing. By restarting the process every 10 minutes everything was repeated over and over.
Appreciate your thorough responses! Definitely understand your points.
Could you try with
enableAdvancedRpkiStats
enabled? If you are dedicating the VM to BGPalerter, and there are no other processes using serious memory, you should be able to run it.
Yes, no problem, I'll report back.
Just wanted to note I ran into this same issue. Was running on a t2.micro instance in AWS. It would lock up due to memory usage. Switched to a t2.medium and it's fine now.
I would also suggest some way to detect lower than recommended RAM in the logs at startup. Something like "WARNING: System RAM lower than recommended, please see $url" and point to the new requirements page.
On a 2GB RAM VM with enableAdvancedRpkiStats: false
monitoring 15 something prefixes leads to periodic crashes, either at after 2 hours or after 4 hours of runtime.
I'm assuming this is expected behavior, but it still sounds like a lot for monitoring just a few prefixes, also it's strange that it crashes quasi exactly at 2 or 4 hours of runtime.
Hi @lukastribus, this is not the expected behavior. 2GB are more than enough, especially with enableAdvancedRpkiStats: false
. What log/error.log and /var/log/daemon.log say? What your usage of RAM? What OS and BGPalerter version? Are you sure it's a crash and not the process getting killed?
It's Ubuntu 20.04.5 LTS, "free -m" shows > 1300MB of available memory.
From the journalctl output I can see node calling abort() due to memory allocation failure. The strange thing is the fixed 2 or 4 hours interval. I'm sending you configuration and full logs via email.
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: <--- Last few GCs --->
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: [274925:0x2bdd490] 7204014 ms: Scavenge 761.4 (781.4) -> 761.4 (781.4) MB, 24.2 / 0.0 ms (average mu = 0.997, current mu = 0.999) allocation failure
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: [274925:0x2bdd490] 7204641 ms: Mark-sweep 1016.4 (1036.5) -> 908.7 (928.1) MB, 168.9 / 0.0 ms (+ 37.0 ms in 4 steps since start of marking, biggest step 18.6 ms, walltime since start of marking 1543 ms) (average mu = 0.993, current mu = 0.894) allocatio
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: <--- JS stacktrace --->
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 1: 0x8e08f0 node::Abort() [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 2: 0x8297f7 node::FatalError(char const*, char const*) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 3: 0x9f48d6 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 4: 0x9f4b6d v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 5: 0xb96c35 [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 6: 0xbb683e [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 7: 0xbd363f v8::internal::FullEvacuator::RawEvacuatePage(v8::internal::MemoryChunk*, long*) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 8: 0xbcdb6a v8::internal::Evacuator::EvacuatePage(v8::internal::MemoryChunk*) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 9: 0xbd37a8 [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 10: 0xbb3689 v8::internal::ItemParallelJob::Run() [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 11: 0xbd4a23 v8::internal::MarkCompactCollector::EvacuatePagesInParallel() [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 12: 0xbd52c6 v8::internal::MarkCompactCollector::Evacuate() [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 13: 0xbb7d09 v8::internal::MarkCompactCollector::CollectGarbage() [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 14: 0xb9f3dc v8::internal::Heap::MarkCompact() [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 15: 0xba0628 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 16: 0xba3830 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 17: 0x16b634d [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 18: 0xb67d42 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawArray(int, v8::internal::AllocationType) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 19: 0xb67e80 v8::internal::FactoryBase<v8::internal::Factory>::NewFixedArrayWithFiller(v8::internal::Handle<v8::internal::Map>, int, v8::internal::Handle<v8::internal::Oddball>, v8::internal::AllocationType) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 20: 0xcf79ec [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 21: 0xe45a9a v8::internal::Runtime_GrowArrayElements(int, unsigned long*, v8::internal::Isolate*) [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:21:59 htznr2 bgpalerter-linux-x64[274925]: 22: 0x176dc39 [/home/bgpalerter/bgpalerter-linux-x64]
Nov 24 16:22:00 htznr2 systemd[1]: bgpalerter.service: Main process exited, code=dumped, status=6/ABRT
Thanks
I am not sure if this is expected behavior, so figured I'd open an issue.
Description When running as a service with systemd, the bgpalerter process will restart on its own every ~10 minutes.
Systemd service config and log example
Expected behavior Process to remain running unless stopped.
Are you using the binary or the source code? linux x64 binary on Debian 11
Your information AS62943