Closed InsaneSplash closed 1 year ago
I uploaded a small patch. I don't think it's going to solve the problem, but you might as well try it.
Are you using --output.roa?
If you enable it, do you get a slightly different error message?
Can you please post your fort command, with flags (and configuration file, if applicable) included?
Hey, sorry for the late reply..... another instance just crashed.
Command Line:
/usr/bin/fort --configuration-file /etc/fort/config.json
Config file:
{
    "tal": "/etc/fort/tal",
    "local-repository": "/var/lib/fort/repository",
    "slurm": "/etc/fort/slurm",
    "server": {
        "port": "3323",
        "interval": {
            "validation": 3600,
            "refresh": 3600,
            "retry": 600,
            "expire": 7200
        }
    },
    "log": {
        "output": "syslog"
    }
}
May 27 07:59:16 fort[98190]: /usr/bin/fort[0x417d97]
May 27 07:59:16 fort[98190]: /lib64/libpthread.so.0(+0x12c30)[0x7f6d27f1cc30]
May 27 07:59:16 fort[98190]: /usr/bin/fort(x509_name_put+0x0)[0x427dc0]
May 27 07:59:16 fort[98190]: /usr/bin/fort[0x4143cc]
May 27 07:59:16 fort[98190]: /usr/bin/fort[0x4144ac]
May 27 07:59:16 fort[98190]: /usr/bin/fort(deferstack_pop+0x3b)[0x4146eb]
May 27 07:59:16 fort[98190]: /usr/bin/fort[0x428cc4]
May 27 07:59:16 fort[98190]: /usr/bin/fort[0x4296c9]
May 27 07:59:16 fort[98190]: /usr/bin/fort[0x437307]
May 27 07:59:16 fort[98190]: /lib64/libpthread.so.0(+0x818a)[0x7f6d27f1218a]
May 27 07:59:16 fort[98190]: /lib64/libc.so.6(clone+0x43)[0x7f6d27c41dd3]
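Most frames in the crash trace above are bare addresses with no symbol name. Here is a minimal sketch (not part of Fort) that pulls the frames out of such syslog lines and builds an addr2line command for the ones inside the fort binary. It assumes the glibc backtrace line format seen in this thread, a binary that still has symbol or debug information, and a non-position-independent executable, so that the bracketed addresses can be passed to addr2line directly. The helper names parse_frames and addr2line_command are made up for this example.

```python
import re

# Matches glibc backtrace-style frames such as
#   /usr/bin/fort(deferstack_pop+0x3b)[0x4146eb]
#   /usr/bin/fort[0x417d97]
#   fort(main+0x66) [0x413d66]
FRAME_RE = re.compile(
    r"(?P<module>[^\s(\[]+)"            # binary or shared library
    r"(?:\((?P<symbol>[^)]*)\))?"       # optional "(symbol+offset)"
    r"\s*\[(?P<addr>0x[0-9a-fA-F]+)\]"  # absolute address in brackets
)


def parse_frames(log_text):
    """Return (module, symbol, address) tuples for every frame line found."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match["module"], match["symbol"] or "", match["addr"]))
    return frames


def addr2line_command(frames, binary):
    """Build an addr2line invocation for the frames that belong to `binary`.

    Frames from shared libraries are skipped: their bracketed addresses are
    runtime addresses and would need the library's load offset subtracted.
    """
    addrs = [addr for module, _symbol, addr in frames if module == binary]
    return ["addr2line", "-f", "-e", binary] + addrs


if __name__ == "__main__":
    sample = (
        "May 27 07:59:16 fort[98190]: /usr/bin/fort[0x417d97]\n"
        "May 27 07:59:16 fort[98190]: /lib64/libpthread.so.0(+0x12c30)[0x7f6d27f1cc30]\n"
        "May 27 07:59:16 fort[98190]: /usr/bin/fort(x509_name_put+0x0)[0x427dc0]\n"
    )
    print(" ".join(addr2line_command(parse_frames(sample), "/usr/bin/fort")))
```

Running the printed command on the machine that has the exact same fort binary should, under the assumptions above, turn the anonymous frames into function names.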
Interesting: the process prints a stack trace if you give it an unknown option.
May 31 10:16:17 fort[916765]: ERR: Unrecognized option: 63
May 31 10:16:17 fort[916765]: Stack trace:
May 31 10:16:17 fort[916765]: fort(print_stack_trace+0x1f) [0x417e5f]
May 31 10:16:17 fort[916765]: fort(__pr_op_err+0x84) [0x418424]
May 31 10:16:17 fort[916765]: fort(handle_flags_config+0x315) [0x416145]
May 31 10:16:17 fort[916765]: fort(main+0x66) [0x413d66]
May 31 10:16:17 fort[916765]: /lib64/libc.so.6(__libc_start_main+0xf3) [0x7f59ef759493]
May 31 10:16:17 fort[916765]: fort(_start+0x2e) [0x413e9e]
May 31 10:16:17 fort[916765]: (End of stack trace)
May 31 10:16:17 fort[916765]: ERR: Try 'fort --usage' or 'fort --help' for more information.
May 31 10:16:17 fort[916765]: Stack trace:
May 31 10:16:17 fort[916765]: fort(print_stack_trace+0x1f) [0x417e5f]
May 31 10:16:17 fort[916765]: fort(__pr_op_err+0x84) [0x418424]
May 31 10:16:17 fort[916765]: fort(handle_flags_config+0x33b) [0x41616b]
May 31 10:16:17 fort[916765]: fort(main+0x66) [0x413d66]
May 31 10:16:17 fort[916765]: /lib64/libc.so.6(__libc_start_main+0xf3) [0x7f59ef759493]
May 31 10:16:17 fort[916765]: fort(_start+0x2e) [0x413e9e]
May 31 10:16:17 fort[916765]: (End of stack trace)
I'm also getting this crash regularly, but with "Unknown protocol: 0".
I've left the service running with no BGP services using it, and lost 2 instances this weekend.
Note: don't update librtr to version 8.
Do you have files in the SLURM directory (/etc/fort/slurm)?
If so, can I have them? (It's fine if you want to censor IPs)
Ok, it looks like this is going to be a difficult bug.
Is either of you willing to run a custom debug-heavy Fort binary?
I will do that, no problem
This is all I have in that file
{
    "slurmVersion": 1,
    "validationOutputFilters": {
        "prefixFilters": [],
        "bgpsecFilters": []
    },
    "locallyAddedAssertions": {
        "prefixAssertions": [],
        "bgpsecAssertions": []
    }
}
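For anyone else checking whether their SLURM file could be involved: the file above defines no filters and no assertions, so it should not alter the validation output. A small sketch of that check, using only the key names visible in the file above (the function name slurm_is_empty is made up for this example):

```python
import json


def slurm_is_empty(slurm):
    """Return True if the SLURM document has no filters and no assertions."""
    filters = slurm.get("validationOutputFilters", {})
    assertions = slurm.get("locallyAddedAssertions", {})
    sections = (
        filters.get("prefixFilters", []),
        filters.get("bgpsecFilters", []),
        assertions.get("prefixAssertions", []),
        assertions.get("bgpsecAssertions", []),
    )
    return all(len(section) == 0 for section in sections)


if __name__ == "__main__":
    # Same content as the file posted in this thread.
    posted = json.loads("""
    {
        "slurmVersion": 1,
        "validationOutputFilters": {"prefixFilters": [], "bgpsecFilters": []},
        "locallyAddedAssertions": {"prefixAssertions": [], "bgpsecAssertions": []}
    }
    """)
    print(slurm_is_empty(posted))
```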
Sorry it's taken so long. Debug commit is at branch issue83.
I need the first logging line that contains the string "VRP corrupted!":
Jul 21 21:21:10 ERR [V]: After standalone: VRP corrupted!
Jul 21 21:21:10 ERR [V]: After SLURM: VRP corrupted!
It shouldn't crash anymore, but I'm not entirely sure what side effects the bogus VRP might induce.
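Since only the first "VRP corrupted!" line is needed, a short sketch like the following can pick it out of a log export. The matching is case-insensitive, and the function name first_matching_line is made up for this example. Where the log lives depends on the syslog setup, so the input here is just any iterable of lines.

```python
def first_matching_line(lines, needle="VRP corrupted!"):
    """Return the first line containing `needle` (case-insensitive), or None."""
    wanted = needle.lower()
    for line in lines:
        if wanted in line.lower():
            return line.rstrip("\n")
    return None


if __name__ == "__main__":
    log = [
        "Jul 21 21:21:09 INF [V]: some unrelated line\n",
        "Jul 21 21:21:10 ERR [V]: After standalone: VRP corrupted!\n",
        "Jul 21 21:21:10 ERR [V]: After SLURM: VRP corrupted!\n",
    ]
    print(first_matching_line(log))
```

To run it against a real file, pass the open file object instead of the list, for example `first_matching_line(open(path))` with whatever path your syslog writes to.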
This is all I have in that file
Ok thank you. Probably not the problem either.
Have you gotten any "VRP corrupted!" messages yet?
Just to clarify: The issue83 branch contains a patch that prevents Fort from crashing, but does not, in fact, fix the bug.
Didn't mean to close this.
With us sometimes it crashes after 1 day, sometimes after more than 6 weeks...
(We cannot deploy 1.5.4, though, because that would require an RPM package. But if I read correctly, #83 is not yet resolved in 1.5.4 anyway.)
Ok, I managed to generate the RPMs for 1.5.4, apparently successfully, and uploaded them here.
(I say "apparently" because CentOS 8's death forced me to migrate to Rocky Linux 8, and I'm not sure if packages generated there will be compatible with other RHELs. Please let me know how it goes.)
In other news, I have so far discovered and fixed at least one undefined behavior during the development of 1.5.5, so the bug might already be fixed in the main branch. For your convenience, I packaged this as rpm-1.5.4.1.tar.gz.
Please install either 1.5.4 or 1.5.4.1, and provide the crash output once it happens. If it never happens, I would also like to know.
Do you mind tagging 1.5.4 (and 1.5.4.1?) in the repository? This way I will be able to update the Debian package.
Do you mind tagging 1.5.4
What do you mean? It's been tagged since release.
Never mind: I thought that you had released a new version with the more recent changes. I will wait for the next one, unless you think that I should package a snapshot right now.
The RPM 1.5.4-1 package installs fine on RHEL. Thank you. It has now been running for one day, and is still up. I'll let you know in a week if it is still running (or earlier in case of a crash).
Well, it looks like it did the trick. No crashes in more than a month. Chapeau and thanks! :)
Hello,
I am seeing that the latest version of FORT, 1.5.3, keeps crashing on a regular basis. We have paired FORT with FRRouting, which is also running its latest version, on Oracle Linux 8.
Below is the extract from the log showing the crashed process.