Closed gtjoseph closed 3 months ago
cherry-pick-to: 18 cherry-pick-to: 20 cherry-pick-to: 21
I'm moving this back to draft because setting any of the profile config fields will wither result in a segfault or the wrong structure member being set. I'm refactoring how the profiles work to prevent the duplication of fields and to more easily prevent disagreements.
Still trying to decide what can be unit tested vs testsuite tested.
Super excited for this!!!!!! You guys rock!!
Super excited for this!!!!!! You guys rock!!
Thanks @lukeescude! Would it be possible for you to test at all?
Super excited for this!!!!!! You guys rock!!
Thanks @lukeescude! Would it be possible for you to test at all?
Absolutely, do you happen to know where I can find the docs for configuring the module? (Not sure if it's still following the same syntax as the original stir_shaken.conf) - I don't mind looking at the code to extrapolate docs, just need to know where.
Just did a one line fix in curl_utils:curler() to fix a crash when curl_easy_perform() returns something other than CURLE_OK. curl_easy_cleanup() was being called twice.
I can successfully build, but asterisk encounters a segfault when booting - the last few messages of the verbose boot are:
Loading plc. == plc => (PLC) Loading udptl. == udptl => (UDPTL) Loading codec_opus.so. == Updated cached format with name 'opus' Segmentation fault (core dumped)
Since Opus is the last thing to be loaded, do you think that's causing an issue?
EDIT: Recompiling with DONT_OPTIMIZE so I can grab some core dump details.
Here is the core dump. I will disable building Opus and see if that helps.
I will disable building Opus and see if that helps.
You can also just do noload => codec_opus to see if it crashes without loading that.
Ah good idea!
Yeah, so both Opus and g729 are borked - I have to either disable them or delete them entirely from the modules directory to allow asterisk to load.
It's also highly probable that I am not checking out 20.6.0 and applying the PR to it properly - Working with tags and PRs is not something I'm familiar with in git.
Ah good idea!
Yeah, so both Opus and g729 are borked - I have to either disable them or delete them entirely from the modules directory to allow asterisk to load.
It's also highly probable that I am not checking out 20.6.0 and applying the PR to it properly - Working with tags and PRs is not something I'm familiar with in git.
That might be because the versions you had are not ABI compatible with your current build configuration. I have no experience with those modules specifically, but I seem to recall them being proprietary (no source released) modules that are downloaded from Digium, although I would expect the build sum to mismatch when the module tries to load, triggering a reject. Maybe somebody else knows more about that...
@lukeescude oops, I missed your comments last week, sorry. As @InterLinked1 mentioned, it's a mismatch. The best way to try the PR on a branch other than master is to download it as a patchfile and apply it to the branch you want...
$ git checkout 20.6.0
$ curl -L https://github.com/asterisk/asterisk/pull/524.patch > ../stir.patch
$ patch -p1 < ../stir.patch
Then just configure, compile and install as you normally would. I can confirm that both codec_opus and codec_g729a load and work fine this way.
Alright, making progress! Looks like some kind of syntax error:
#15 395.9 res_stir_shaken/verification.c: In function 'check_date_header':
#15 395.9 res_stir_shaken/verification.c:673:9: error: expected ';' before 'if'
#15 395.9 673 | if (time_diff < 0) {
#15 395.9 | ^~
#15 395.9 res_stir_shaken/verification.c:678:11: error: 'else' without a previous 'if'
#15 395.9 678 | } else if (time_diff > (ctx->eprofile->vcfg_common.max_date_header_age * 1000)) {
#15 395.9 | ^~~~
#15 395.9 make[1]: *** [/tmp/asterisk-20.6.0/Makefile.rules:165: res_stir_shaken/verification.o] Error 1
#15 395.9 make: *** [Makefile:396: res] Error 2
#15 397.4 [CC] res_stir_shaken/verification.c -> res_stir_shaken/verification.o
#15 397.5 res_stir_shaken/verification.c: In function 'check_date_header':
#15 397.5 res_stir_shaken/verification.c:673:9: error: expected ';' before 'if'
#15 397.5 673 | if (time_diff < 0) {
#15 397.5 | ^~
#15 397.5 res_stir_shaken/verification.c:678:11: error: 'else' without a previous 'if'
#15 397.5 678 | } else if (time_diff > (ctx->eprofile->vcfg_common.max_date_header_age * 1000)) {
#15 397.5 | ^~~~
#15 397.6 make[1]: *** [/tmp/asterisk-20.6.0/Makefile.rules:165: res_stir_shaken/verification.o] Error 1
#15 397.6 make: *** [Makefile:396: res] Error 2
@gtjoseph looking at the code, I am actually not sure why I'm seeing those syntax errors - It's been a while since I've written C but your code looks fine to me. The patch successfully applied too, or at least it gave no errors.
I'll keep digging. We have an automatic build process that may be getting in the way so I'll decouple it from our internal systems and build on my machine locally step-by-step to see if I can uncover what's happening.
I did have to add --with-bundled-libjwt to configure, and additionally had to add pkg-config & libjansson-dev packages before compiling too.
@gtjoseph looking at the code, I am actually not sure why I'm seeing those syntax errors - It's been a while since I've written C but your code looks fine to me. The patch successfully applied too, or at least it gave no errors.
I'll keep digging. We have an automatic build process that may be getting in the way so I'll decouple it from our internal systems and build on my machine locally step-by-step to see if I can uncover what's happening.
I did have to add --with-bundled-libjwt to configure, and additionally had to add pkg-config & libjansson-dev packages before compiling too.
It could be that the patch applied, but in a funny way where it messed some things up. I've seen that happen a few times in the past, where I've had to go in and manually fix the code back in. Happens a lot with Asterisk actually when applying unmerged patches, particularly if they conflict with one another - sometimes it'll just refuse to do anything but other times it will get confused. You might want to see if everything looks good around line 672 in your source.
You will either need to apply the fix that I suggest above, or compile in dev mode:
./configure --enable-dev-mode --with-libjwt-bundled
(ast_trace(...)
in dev mode does not need a trailing ;
, but if it's not in dev mode it does - that should probably be fixed at some point)
I did have to add --with-bundled-libjwt to configure, and additionally had to add pkg-config & libjansson-dev packages before compiling too.
I understand the need for --with-libjwt-bundled and will add that to the UpgradeNote. If you're not using --with-jansson-bundled then I can understand the need for libjansson-dev but that would have been required for an asterisk build even before stir-shaken. If you are using --with-jansson-bundled, then the libjansson-dev package should not have been needed at all. I'll have to investigate that a bit.
The latest force-push contains:
Alright @gtjoseph and @seanbright making progress - got it compiled, but module will not load with the following error:
[Feb 16 20:02:49] DEBUG[345]: loader.c:1834 ast_load_resource: Module res_stir_shaken previously declined to load, unloading it first before loading again
[Feb 16 20:02:49] NOTICE[345]: loader.c:1241 ast_unload_resource: Unloading module 'res_stir_shaken' that previously declined to load
[Feb 16 20:02:49] ERROR[345]: loader.c:964 logged_dlclose: Failure in dlclose for module 'res_stir_shaken': /usr/lib/asterisk/modules/res_stir_shaken.so: shared object not open
[Feb 16 20:02:49] DEBUG[345]: config.c:2377 config_text_file_load: Parsing /etc/asterisk/stir_shaken.conf
[Feb 16 20:02:49] ERROR[345]: res_stir_shaken/crypto_utils.c:176 crypto_register_x509_extension: Couldn't register TNAuthList X509 extension: C00674CEBD7F0000:error:04000066:object identifier routines:OBJ_create:oid exists:../crypto/objects/obj_dat.c:719:
Oh, libjwt 1.7.0 was just released last week but I haven't had a chance to test it yet.
Ah crap. Frickin Alembic.
Alright @gtjoseph and @seanbright making progress - got it compiled, but module will not load with the following error:
[Feb 16 20:02:49] DEBUG[345]: loader.c:1834 ast_load_resource: Module res_stir_shaken previously declined to load, unloading it first before loading again [Feb 16 20:02:49] NOTICE[345]: loader.c:1241 ast_unload_resource: Unloading module 'res_stir_shaken' that previously declined to load [Feb 16 20:02:49] ERROR[345]: loader.c:964 logged_dlclose: Failure in dlclose for module 'res_stir_shaken': /usr/lib/asterisk/modules/res_stir_shaken.so: shared object not open [Feb 16 20:02:49] DEBUG[345]: config.c:2377 config_text_file_load: Parsing /etc/asterisk/stir_shaken.conf [Feb 16 20:02:49] ERROR[345]: res_stir_shaken/crypto_utils.c:176 crypto_register_x509_extension: Couldn't register TNAuthList X509 extension: C00674CEBD7F0000:error:04000066:object identifier routines:OBJ_create:oid exists:../crypto/objects/obj_dat.c:719:
Whoa. "oid exists"??? What version of openssl are you using?
Looks like:
OpenSSL 3.0.11 19 Sep 2023 (Library: OpenSSL 3.0.11 19 Sep 2023)
I'm at 3.1.1 and that oid (TnAuthList: "1.3.6.1.5.5.7.1.26") is not pre-built into openssl. The module isn't somehow being loaded twice is it? In modules.conf, do you have autoload set to yes? Is res_stir_shaken.so listed in the file? Anything else in the logs that may indicate it's being loaded twice?
res_stir_shaken.so STIR/SHAKEN Module for Asterisk 0 Not Running core
I've got autoload set to yes, and res_stir_shaken is not in the file at all, so it should be loading without issue.
module unload/load doesn't seem to make a difference.
I will can look at upgrading openssl to newest. Building on debian 12.
I will can look at upgrading openssl to newest. Building on debian 12.
I believe that would be unnecessary. Can you upload a full log?
I will can look at upgrading openssl to newest. Building on debian 12.
I believe that would be unnecessary. Can you upload a full log?
Yeah, don't upgrade. There's something else going on. What you can do is add...
noload = res_pjsip_stir_shaken.so
noload = res_stir_shaken.so
to modules.conf, restart asterisk, then from the asterisk CLI...
core set trace 5 res_stir_shaken.so
core set debug 5 res_stir_shaken.so
module load res_stir_shaken.so
and post the result.
NOTE: You can't unload res_stir_shaken and load it again. You can only reload. The reason is that once you register an OID to openssl, you can't unregister it so if you try unload the module and load it again, you'll get the exact error message you got.
NOTE: You can't unload res_stir_shaken and load it again. You can only reload. The reason is that once you register an OID to openssl, you can't unregister it so if you try unload the module and load it again, you'll get the exact error message you got.
Then shouldn't the unload handler in the module always return -1, to prevent people from doing that? (We did something similar with a few of the PJSIP modules that had similar underlying limitations).
Then shouldn't the unload handler in the module always return -1, to prevent people from doing that? (We did something similar with a few of the PJSIP modules that had similar underlying limitations).
Yeah probably but I do want the module to clean up other things. There's a lot going on and being able to use the leak sanitizer has been very helpful.
Actually, the check to see if the OID is already registered is flawed. I can fix that and just not try to register it again if it already exists.
Doesn't explain why @lukeescude is getting the message though.
Alright here we go:
874790*CLI> core set trace 5 res_stir_shaken.so
Core trace was 0 and has been set to 5 for 'res_stir_shaken'.
874790*CLI> core set debug 5 res_stir_shaken.so
Core debug was 0 and has been set to 5 for 'res_stir_shaken'.
874790*CLI> module load res_stir_shaken.so
Unable to load module res_stir_shaken.so
Command 'module load res_stir_shaken.so' failed.
[Feb 16 21:46:11] NOTICE[5722]: res_stir_shaken/crypto_utils.c:179 crypto_register_x509_extension: Registering TNAuthList NID 1248
[Feb 16 21:46:11] DEBUG[5722]: res_stir_shaken/common_config.c:313 common_config_load: Stir Shaken Load
[Feb 16 21:46:11] ERROR[5722]: res_stir_shaken/verification_config.c:64 vs_is_config_loaded: No 'verification' section could be found in stir/shaken configuration
[Feb 16 21:46:11] ERROR[5722]: res_stir_shaken/verification.c:1025 vs_load: stir/shaken verification service failed to load
[Feb 16 21:46:11] DEBUG[5722]: res_stir_shaken/common_config.c:322 common_config_load: Stir Shaken VS load failed
I suppose a verification section is required then, so I will get that written even though we don't use it, and see if that helps.
@gtjoseph so it looks like the module required this directory to exist:
/var/lib/asterisk/keys/stir_shaken/cache
And additionally adding:
[verification] global_disable = yes
Helped too, but then it's throwing an error about there being no [attestation] object even though there is one.
Helped too, but then it's throwing an error about there being no [attestation] object even though there is one.
Attach the actual log message and your stir_shaken.conf file.
The latest force push removed the internal vector that kept track of registered x509 extensions and instead uses the openssl OBJ_sn2nid and OBJ_nid2sn functions to determine if the extension is already registered. You can now safely unload res_stir_shaken and load it again without issues.
@gtjoseph see below:
874790*CLI> core set trace 5 res_stir_shaken.so
874790*CLI> core set debug 5 res_stir_shaken.so
Core trace was 0 and has been set to 5 for 'res_stir_shaken'.
Core debug was 0 and has been set to 5 for 'res_stir_shaken'.
874790*CLI> module load res_stir_shaken.so
Unable to load module res_stir_shaken.so
Command 'module load res_stir_shaken.so' failed.
[Feb 16 23:00:18] NOTICE[10391]: res_stir_shaken/crypto_utils.c:179 crypto_register_x509_extension: Registering TNAuthList NID 1248
[Feb 16 23:00:18] DEBUG[10391]: res_stir_shaken/common_config.c:313 common_config_load: Stir Shaken Load
[Feb 16 23:00:18] DEBUG[10391]: res_stir_shaken/verification_config.c:162 vs_check_common_config: verification: Checking common config
[Feb 16 23:00:18] DEBUG[10391]: res_stir_shaken/verification_config.c:264 vs_check_common_config: verification: Done
[Feb 16 23:00:18] ERROR[10391]: res_stir_shaken/attestation_config.c:52 as_is_config_loaded: No 'attestation' section could be found in stir/shaken configuration
[Feb 16 23:00:18] ERROR[10391]: res_stir_shaken/attestation.c:444 as_load: stir/shaken attestation service failed to load
[Feb 16 23:00:18] DEBUG[10391]: res_stir_shaken/common_config.c:327 common_config_load: Stir Shaken AS load failed
Contents of stir_shaken.conf:
[attestation]
global_disable = no
private_key_path = /etc/asterisk/stir_shaken/private_key.pem
public_cert_url = https://pvx1.s3.us-east-2.amazonaws.com/stirshaken/8448cc7eb8424d6ad5d2e6d71bcf6629.cer
attest_level = A
[attest_profile]
type = profile
endpoint_behavior = attest
[verification]
global_disable = yes
[19997379664]
type = tn
attest_level = A
[14693934979]
type = tn
attest_level = A
[19996528348]
type = tn
attest_level = A
[15732069259]
type = tn
attest_level = A
[19996699142]
type = tn
attest_level = A
[15732069268]
type = tn
attest_level = A
[19993056391]
type = tn
attest_level = A
[15732069280]
type = tn
attest_level = A
I realize those 999 numbers aren't real, I'll fix that in a bit.
Not 100% but you may be missing:
type = attestation
Edit: nope. I’m wrong.
Ahhh figured it out - I need to use private_key_file, not private_key_path... not sure where I got path from.
My last module error to fix is:
attest_profile: Neither this profile nor default verification options specify ca_file or ca_path
and I think it'll be fully functional.
@gtjoseph @seanbright BOOM! All good - calls are being attested properly.
Here's my functioning stir_shaken.conf:
[attestation]
global_disable = no
private_key_file = /etc/asterisk/stir_shaken/private_key.pem
public_cert_url = https://pvx1.s3.us-east-2.amazonaws.com/stirshaken/8448cc7eb8424d6ad5d2e6d71bcf6629.cer
attest_level = A
[attest_profile]
type = profile
endpoint_behavior = attest
ca_path = /etc/ssl/certs/
[verification]
global_disable = yes
[19997379664]
type = tn
attest_level = A
[14693934979]
type = tn
attest_level = A
I had to set ca_path = /etc/ssl/certs/.
@lukeescude That's good news! I'll make the config check just disable verification if there's no [verification] section in the config file. Same with [attestation]. This will also prevent you from having to specify a ca_* option in a profile.
BTW, I attempted to download the certificate revocation list from the iconnectiv URI specified in your certificate's "X509v3 CRL Distribution Points" extension and it failed. Do you know if it's supposed to work? Do you happen to have a contact at iconnectiv that could give us more info on it?
@lukeescude You don't happen to have the CA certificate or cert chain that was used to issue your certificate do you? If so, can you share it? We're sorely lacking in real-world examples to test with.
Changes in latest force push...
SHAKEN <spc>
and that <spc>
matches the SPC in the TNAuthList extension. Although we're now checking that there is a CN component to the subject, we're not sure if we really need to be that pedantic. I'll be asking the asterisk-dev list shortly.Hey @gtjoseph I'm not ignoring you, we're coming up on an RFP deadline to offer phone services to an entire school district so we're all-hands-on-deck 😄
I'll dig into my certs and get you everything I can possibly get.
Off the bat, the Iconective (Neustar) CA list is available on a JSON endpoint, but it's firewall-protected, so I have to use our VPN to get it:
<snipped>
I don't know how to read that, nor does it look like a CA thing to me...
I also have the original CSR, our private key, and something called the SPC Token Output if you'd like those.
Hey @lukeescude No worries!! Thanks for the info. I think we're good for now.
Your newest commits are working perfectly by the way.
Just FYI. There are no more changes coming other than to address review comments.
Successfully merged to branch master and cherry-picked to ["18","20","21"]
Why do we need a refactor?
The original stir/shaken implementation was started over 3 years ago when little was understood about practical implementation. The result was an implementation that wouldn't actually interoperate with any other stir-shaken implementations.
There were also a number of stir-shaken features and RFC requirements that were never implemented such as TNAuthList certificate validation, sending Reason headers in SIP responses when verification failed but we wished to continue the call, and the ability to send Media Key(mky) grants in the Identity header when the call involved DTLS.
Finally, there were some performance concerns around outgoing calls and selection of the correct certificate and private key. The configuration was keyed by an arbitrary name which meant that for every outgoing call, we had to scan the entire list of configured TNs to find the correct cert to use. With only a few TNs configured, this wasn't an issue but if you have a thousand, it could be.
What's changed?
Configuration objects have been refactored to be clearer about their uses and to fix issues.
We now use libjwt to create outgoing Identity headers and to parse and validate signatures on incoming Identiy headers. Our previous custom implementation was much of the source of the interoperability issues.
General code cleanup and refactor.
Resolves: #351 Resolves: #46
UserNote: Asterisk's stir-shaken feature has been refactored to correct interoperability, RFC compliance, and performance issues. See https://docs.asterisk.org/Deployment/STIR-SHAKEN for more information.
UpgradeNote: The stir-shaken refactor is a breaking change but since it's not working now we don't think it matters. The stir_shaken.conf file has changed significantly which means that existing ones WILL need to be changed. The stir_shaken.conf.sample file in configs/samples/ has quite a bit more information. This is also an ABI breaking change since some of the existing objects needed to be changed or removed, and new ones added. Additionally, if res_stir_shaken is enabled in menuselect, you'll need to either have the development package for libjwt v1.15.3 installed or use the --with-libjwt-bundled option with ./configure.