fonoster / routr

⚡ The future of programmable SIP servers.
https://routr.io
MIT License

Edgeport startup issues #238

Closed philjones88 closed 11 months ago

philjones88 commented 11 months ago

Describe the bug

Edgeport seems to have inconsistent startup issues.

Other times the secured ports (SIP SSL and WSS) don't open correctly and no certificates are sent.

Some examples:

[INFO  tini (1)] Spawned child process 'sh' with pid '7'
ca.crt file found. Creating a full chain of certificates...
No cert in -in file '/etc/routr/certs/fullchain.crt' matches private key
48EBFFFFFF7F0000:error:05800074:x509 certificate routines:X509_check_private_key:key values mismatch:crypto/x509/x509_cmp.c:408:
PKCS12 keystore has been created at /etc/routr/certs/signaling.p12
[INFO  tini (1)] Main child exited with signal (with signal 'Terminated')
[INFO  tini (1)] Spawned child process 'sh' with pid '7'
ca.crt file found. Creating a full chain of certificates...
No cert in -in file '/etc/routr/certs/fullchain.crt' matches private key
48EBFFFFFF7F0000:error:05800074:x509 certificate routines:X509_check_private_key:key values mismatch:crypto/x509/x509_cmp.c:408:
PKCS12 keystore has been created at /etc/routr/certs/signaling.p12
2023-11-14 10:34:31.603 [fatal]: (${sys:serviceName}) Launcher.java unable to run edgeport: The Peer SIP Stack: gov.nist.javax.sip.SipStackImpl could not be instantiated. Ensure the Path Name has been set.
[INFO  tini (1)] Main child exited normally (with status '1')
[INFO  tini (1)] Spawned child process 'sh' with pid '7'
ca.crt file found. Creating a full chain of certificates...
PKCS12 keystore has been created at /etc/routr/certs/signaling.p12
2023-11-13 20:14:51.710 [info]: (edgeport) GRPCSipListener.java starting edgeport ref = 3009e86eb79c at 0.0.0.0
2023-11-13 20:14:51.712 [info]: (edgeport) GRPCSipListener.java localnets list [127.0.0.1/8]
2023-11-13 20:14:51.712 [info]: (edgeport) GRPCSipListener.java external hosts list [10.10.60.2,local.lab]
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
2023-11-13 20:14:52.028 [info]: (edgeport) HealthCheck.java starting health check on port 8080 and endpoint /healthz
[INFO  tini (1)] Main child exited with signal (with signal 'Terminated')
[INFO  tini (1)] Spawned child process 'sh' with pid '7'
ca.crt file found. Creating a full chain of certificates...
PKCS12 keystore has been created at /etc/routr/certs/signaling.p12

To Reproduce

On an M2 Max MacBook Pro, with the Docker platform set to linux/amd64 to avoid Apple Silicon issues.

docker-compose.yml

version: "3"

services:
  edgeport:
    container_name: edgeport
    image: fonoster/routr-edgeport:latest
    platform: linux/amd64
    environment:
      LOGS_LEVEL: verbose
    ports:
      - 5070:5060
      - 5070:5060/udp
      - 5071:5061
      - 5072:5062
      - 5073:5063
    volumes:
      - ./config/edgeport.yaml:/etc/routr/edgeport.yaml
      - ./certs/ca.crt:/etc/routr/certs/ca.crt
      - ./certs/server.crt:/etc/routr/certs/server.crt
      - ./certs/server.key:/etc/routr/certs/server.key
    networks:
      internal:

networks:
  internal:
    driver: bridge

edgeport.yaml

kind: EdgePort
apiVersion: v2beta1
ref: edgeport
metadata:
  region: default
spec:
  unknownMethodAction: Discard
  processor:
    addr: dispatcher:51901
  securityContext:
    client:
      protocols:
        - SSLv3
        - TLSv1.2
      authType: DisabledAll
    keyStorePassword: changeme
    trustStorePassword: changeme
    keyStore: "/etc/routr/certs/signaling.p12"
    trustStore: "/etc/routr/certs/signaling.p12"
    keyStoreType: pkcs12
  externalAddrs:
    - 10.10.60.2
    - local.lab
  localnets:
    - 127.0.0.1/8
  methods:
    - REGISTER
    - MESSAGE
    - INVITE
    - ACK
    - BYE
    - CANCEL
  transport:
    - protocol: tcp
      port: 5060
    - protocol: udp
      port: 5060
    - protocol: tls
      port: 5061
    - protocol: ws
      port: 5062
    - protocol: wss
      port: 5063

philjones88 commented 11 months ago

A different failure, this time with ca.crt containing only the root CA certificate, rather than both the root CA and intermediate certificates in a chain (intermediate first, CA second, following the nginx ordering):

[INFO  tini (1)] Spawned child process 'sh' with pid '7'
ca.crt file found. Creating a full chain of certificates...
PKCS12 keystore has been created at /etc/routr/certs/signaling.p12
2023-11-14 17:14:00.307 [info]: (edgeport) GRPCSipListener.java starting edgeport ref = 0dd4c0717007 at 0.0.0.0
2023-11-14 17:14:00.310 [info]: (edgeport) GRPCSipListener.java localnets list [127.0.0.1/8]
2023-11-14 17:14:00.310 [info]: (edgeport) GRPCSipListener.java external hosts list [10.10.60.2,local.vq.lab]
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
2023-11-14 17:14:00.574 [info]: (edgeport) HealthCheck.java starting health check on port 8080 and endpoint /healthz
[thread 61 also had an error]
[thread 59 also had an error]
[thread 60 also had an error]
[thread 34 also had an error]
[thread 62 also had an error]
[thread 75 also had an error]
[thread 16 also had an error]
[thread 63 also had an error]
[thread 33 also had an error]
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fffff425d73, pid=14, tid=76
#
# JRE version: OpenJDK Runtime Environment Temurin-11.0.21+9 (11.0.21+9) (build 11.0.21+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.21+9 (11.0.21+9, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x7ced73]  G1ParScanThreadState::trim_queue()+0x353
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid14.log
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#
Aborted
[INFO  tini (1)] Main child exited normally (with status '134')

psanders commented 11 months ago

Hey @philjones88,

Let's try to figure this one out.

First, a few things to note. Unlike most services in Routr, EdgePort is Java-based, and I currently publish only the linux/amd64 version. You've been adding the platform field, but I suspect that's just suppressing the warnings from Docker rather than solving the root issue.

I've noticed that the base image for EdgePort, eclipse-temurin, supports linux/arm64/v8. I'm planning to test this and try to add it as a supported platform. I'll let you know how it goes.

With that said, for some reason, I can use EdgePort without any issues, which makes me wonder if there might be something in my setup that's not in yours.

Are you by chance using Rosetta? Enabling Rosetta is definitely giving me some issues.

Can you try the following and let me know how it goes?

First, create a new directory, navigate into it, then create a subdirectory named certs.

Then, inside the certs subdirectory, generate a set of self-signed certificates using these commands:

openssl genpkey -algorithm RSA -out ca.key
openssl req -new -x509 -key ca.key -sha256 -days 365 -out ca.crt
openssl genpkey -algorithm RSA -out server.key
openssl req -new -key server.key -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -days 365 -sha256
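
A quick sanity check on the generated files can rule out the "key values mismatch" error seen in the logs earlier in this thread. The following is a minimal sketch and not part of Routr: the helper names cert_matches_key and cert_signed_by_ca are hypothetical, and it assumes the openssl CLI is on the PATH and that the files were generated in ./certs.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Routr) built on standard openssl subcommands.

# Succeeds only if the certificate ($1) and private key ($2) share a public key.
cert_matches_key() {
  crt_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
  key_pub=$(openssl pkey -in "$2" -pubout) || return 2
  [ "$crt_pub" = "$key_pub" ]
}

# Succeeds only if the certificate ($2) verifies against the CA bundle ($1).
cert_signed_by_ca() {
  openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}

# Example usage, assuming the files were generated in ./certs:
#   cert_matches_key certs/server.crt certs/server.key && echo "key matches"
#   cert_signed_by_ca certs/ca.crt certs/server.crt && echo "chain verifies"
```

If cert_matches_key fails for your own certificates, the PKCS12 keystore step is likely to report the same mismatch shown in the logs.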

Next, create a config folder and add an edgeport.yaml file with the information below. Ensure you adjust the externalAddrs to your Docker host IP.

kind: EdgePort
apiVersion: v2beta1
ref: edgeport-01
metadata:
  region: default
spec:
  unknownMethodAction: Discard
  processor:
    addr: echo:51904  
  securityContext:
    client:
      protocols:
        - SSLv3
        - TLSv1.2
      authType: DisabledAll
    keyStorePassword: changeme
    trustStorePassword: changeme
    keyStore: "/etc/routr/certs/signaling.p12"
    trustStore: "/etc/routr/certs/signaling.p12"
    keyStoreType: pkcs12
  externalAddrs:
    - 192.168.1.7
  methods:
    - REGISTER
    - MESSAGE
    - INVITE
    - ACK
    - BYE
    - CANCEL
  transport:
    - protocol: tcp
      port: 5060
    - protocol: udp
      port: 5060
    - protocol: tls
      port: 5061
    - protocol: ws 
      port: 5062
    - protocol: wss
      port: 5063

Then, create a compose.yaml file with this content:

version: "3"

services:
  edgeport:
    container_name: edgeport
    image: fonoster/routr-edgeport:latest
    environment:
      LOGS_LEVEL: verbose
    ports:
      - 5060:5060
      - 5060:5060/udp
      - 5061:5061
      - 5062:5062
      - 5063:5063
    volumes:
      - ./config/edgeport.yaml:/etc/routr/edgeport.yaml
      - ./certs/ca.crt:/etc/routr/certs/ca.crt
      - ./certs/server.crt:/etc/routr/certs/server.crt
      - ./certs/server.key:/etc/routr/certs/server.key

  echo:
    container_name: echo
    image: fonoster/routr-echo:latest
    environment:
      LOGS_LEVEL: verbose
    expose:
      - 51904

Finally, run docker compose up

Are you able to connect using TCP, UDP, and TLS with a regular softphone like Blink Pro?

If it works, can you try with your own certificates? Does it behave any differently?
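
Independently of a softphone, the openssl CLI can show whether the TLS and WSS ports present a certificate at all. This is a minimal sketch, assuming the port mappings from the compose file above (5061 for TLS, 5063 for WSS) and that the container is reachable on 127.0.0.1; show_server_cert is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints the subject and issuer of the certificate that a
# TLS server presents on host $1, port $2. Returns non-zero if the connection
# fails or no certificate is sent.
show_server_cert() {
  openssl s_client -connect "$1:$2" </dev/null 2>/dev/null |
    openssl x509 -noout -subject -issuer
}

# Example usage against the ports published in the compose file:
#   show_server_cert 127.0.0.1 5061   # SIP over TLS
#   show_server_cert 127.0.0.1 5063   # WSS
```

A non-zero exit here would be consistent with the "no certificates are sent" symptom described in the report.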

psanders commented 11 months ago

Just checked, and it appears that publishing an image based on ARM64 architecture will require substantial effort. Temurin does not provide any Alpine image with ARM64 support. This means we would need to refactor the image to use a different base, which would take more time than I can allocate for this task.

psanders commented 11 months ago

Correction to my last message. Let me know if this release helps you: https://github.com/fonoster/routr/releases/tag/v2.6.0

philjones88 commented 11 months ago

Fixed. These were ARM-related issues.