jitsi / docker-jitsi-meet

Jitsi Meet on Docker
https://hub.docker.com/u/jitsi/
Apache License 2.0

Vosk Configuration Support #1253

Open radove opened 2 years ago

radove commented 2 years ago

It would be nice to have better Vosk support built into the Docker image. To get Vosk working, I had to remove the logic that requires the Google credentials from jigasi/rootfs/etc/cont-init.d/10-config and modify the sip-communicator.properties template to include:

org.jitsi.jigasi.transcription.customService=org.jitsi.jigasi.transcription.VoskTranscriptionService
org.jitsi.jigasi.transcription.vosk.websocket_url=ws://172.31.52.35:2700

I then rebuilt the image and used that.

saghul commented 2 years ago

A PR adding some appropriate env variables would be welcome.

janonym1 commented 2 years ago

Since not many people will be using the VOSK service, wouldn't a restart shell script be enough? I manage most settings not covered by the .env file that way. For example:

#!/bin/bash
# Set JVB logging level from INFO to WARNING
sed -i 's/^\.level=.*/.level=WARNING/' .jitsi-meet-cfg/jvb/logging.properties
docker restart jitsi-meet-jvb-1

Also, I couldn't get VOSK to work with the sip-communicator.properties file, since it always fell back to the Google API (even after removing the Google parts of jigasi/rootfs/etc/cont-init.d/10-config), so I had to use custom-sip-communicator.properties instead.

Maybe something like:

#!/bin/bash

touch .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties
echo "org.jitsi.jigasi.transcription.customService=org.jitsi.jigasi.transcription.VoskTranscriptionService" >> .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties
echo "org.jitsi.jigasi.transcription.vosk.websocket_url=ws://VOSKSERVER:2700" >> .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties
docker restart jitsi-meet-jigasi-1

However, I still needed to fill out the Google API fields in .env, even though they didn't get used in the end.
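
One caveat with a script like the one above is that each run appends the same two lines again. A minimal sketch of an idempotent variant follows; `append_once` is a hypothetical helper name, not something that exists in docker-jitsi-meet, and the path and VOSKSERVER placeholder are the ones used above.

```bash
#!/bin/bash
# append_once (hypothetical helper): add a line to a file only if that exact
# line is not already present, so the script can be re-run safely.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage (commented out; same placeholders as in the script above):
# PROPS=.jitsi-meet-cfg/jigasi/custom-sip-communicator.properties
# append_once "org.jitsi.jigasi.transcription.customService=org.jitsi.jigasi.transcription.VoskTranscriptionService" "$PROPS"
# append_once "org.jitsi.jigasi.transcription.vosk.websocket_url=ws://VOSKSERVER:2700" "$PROPS"
# docker restart jitsi-meet-jigasi-1
```

The restart is still needed afterwards, since the properties file is only read when the container starts.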

saghul commented 2 years ago

I'd take a PR which allows setting the appropriate env variables, if you're up for it

janonym1 commented 2 years ago

Sure thing! However, I am not sure how best to approach this or how to get started. With the latest version (stable-7439-2), my previous workaround no longer works. Previously, I just cut out the part with the Google credentials in the Dockerfile (/config/key.json) and deleted some Google-related code at https://github.com/jitsi/docker-jitsi-meet/blob/master/jigasi/rootfs/etc/cont-init.d/10-config#L26

Even with those changes, I still get the following errors when running jigasi:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01-set-timezone: executing... 
[cont-init.d] 01-set-timezone: exited 0.
[cont-init.d] 10-config: executing... 
Transcriptions: One or more environment variables are undefined
[cont-init.d] 10-config: exited 1.
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.

I don't get why it still says "Transcriptions: One or more environment variables are undefined" even though I already cut that part out.

In general, it should be straightforward to set up VOSK for Jitsi, since you only need a few relevant lines in the .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties file:

org.jitsi.jigasi.transcription.customService=org.jitsi.jigasi.transcription.VoskTranscriptionService
org.jitsi.jigasi.transcription.vosk.websocket_url=ws://VOSKSERVER:2700

However, since a lot of the Google Cloud stuff seems hardcoded, I don't know where else to look.

saghul commented 2 years ago

Here is where that check takes place: https://github.com/jitsi/docker-jitsi-meet/blob/be8c41f79eabcacf03047dfec2f2998077b701aa/jigasi/rootfs/etc/cont-init.d/10-config#L29

Obviously we should not check for those if vosk is used.

janonym1 commented 2 years ago

I thought that if I just deleted the whole Google Cloud part from the file, the check shouldn't run anyway? I deleted this whole part from the 10-config file:

# Create Google Cloud Credentials
if [[ $ENABLE_TRANSCRIPTIONS -eq 1 || $ENABLE_TRANSCRIPTIONS == "true" ]]; then
    if [[ -z $GC_PROJECT_ID || -z $GC_PRIVATE_KEY_ID || -z $GC_PRIVATE_KEY || -z $GC_CLIENT_EMAIL || -z $GC_CLIENT_ID || -z $GC_CLIENT_CERT_URL ]]; then
        echo 'Transcriptions: One or more environment variables are undefined'
        exit 1
    fi

    jq -n \
        --arg GC_PROJECT_ID "$GC_PROJECT_ID" \
        --arg GC_PRIVATE_KEY_ID "$GC_PRIVATE_KEY_ID" \
        --arg GC_PRIVATE_KEY "$GC_PRIVATE_KEY" \
        --arg GC_CLIENT_EMAIL "$GC_CLIENT_EMAIL" \
        --arg GC_CLIENT_ID "$GC_CLIENT_ID" \
        --arg GC_CLIENT_CERT_URL "$GC_CLIENT_CERT_URL" \
        '{
            type: "service_account",
            project_id: $GC_PROJECT_ID,
            private_key_id: $GC_PRIVATE_KEY_ID,
            private_key: $GC_PRIVATE_KEY,
            client_email: $GC_CLIENT_EMAIL,
            client_id: $GC_CLIENT_ID,
            auth_uri: "https://accounts.google.com/o/oauth2/auth",
            token_uri: "https://oauth2.googleapis.com/token",
            auth_provider_x509_cert_url: "https://www.googleapis.com/oauth2/v1/certs",
            client_x509_cert_url: $GC_CLIENT_CERT_URL
        }' \
        > /config/key.json
fi

but I still got the check error message. I am not sure where to start if I can't even disable the Google check :)

I am thinking of something along the lines of adding a variable to enable VOSK, one for the VOSK server address, and a switch for GCLOUD, which together populate .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties:

# Test for "not enabled" rather than "-eq 0": inside [[ ]], -eq evaluates the word "true" as 0
if [[ ( $ENABLE_TRANSCRIPTIONS -eq 1 || $ENABLE_TRANSCRIPTIONS == "true" ) && ! ( $ENABLE_GCLOUD -eq 1 || $ENABLE_GCLOUD == "true" ) ]]; then
    if [[ $ENABLE_VOSK -eq 1 || $ENABLE_VOSK == "true" ]]; then
        if [[ -z $VOSK_SERVER ]]; then
            echo 'Vosk Server not set!'
            exit 1
        fi
        touch .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties
        echo "org.jitsi.jigasi.transcription.customService=org.jitsi.jigasi.transcription.VoskTranscriptionService" >> .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties
        echo "org.jitsi.jigasi.transcription.vosk.websocket_url=ws://$VOSK_SERVER:2700" >> .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties
    fi
fi

saghul commented 2 years ago

> but I still got the check error message. I am not sure where to start if I can't even disable the Google check :)

Did you rebuild the image and recreate the container? Otherwise you'll still be running the old code.

> I am thinking of something along the lines of adding a variable to enable VOSK, one for the VOSK server address, and a switch for GCLOUD, which together populate .jitsi-meet-cfg/jigasi/custom-sip-communicator.properties

That sounds good!
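
A small sketch of how such on/off variables could be tested consistently is shown here. `is_true` and `vosk_requested` are hypothetical helper names; ENABLE_VOSK is the variable name proposed in this thread, not an existing setting. The helper accepts the same two spellings ("1" and "true") that the existing ENABLE_TRANSCRIPTIONS check accepts.

```bash
#!/bin/bash
# is_true (hypothetical helper): succeeds only for "1" or "true".
is_true() {
    [[ "$1" == "1" || "$1" == "true" ]]
}

# Pitfall this avoids: inside [[ ]], -eq evaluates its operands arithmetically,
# so a non-numeric word such as "true" is read as an unset variable and counts
# as 0. A test like [[ $FLAG -eq 0 ]] therefore also succeeds for FLAG=true.

# vosk_requested (hypothetical): transcription is on and VOSK was asked for.
vosk_requested() {
    is_true "${ENABLE_TRANSCRIPTIONS:-}" && is_true "${ENABLE_VOSK:-}"
}
```

Using string comparison only means an unset or misspelled value is simply treated as "off".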

janonym1 commented 2 years ago

Ah of course, I forgot that! I tried to build just jigasi (docker build - < Dockerfile) from inside the docker-jitsi-meet/jigasi folder, where the Dockerfile is located, but I get:

Step 10/11 : COPY rootfs/ /
COPY failed: file not found in build context or excluded by .dockerignore: stat rootfs/: file does not exist

I thought the folders to be copied were looked up relative to the Dockerfile itself. Do I need some context settings? Or do I need to use the Makefile to rebuild everything?

saghul commented 2 years ago

You can do make build_jigasi from the root project folder.
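
For background on the COPY error: with `docker build - < Dockerfile`, Docker reads only the Dockerfile from standard input and no build context is sent, so `COPY rootfs/ /` has nothing to copy from. COPY sources are resolved relative to the build context directory. The sketch below illustrates that lookup without invoking Docker; `missing_copy_sources` is a hypothetical helper written for this illustration, and it only understands the plain `COPY <src> <dest>` form (no flags, no globs).

```bash
#!/bin/bash
# missing_copy_sources (hypothetical helper): print every COPY source in a
# Dockerfile that does not exist inside the given context directory.
missing_copy_sources() {
    local dockerfile="$1" context="$2" keyword src _rest
    while read -r keyword src _rest; do
        # COPY paths are looked up relative to the context, not the Dockerfile
        if [[ "$keyword" == "COPY" && ! -e "$context/$src" ]]; then
            echo "$src"
        fi
    done < "$dockerfile"
}

# Example usage (commented out): run from the repository root
# missing_copy_sources jigasi/Dockerfile jigasi
```

This is only a pre-check for the context question; the build itself is still best done with the make target mentioned above.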

janonym1 commented 2 years ago

Thanks, that make build command was what I was looking for! I am used to Docker, but have only used it as a sysadmin :) I am working on it and will open a PR after some testing, once it's working as intended.

saghul commented 2 years ago

Excellent!

janonym1 commented 2 years ago

I am trying to get this running without GCLOUD, but it seems pretty hardcoded. Even after removing everything related to it from jigasi/rootfs/etc/cont-init.d/10-config and even from the Dockerfile (the ENV GOOGLE_APPLICATION_CREDENTIALS /config/key.json line), I always get the following error when enabling subtitles in a conference (jigasi docker logs):

SEVERE: testing123/focus: error creating stream observer
java.io.IOException: Error reading credential file from environment variable GOOGLE_APPLICATION_CREDENTIALS, value '/config/key.json': File does not exist.

Okay, so I need to set up a "fake" key.json with fake GCLOUD values, which is what I did in my previous workaround (see my first post here). So I put the ENV GOOGLE_APPLICATION_CREDENTIALS /config/key.json line back into the Dockerfile, gave the GCLOUD variables in the .env file some fake entries (like using the string "test" everywhere) and added the GCLOUD part back into rootfs/etc/cont-init.d/10-config, so that it creates the key.json. This worked previously, but the latest versions seem to check the key.json for validity:

WARNING: Google Credentials are not properly set
java.io.IOException: Error reading credential file from environment variable GOOGLE_APPLICATION_CREDENTIALS, value '/config/key.json': Invalid PKCS#8 data.

I am not sure where all the GCLOUD stuff is coming from. I assume it is the jigasi.sh script called from jigasi/rootfs/etc/services.d/jigasi/run? (Line 5: DAEMON=/usr/share/jigasi/jigasi.sh)

Looking at the upstream jigasi.sh at https://github.com/jitsi/jigasi/blob/master/jigasi.sh, I don't see anything Google-related. Where exactly does it come from?

saghul commented 2 years ago

I think it might be the google SDK reading that file. That env var should not be set unless google cloud is enabled.

janonym1 commented 2 years ago

I think I figured out my problem. This part of 10-config appends custom-sip-communicator.properties to sip-communicator.properties:

if [[ -f /config/custom-sip-communicator.properties ]]; then
    cat /config/custom-sip-communicator.properties >> /config/sip-communicator.properties
fi

However, I still got the GCLOUD error messages, because it seems I need to create the custom-sip-communicator.properties file from a template for this to work (with the VOSK settings defined in jigasi/rootfs/etc/cont-init.d/custom-sip-communicator.properties, which I hardcoded for now). Therefore, I just added tpl /defaults/custom-sip-communicator.properties > /config/custom-sip-communicator.properties to the 10-config file, and that seems to work!

I don't know why it needs the custom file specifically, but I propose some simple variable checks that effectively set the following in custom-sip-communicator.properties:

org.jitsi.jigasi.transcription.customService=org.jitsi.jigasi.transcription.VoskTranscriptionService
org.jitsi.jigasi.transcription.vosk.websocket_url=ws://VOSKSERVER:2700
org.jitsi.jigasi.ENABLE_TRANSCRIPTION=true

Would that be okay?

saghul commented 2 years ago

That would need to be part of the main file, not the custom file. Also, this line probably needs to go: https://github.com/jitsi/docker-jitsi-meet/blob/d804ba48c9601a6942e1e61af1b0ec6264003388/jigasi/Dockerfile#L11. The variable should only be set in the run script, and only if the file was generated by the config script.
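
A minimal sketch of what that could look like in a run script follows, assuming the key file path /config/key.json used in this thread. `export_google_credentials` is a hypothetical function name, and this is not the code that ended up in the repository.

```bash
#!/bin/bash
# export_google_credentials (hypothetical): export the credentials path only
# when the config script actually generated the key file; otherwise make sure
# the variable is not set, so the Google SDK does not try to read the file.
export_google_credentials() {
    local key_file="${1:-/config/key.json}"
    if [[ -f "$key_file" ]]; then
        export GOOGLE_APPLICATION_CREDENTIALS="$key_file"
    else
        unset GOOGLE_APPLICATION_CREDENTIALS
    fi
}

# Example usage (commented out): call once before starting the daemon
# export_google_credentials /config/key.json
```

With this approach the Dockerfile no longer needs to set the variable unconditionally.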

janonym1 commented 2 years ago

You are right, I only need sip-communicator.properties set up correctly for it to work. My bad!

My approach would be something like this. First, modify 10-config so that the check for empty GCLOUD credentials is kept, but only fails when VOSK is disabled:

# Create Google Cloud Credentials
if [[ $ENABLE_TRANSCRIPTIONS -eq 1 || $ENABLE_TRANSCRIPTIONS == "true" ]]; then
    if [[ -z $GC_PROJECT_ID || -z $GC_PRIVATE_KEY_ID || -z $GC_PRIVATE_KEY || -z $GC_CLIENT_EMAIL || -z $GC_CLIENT_ID || -z $GC_CLIENT_CERT_URL ]]; then
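       # Fail only when VOSK is not enabled ("-eq 0" would also match "true", which [[ ]] evaluates as 0)
       if [[ ! ( $ENABLE_VOSK -eq 1 || $ENABLE_VOSK == "true" ) ]]; then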
          echo 'Transcriptions: One or more environment variables are undefined'
          exit 1
       fi
    fi

    jq -n \
        --arg GC_PROJECT_ID "$GC_PROJECT_ID" \
        --arg GC_PRIVATE_KEY_ID "$GC_PRIVATE_KEY_ID" \
        --arg GC_PRIVATE_KEY "$GC_PRIVATE_KEY" \
        --arg GC_CLIENT_EMAIL "$GC_CLIENT_EMAIL" \
        --arg GC_CLIENT_ID "$GC_CLIENT_ID" \
        --arg GC_CLIENT_CERT_URL "$GC_CLIENT_CERT_URL" \
        '{
            type: "service_account",
            project_id: $GC_PROJECT_ID,
            private_key_id: $GC_PRIVATE_KEY_ID,
            private_key: $GC_PRIVATE_KEY,
            client_email: $GC_CLIENT_EMAIL,
            client_id: $GC_CLIENT_ID,
            auth_uri: "https://accounts.google.com/o/oauth2/auth",
            token_uri: "https://oauth2.googleapis.com/token",
            auth_provider_x509_cert_url: "https://www.googleapis.com/oauth2/v1/certs",
            client_x509_cert_url: $GC_CLIENT_CERT_URL
        }' \
        > /config/key.json
fi

Then add the VOSK-related variables to defaults/sip-communicator.properties:

{{ $ENABLE_VOSK := .Env.ENABLE_VOSK | default "0" | toBool  -}}
{{ $VOSK_SERVER := .Env.VOSK_SERVER}}
[...]

{{ if $ENABLE_VOSK }}
#VOSK transcription config
org.jitsi.jigasi.transcription.customService=org.jitsi.jigasi.transcription.VoskTranscriptionService
org.jitsi.jigasi.transcription.vosk.websocket_url=ws://{{ .Env.VOSK_SERVER }}
{{ end }}

Finally, add the variable names (VOSK_SERVER and ENABLE_VOSK) to jigasi.yml and maybe to .env (but most people won't need this setting, so I think we can leave it out by default).

Since GCLOUD is assumed to be the default anyway, we may need an ENABLE_GCLOUD_TRANSCRIPT variable? The run file could then check it and set GOOGLE_APPLICATION_CREDENTIALS to /config/key.json only if it is enabled.

saghul commented 2 years ago

> Since GCLOUD is assumed to be the default anyway, we may need an ENABLE_GCLOUD_TRANSCRIPT variable?

I think we can infer it if the other gcloud variables are set, can't we?

The rest sounds good to me! How about you open a PR and we can discuss any details there?
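
Inferring it could look roughly like the sketch below, which treats Google Cloud as enabled only when every GC_* variable from the 10-config check is non-empty. `gcloud_configured` is a hypothetical helper name used for illustration.

```bash
#!/bin/bash
# gcloud_configured (hypothetical): succeeds only if all GC_* variables that
# 10-config checks are set to a non-empty value.
gcloud_configured() {
    local name
    for name in GC_PROJECT_ID GC_PRIVATE_KEY_ID GC_PRIVATE_KEY \
                GC_CLIENT_EMAIL GC_CLIENT_ID GC_CLIENT_CERT_URL; do
        # ${!name} is bash indirect expansion: the value of the variable named by $name
        [[ -n "${!name:-}" ]] || return 1
    done
    return 0
}

# Example usage (commented out):
# if gcloud_configured; then echo "writing /config/key.json"; fi
```

This avoids a separate switch: leaving the GC_* variables empty is enough to skip the Google path.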

janonym1 commented 2 years ago

Done: https://github.com/jitsi/docker-jitsi-meet/pull/1343