k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0

Build error on MacOS 14.5 with go-api-example/real-time-speech-recognition-from-microphone #1097

Open iAInNet opened 2 weeks ago

iAInNet commented 2 weeks ago

gcc -v

Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: x86_64-apple-darwin23.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/xxxx/Library/Caches/go-build"
GOENV="/Users/xxx/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GONOPROXY="*.xxx.com"
GONOSUMDB="*.xxx.com"
GOOS="darwin"
GOPATH="/Users/xxx/go"
GOPROXY="https://goproxy.cn"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.17.13"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/xxxx/repositories/sherpa-onnx/go-api-examples/real-time-speech-recognition-from-microphone/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/g9/1q2wk6xs11lf96qjwdp7n0yh0000gp/T/go-build3146455420=/tmp/go-build -gno-record-gcc-switches -fno-common"

pkg-config --cflags --libs portaudio-2.0

-I/usr/local/Cellar/portaudio/19.7.0/include -L/usr/local/Cellar/portaudio/19.7.0/lib -lportaudio -framework CoreAudio -framework AudioToolbox -framework AudioUnit -framework CoreFoundation -framework CoreServices

But when I run go build (or GOOS=darwin GOARCH=amd64 go build), I get the following error:

# github.com/csukuangfj/portaudio-go
ld: library 'sherpa-onnx-portaudio' not found
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Could someone tell me what I'm missing?

csukuangfj commented 2 weeks ago

Are you using the latest package?

csukuangfj commented 2 weeks ago

Please show your complete logs, including the version of sherpa-onnx-go-macos.

iAInNet commented 2 weeks ago

Are you using the latest package?

Yes, I pulled the latest code from the master branch. I didn't switch to another branch.

Please show your complete logs, including the version of sherpa-onnx-go-macos.

go: finding module for package github.com/csukuangfj/portaudio-go
go: finding module for package github.com/spf13/pflag
go: finding module for package github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx
go: found github.com/csukuangfj/portaudio-go in github.com/csukuangfj/portaudio-go v1.0.4
go: found github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx in github.com/k2-fsa/sherpa-onnx-go v1.8.7
go: found github.com/spf13/pflag in github.com/spf13/pflag v1.0.5
go: finding module for package github.com/k2-fsa/sherpa-onnx-go-linux
go: finding module for package github.com/k2-fsa/sherpa-onnx-go-macos
go: finding module for package github.com/k2-fsa/sherpa-onnx-go-windows
go: found github.com/k2-fsa/sherpa-onnx-go-linux in github.com/k2-fsa/sherpa-onnx-go-linux v1.10.11
go: found github.com/k2-fsa/sherpa-onnx-go-macos in github.com/k2-fsa/sherpa-onnx-go-macos v1.10.10
go: found github.com/k2-fsa/sherpa-onnx-go-windows in github.com/k2-fsa/sherpa-onnx-go-windows v1.10.10

As shown above, sherpa-onnx-go-macos is at v1.10.10.

csukuangfj commented 2 weeks ago

[Screenshot 2024-07-09 at 14 30 44]

I am also using macOS amd64, but I cannot reproduce your issue.

csukuangfj commented 2 weeks ago

I have checked portaudio-go https://github.com/csukuangfj/portaudio-go/blob/e90d7285b794fb9359689370167ad9eb15633285/portaudio.go#L15

/*
#cgo !windows pkg-config: portaudio-2.0
#cgo windows CFLAGS: -I${SRCDIR}
#cgo windows !386 LDFLAGS: -L ${SRCDIR}/x86_64 -lsherpa-onnx-portaudio
#cgo windows !amd64 LDFLAGS: -L ${SRCDIR}/386 -lsherpa-onnx-portaudio
#include "portaudio.h"
extern PaStreamCallback* paStreamCallback;
*/

You can see that only Windows requires -lsherpa-onnx-portaudio.

iAInNet commented 2 weeks ago

As you say, it should build using #cgo !windows pkg-config: portaudio-2.0, but on my machine it always ends up linking with -lsherpa-onnx-portaudio. So I commented out the following three lines:

#cgo windows CFLAGS: -I${SRCDIR}
#cgo windows !386 LDFLAGS: -L ${SRCDIR}/x86_64 -lsherpa-onnx-portaudio
#cgo windows !amd64 LDFLAGS: -L ${SRCDIR}/386 -lsherpa-onnx-portaudio

After that, the build succeeded.

When I run the command:

./real-time-speech-recognition-from-microphone \
  --encoder ./icefall-asr-zipformer-streaming-wenetspeech-20230615/exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx \
  --decoder ./icefall-asr-zipformer-streaming-wenetspeech-20230615/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx \
  --joiner ./icefall-asr-zipformer-streaming-wenetspeech-20230615/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx \
  --tokens ./icefall-asr-zipformer-streaming-wenetspeech-20230615/data/lang_char/tokens.txt \
  --model-type zipformer2

I got a new error, raised at https://github.com/k2-fsa/sherpa-onnx/blob/3e4307e2fb88d4b1b648211c14f2fff6db11bca4/go-api-examples/real-time-speech-recognition-from-microphone/main.go#L81

panic: Input overflowed

There may be a connection between these two issues.

csukuangfj commented 2 weeks ago

Could you add some logs to check how many times

 chk(s.Read()) 

is executed? Does it fail at the first run of the for loop?

iAInNet commented 2 weeks ago

is executed?

Yes. It prints: Recognizer created! Started! Please speak

Does it fail at the first run of the for loop?

No. The loop runs about six times, and then it panics.

my audio input:

48K, 1 ch 16-bit integer

I changed param.SampleRate = 16000 to param.SampleRate = 48000 in the example code, but it still does not work. It reports the same error after six iterations of the loop.

csukuangfj commented 2 weeks ago

Do you use the builtin microphone or use a USB microphone?

iAInNet commented 2 weeks ago

Do you use the builtin microphone or use a USB microphone?

a USB microphone.

I switched to the built-in microphone (audio input: 48K, 1 channel, 32-bit float) and it worked. With the USB microphone, maybe I have to normalize the input samples to [-1, 1]. I will give it a try.
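
As a rough illustration of what such a normalization could look like, the sketch below scales signed 16-bit PCM samples into [-1, 1) by dividing by 32768. The function name `int16ToFloat32` is hypothetical. Whether the USB microphone path actually delivers unscaled samples to the recognizer is an assumption that this thread does not confirm.

```go
package main

import "fmt"

// int16ToFloat32 scales signed 16-bit PCM samples into the range [-1, 1).
func int16ToFloat32(in []int16) []float32 {
	out := make([]float32, len(in))
	for i, v := range in {
		out[i] = float32(v) / 32768.0
	}
	return out
}

func main() {
	// Sample values chosen for illustration only.
	pcm := []int16{-32768, -16384, 0, 16384, 32767}
	fmt.Println(int16ToFloat32(pcm))
}
```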

Now only one question remains: I have no clue why I have to comment out the following three lines when I run go build.

#cgo windows CFLAGS: -I${SRCDIR}
#cgo windows !386 LDFLAGS: -L ${SRCDIR}/x86_64 -lsherpa-onnx-portaudio
#cgo windows !amd64 LDFLAGS: -L ${SRCDIR}/386 -lsherpa-onnx-portaudio
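
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the cgo documentation says a #cgo directive applies on systems satisfying one of the listed constraints, and those constraints use the older +build syntax, in which space-separated terms are ORed and comma-separated terms are ANDed. Under that reading, `windows !386` means "windows OR not 386", which holds on darwin/amd64, whereas `windows,!386` would mean "windows AND not 386". The toy evaluator below illustrates the difference. The function `matches` is written for this example and is not part of the Go toolchain.

```go
package main

import (
	"fmt"
	"strings"
)

// matches evaluates an old-style build constraint against a set of tags.
// Space-separated options are ORed, comma-separated terms inside one
// option are ANDed, and a leading "!" negates a term.
func matches(constraint string, tags map[string]bool) bool {
	for _, option := range strings.Fields(constraint) {
		ok := true
		for _, term := range strings.Split(option, ",") {
			want := true
			if strings.HasPrefix(term, "!") {
				want = false
				term = term[1:]
			}
			if tags[term] != want {
				ok = false
				break
			}
		}
		if ok {
			return true
		}
	}
	return false
}

func main() {
	// Tags assumed for a macOS x86_64 build.
	mac := map[string]bool{"darwin": true, "amd64": true}

	fmt.Println(matches("windows !386", mac))   // true: "!386" alone is satisfied
	fmt.Println(matches("windows !amd64", mac)) // false: neither option is satisfied
	fmt.Println(matches("windows,!386", mac))   // false: "windows" must also hold
}
```

If this reading is right, the `windows !386` LDFLAGS line would be picked up on macOS amd64, which is consistent with the linker looking for sherpa-onnx-portaudio there. It does not explain why the same build works on another macOS amd64 machine, so treat it as a lead to check, not a conclusion.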

iAInNet commented 2 weeks ago

After about a minute and a half, the program still panics with the same error at line 81, chk(s.Read()):

panic: Input overflowed

It outputs some segments, then panics.

0:  its a great its um a little bit no king i um and its a being a long process to get here
1: lifting you know chair out of the way broken glass alice senatory items i dont want to get into but a it was horrified to say the list
2:  suppylo customs as usual a big list ear especially from recent movies like thor and the green london
3:  set louis take load series open on the cool night his was forty nine degrees at the start off the game
4:  the project to put them online is expected to be compilated by
5:  twenty one sixteen
6:  family leftware verm by boat bound by thailand basile bows was intercepted by parrot after ears of trying the father located his son in the talent

panic: Input overflowed

goroutine 1 [running]:
main.chk(...)
    /Users/xxxx/repositories/sherpa-onnx/go-api-examples/real-time-speech-recognition-from-microphone/main.go:108
main.main()
    /Users/xxxx/repositories/sherpa-onnx/go-api-examples/real-time-speech-recognition-from-microphone/main.go:81 +0xb10

csukuangfj commented 2 weeks ago

What is the RTF of the current model on your Mac?