microsoft / psi

Platform for Situated Intelligence
https://github.com/microsoft/psi/wiki

SIGMA initialization is stuck at 80% for more than 30 minutes #335

Open Fenierxu opened 1 week ago

Fenierxu commented 1 week ago

Hi, I have encountered some problems when running SIGMA. After configuring and running SIGMA according to “How to install, configure and run SIGMA”, the server displays “RENDEZVOUS ERROR: Cannot find a recognizer with the required ID” in the terminal. At the same time, the initialization of the SIGMA app on my HoloLens2 has been stuck at 80% for more than 30 minutes.

Specifically, after I got Azure Speech Services, I set the appropriate environment variables based on my key and region, and created a text file called “CognitiveServicesSpeechKey.txt”. I use local anchors because I found that Azure Spatial Anchors can no longer be registered. When I open SIGMA and click START, two options for anchors appear; no matter which one I choose, the initialization ends up stuck at 80%. The corresponding pictures are shown below. I would like to know what causes this behavior.

Thank you for your time and consideration. I'm looking forward to your reply and any response is appreciated!

Image Image Image

danbohus commented 1 week ago

It sounds like the server is encountering an exception on start-up. The error "Cannot find a recognizer with the required ID" is not generated by the Sigma source-code per se, but probably by one of the underlying components ... Can you provide more details from the console output on the server terminal? Below the "Cannot find a recognizer ..." there should be a full stack trace that might give us more clues about what the source of the problem is.

Fenierxu commented 1 week ago

Thanks, Dan. When I start SIGMA on the HoloLens2 and run the SigmaComputeServer project in VS 2022, the server terminal shows the following output:

========================================================
STARTING SIGMA COMPUTE SERVER @ 11/14/2024 17:14:37.4195

Using Native MKL.
Math.NET Numerics Configuration:
Version 4.9.1
Built for .Net Framework 4.6.1
Linear Algebra Provider: Intel MKL (x64; revision 14; ahead revision 12; MKL 2020.0 Update 4)
Fourier Transform Provider: Intel MKL (x64; revision 14; ahead revision 12; MKL 2020.0 Update 4)
Max Degree of Parallelism: 6
Parallelize Elements: 300
Parallelize Order: 64
Check Distribution Parameters: True
Thread-Safe RNGs: True
Operating System: Microsoft Windows NT 6.2.9200.0
Framework: 4.0.30319.42000

Available configurations: Diamond
[11/14/2024 17:14:37.4725]: Listening on TCP port 13331 for client app (Sigma).
[11/14/2024 17:14:37.4735]: Be sure to check firewall settings (may need to enable Public).
[11/14/2024 17:14:37.4735]: Press any key to exit.
[11/14/2024 17:15:26.8844]:
[11/14/2024 17:15:26.8854]: PROCESS ADDED: Diamond
[11/14/2024 17:15:26.8854]: ENDPOINT: Remote Clock 192.168.1.105 11511
[11/14/2024 17:15:26.8854]: ENDPOINT: TCP 192.168.1.105 15000
[11/14/2024 17:15:26.8864]: STREAM: HoloLensStreams.Audio
[11/14/2024 17:15:26.8864]: ENDPOINT: TCP 192.168.1.105 15001
[11/14/2024 17:15:26.8864]: STREAM: HoloLensStreams.VideoEncodedImageCameraView
[11/14/2024 17:15:26.8874]: ENDPOINT: TCP 192.168.1.105 15002
[11/14/2024 17:15:26.8874]: STREAM: HoloLensStreams.PreviewEncodedImageCameraView
[11/14/2024 17:15:26.8874]: ENDPOINT: TCP 192.168.1.105 15003
[11/14/2024 17:15:26.8884]: STREAM: HoloLensStreams.DepthImageCameraView
[11/14/2024 17:15:26.8884]: ENDPOINT: TCP 192.168.1.105 15004
[11/14/2024 17:15:26.8884]: STREAM: HoloLensStreams.WorldSpatialAnchorId
[11/14/2024 17:15:26.8884]: ENDPOINT: TCP 192.168.1.105 15005
[11/14/2024 17:15:26.8894]: STREAM: HoloLensStreams.PipelineDiagnostics
[11/14/2024 17:15:26.8894]: ENDPOINT: TCP 192.168.1.105 17000
[11/14/2024 17:15:26.8894]: STREAM: UserInterfaceStreams.UserInterfaceState
[11/14/2024 17:15:26.8894]: ENDPOINT: TCP 192.168.1.105 17001
[11/14/2024 17:15:26.8894]: STREAM: UserInterfaceStreams.EyesAndHead
[11/14/2024 17:15:26.8904]: ENDPOINT: TCP 192.168.1.105 17002
[11/14/2024 17:15:26.8904]: STREAM: UserInterfaceStreams.Hands
[11/14/2024 17:15:26.8904]: ENDPOINT: TCP 192.168.1.105 17003
[11/14/2024 17:15:26.8904]: STREAM: UserInterfaceStreams.SystemAudio
[11/14/2024 17:15:26.8914]: ENDPOINT: TCP 192.168.1.105 17004
[11/14/2024 17:15:26.8914]: STREAM: UserInterfaceStreams.SpeechSynthesisProgress
[11/14/2024 17:15:26.8924]: ENDPOINT: TCP 192.168.1.105 17005
[11/14/2024 17:15:26.8924]: STREAM: UserInterfaceStreams.DebugInfo
[11/14/2024 17:15:26.8934]: Starting Pipeline Diamond
[11/14/2024 17:15:27.0235]: Connecting to clock sync ...DONE.
[11/14/2024 17:15:27.5254]: Creating compute server pipeline, exporting to 20241114-171526.
[11/14/2024 17:15:27.8827]:
[11/14/2024 17:15:27.9227]: RENDEZVOUS ERROR: Cannot find a recognizer with the required ID.
Parameter name: culture
   at System.Speech.Recognition.SpeechRecognitionEngine..ctor(CultureInfo culture)
   at Microsoft.Psi.Speech.SystemSpeech.CreateSpeechRecognitionEngine(String language, GrammarInfo[] grammars) in D:\xyf_software\psi\psi-master\Sources\Speech\Microsoft.Psi.Speech.Windows\SystemSpeech.cs:line 63
   at Microsoft.Psi.Speech.SystemVoiceActivityDetector.CreateSpeechRecognitionEngine() in D:\xyf_software\psi\psi-master\Sources\Speech\Microsoft.Psi.Speech.Windows\SystemVoiceActivityDetector.cs:line 197
   at Microsoft.Psi.Speech.SystemVoiceActivityDetector..ctor(Pipeline pipeline, SystemVoiceActivityDetectorConfiguration configuration, String name) in D:\xyf_software\psi\psi-master\Sources\Speech\Microsoft.Psi.Speech.Windows\SystemVoiceActivityDetector.cs:line 94
   at Microsoft.Psi.Speech.Resources.<>c.b__0_6(Pipeline p) in D:\xyf_software\psi\psi-master\Sources\Speech\Microsoft.Psi.Speech.Windows\Resources.cs:line 28
   at Microsoft.Psi.MixedReality.Applications.SpeechRecognitionPipeline..ctor(Pipeline pipeline, SpeechRecognitionPipelineConfiguration configuration, String name) in D:\xyf_software\psi\psi-master\Applications\Microsoft.Psi.MixedReality.Applications\SpeechRecognitionPipeline.cs:line 49
   at Sigma.SigmaComputeServerPipeline`8.Initialize() in D:\xyf_software\psi\psi-master\Applications\Sigma\Sigma\SigmaComputeServerPipeline.cs:line 192
   at Sigma.SigmaLiveComputeServer.CreateComputeServerPipeline(SigmaComputeServerPipelineConfiguration configuration, HoloLensStreams hololensStreams, Process inputRendezvousProcess, Process outputRendezvousProcess, Exporter exporter) in D:\xyf_software\psi\psi-master\Applications\Sigma\Sigma\SigmaComputeServer\SigmaLiveComputeServer.cs:line 48
   at Microsoft.Psi.MixedReality.Applications.LiveComputeServer`1.CreateAndRunComputeServerPipeline(Process inputRendezvousProcess) in D:\xyf_software\psi\psi-master\Applications\Microsoft.Psi.MixedReality.Applications\LiveComputeServer.cs:line 229
   at Microsoft.Psi.MixedReality.Applications.LiveComputeServer`1.b__150(Object , Process process) in D:\xyf_software\psi\psi-master\Applications\Microsoft.Psi.MixedReality.Applications\LiveComputeServer.cs:line 90
   at System.EventHandler`1.Invoke(Object sender, TEventArgs e)
   at Microsoft.Psi.Interop.Rendezvous.Rendezvous.TryAddProcess(Process process) in D:\xyf_software\psi\psi-master\Sources\Runtime\Microsoft.Psi.Interop\Rendezvous\Rendezvous.cs:line 73
   at Microsoft.Psi.Interop.Rendezvous.RendezvousRelay.ReadProcessUpdate(BinaryReader reader) in D:\xyf_software\psi\psi-master\Sources\Runtime\Microsoft.Psi.Interop\Rendezvous\RendezvousRelay.cs:line 141
[11/14/2024 17:15:27.9377]: Stopping Compute Server Pipeline @2024/11/14 17:15:27.
[11/14/2024 17:15:27.9377]: Removed LiveComputeServer process @2024/11/14 17:15:27.
[11/14/2024 17:15:27.9647]: Stopped Compute Server Pipeline @2024/11/14 17:15:27.

D:\xyf_software\psi\psi-master\Applications\Sigma\SigmaComputeServer\bin\x64\Release\net472\SigmaComputeServer.exe (Process 22652) has exited with code 0.
Press any key to close this window...

The screenshot corresponding to the stack trace is shown below: Image

I suspect that this issue is related to my speech configuration, but I'm not sure what the specific problem is. My detailed configuration process is as follows:

  1. For Prerequisites, I completed the prerequisites according to your instructions. Note that I only enabled Developer Mode on my desktop; I did not enable Device Portal or Device discovery there because they are not mentioned in the instructions.

  2. For Build and Deploy SIGMA, I've built and deployed SIGMA successfully.

  3. For Server (SigmaComputeServer) Configuration, I created and edited 'sigma.server.config.xml'. I don't use LLM queries, so I left LLMQueryLibraryFilename empty and changed TaskGenerationPolicy to 'FromLibraryOnly'. The corresponding screenshot is shown below. Image

  4. For Additional Server Configuration, I obtained the Azure Speech service and created the appropriate environment variables based on my key and region. Note that I am new to Azure: I just created a resource group, added the Azure Speech service to it, and set the corresponding key and region as environment variables. I did no other configuration in Azure, and I would like to know whether I missed something that prevents SIGMA from running successfully. Image

  5. For Client (SigmaApp) Configuration, I have finished editing the 'sigma.client.config.xml' file and uploaded it back to the HoloLens2.

  6. For Additional Client Configuration, I created a text file named 'CognitiveServicesSpeechKey.txt' containing the key and uploaded it to the HoloLens2. I use the local spatial anchor. Although the initialization of SIGMA is stuck at 80%, I can see an anchor named ' _world' when I click 'Show World Anchor'. When I checked 'sigma.client.config.xml', I found that the WorldSpatialAnchorId property had appeared, with the value ' _world'. Image Image

The above is my configuration process. I want to know if I have missed some configurations or configured something incorrectly. I would appreciate any reply and suggestions!

chitsaw commented 1 week ago

Thanks for the detailed info. It looks like the SystemVoiceActivityDetector language/culture defaults to en-us, and we are not properly handling cases where the en-us language pack is not installed on the system.

You can try either installing the en-us language pack (US English) in Windows or, as a quick workaround, changing the Language property in SystemVoiceActivityDetectorConfiguration to match your system language/culture code (zh-cn?).
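To see which cultures the workaround above can actually use, it may help to list the System.Speech recognizers installed on the server machine. The following is a minimal sketch, not part of the Sigma source: the class `RecognizerCheck` and the helper `PickRecognizerLanguage` are hypothetical names introduced for illustration, and the helper does only an exact culture-name match. It relies only on `SpeechRecognitionEngine.InstalledRecognizers()` from System.Speech, which is Windows-only and requires a reference to the System.Speech assembly.

```csharp
using System;
using System.Linq;
using System.Speech.Recognition; // Windows-only; requires a reference to System.Speech

public static class RecognizerCheck
{
    // Hypothetical helper: returns the requested culture name if a recognizer
    // with exactly that culture is installed; otherwise falls back to the first
    // installed recognizer's culture, or null if no recognizer is installed.
    public static string PickRecognizerLanguage(string requested)
    {
        var installed = SpeechRecognitionEngine.InstalledRecognizers()
            .Select(r => r.Culture.Name)
            .ToList();

        var match = installed.FirstOrDefault(
            name => string.Equals(name, requested, StringComparison.OrdinalIgnoreCase));

        return match ?? installed.FirstOrDefault();
    }

    public static void Main()
    {
        // Print every installed recognizer with its culture code.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine($"{info.Culture.Name}: {info.Name}");
        }

        // "en-us" is the default mentioned in this thread.
        string language = PickRecognizerLanguage("en-us");
        Console.WriteLine(language == null
            ? "No System.Speech recognizers are installed."
            : $"Language to use: {language}");
    }
}
```

If en-US does not appear in the list, the culture code that is printed would be a candidate value for the Language property of SystemVoiceActivityDetectorConfiguration.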

Fenierxu commented 1 week ago

Thanks for your reply. I replaced en-us with zh-cn and SIGMA now runs successfully. Thank you very much for your help!