microsoft / psi

Platform for Situated Intelligence
https://github.com/microsoft/psi/wiki

PsiStoreTool for ASR #328

Open kontogiorgos opened 2 months ago

kontogiorgos commented 2 months ago

Hi,

I am trying to export speech recognition results to CSV using PsiStoreTool. What is the right way to do that?

Using the save option from the tool, I get the following error:

Error: Pipeline 'default' was terminated because of one or more unexpected errors (Unknown schema type name (Microsoft.Psi.Speech.IStreamingSpeechRecognitionResult, Microsoft.Psi.Speech, Version=0.19.100.1, Culture=neutral, PublicKeyToken=null). A synonym may be needed (see KnownSerializers.RegisterDynamicTypeSchemaNameSynonym())) (Cannot perform runtime binding on a null reference)

Thanks!

danbohus commented 1 month ago

I looked into this a bit, and there are several issues in play that explain why PsiStoreTool crashes on this.

My simple recommendation would be to write your own very small psi exporter program for this specific purpose: a pipeline that reads the speech recognition results stream and writes the information in the format you want (with the columns you want) to a CSV file.

Now, back to PsiStoreTool and why it crashed. First, PsiStoreTool does not know the type Microsoft.Psi.Speech.IStreamingSpeechRecognitionResult because it doesn't have a reference to the Microsoft.Psi.Speech project, where that type is defined. I think the "save messages" functionality in PsiStoreTool was intended for simple system types, but one could imagine adding a command-line option that lets the user specify a list of additional DLLs to load dynamically (so that the tool can find the types it needs), like PsiStudio does. That would be a nice feature to add, and we'd welcome a PR for it.

Since we don't have that feature yet, as a test I manually added a reference to Microsoft.Psi.Speech in PsiStoreTool and recompiled it. This did get me past the error you reported above, but unfortunately I then ran into another problem, which I think may have to do with the fact that the persisted stream type is an interface rather than a class. We plan to investigate this further, but it will likely take longer.

So coming back to my original suggestion, I think the easiest/fastest path forward for you would be to write a custom exporter for your purposes.

kontogiorgos commented 1 month ago

Thank you, Dan! I agree that the easiest way is to write a custom exporter. It is also not clear what the right way would be for PsiStoreTool to store non-incremental ASR output, since it is not really a continuous stream in the way a LogEnergy stream is.
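
For anyone landing here later, one possible way to think about the CSV layout is as one row per final recognition result, keyed by the message's originating time, rather than as a regularly sampled signal. The sketch below is plain Python and does not use any psi API; the actual exporter would be a small psi pipeline in C#. The `RecoMessage` class, its field names, the column names, and the sample messages are all assumptions made up for illustration.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional, TextIO


@dataclass
class RecoMessage:
    """Hypothetical stand-in for one speech recognition message read from a store."""
    originating_time: datetime
    text: str
    confidence: Optional[float]
    is_final: bool


def export_final_results(messages: Iterable[RecoMessage], out: TextIO) -> int:
    """Write one CSV row per final result and return the number of rows written."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["originating_time", "text", "confidence"])
    rows = 0
    for message in messages:
        # Partial (incremental) hypotheses are skipped; only final results are exported.
        if not message.is_final:
            continue
        confidence = "" if message.confidence is None else f"{message.confidence:.3f}"
        writer.writerow([message.originating_time.isoformat(), message.text, confidence])
        rows += 1
    return rows


if __name__ == "__main__":
    # Made-up sample messages, for illustration only.
    start = datetime(2024, 1, 1, 12, 0, 0)
    sample = [
        RecoMessage(start, "turn", 0.41, False),
        RecoMessage(start + timedelta(seconds=1), "turn left, then stop", 0.93, True),
        RecoMessage(start + timedelta(seconds=5), "ok", None, True),
    ]
    buffer = io.StringIO()
    count = export_final_results(sample, buffer)
    print(f"{count} rows written")
    print(buffer.getvalue())
```

With this layout the rows are irregularly spaced events, so the file does not need to pretend that ASR output is sampled at a fixed rate the way an energy signal is. The `csv` module quotes any text that contains commas.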