Problem with Barracuda.Tensor

mbaske / ml-audio-sensor

Audio Sensor Component for Unity ML-Agents

MIT License


Problem with Barracuda.Tensor #2

Open barleymalt opened 2 years ago

barleymalt commented 2 years ago

Hi! I'm having trouble setting up a speech-recognizing agent. It works similarly to the audio agent that recognizes numbers, but it's based on a model that was trained with PyTorch on invented words. I have little understanding of how the audio sensor works under the hood, so I'm just taking a model that was made by someone else and passing it in through the Behavior Parameters component. Now I keep getting this error and I don't know why: [screenshots: error; error stack trace, part 1; error stack trace, part 2]

I'm just debugging blindly, hoping to find the reason why this error keeps coming up. I have no idea what could be generating it, and if anyone could point me in some direction it would be great. Thank you

mbaske commented 2 years ago

Sorry, but I don't think this can work with pre-trained models, because my sensor setup requires specific input data. I developed this as an experiment to see if audio recognition would be possible at all within the ML-Agents framework. For that, the sensor utilizes the visual observation pipeline: audio sample values are effectively encoded as pixel data. However, you can try using the number recognizer scene as a starting point and train an agent with words instead. Since ML-Agents does reinforcement learning rather than classification, output values are represented as discrete actions. So for the number recognizer, each number has an associated action, and the agent is rewarded for picking the correct one. If you do this with words, you would likely need to increase the action space accordingly.
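
As a rough illustration of the two ideas above (samples encoded as pixel data, and words mapped to discrete actions), here is a minimal Python sketch. It is not the sensor's actual implementation. The function names, the [-1, 1] to [0, 1] mapping, the padding value, the word list, and the +1/-1 reward values are all assumptions made for the example.

```python
from typing import List, Sequence


def encode_samples_as_pixels(samples: Sequence[float], width: int) -> List[List[float]]:
    """Pack mono audio samples in [-1, 1] into rows of grayscale values in [0, 1].

    Hypothetical helper: the real sensor may use a different layout or scaling.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Map each sample from [-1, 1] to [0, 1], clamping out-of-range values.
    pixels = [min(1.0, max(0.0, (s + 1.0) / 2.0)) for s in samples]
    # Pad the last row with 0.5 (the encoding of a zero sample) so all rows are full.
    remainder = len(pixels) % width
    if remainder:
        pixels.extend([0.5] * (width - remainder))
    # Split the flat pixel list into rows of `width` pixels each.
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def reward_for_action(action: int, target: int) -> float:
    """Assumed reward scheme: +1 for the correct discrete action, -1 otherwise."""
    return 1.0 if action == target else -1.0


# Hypothetical vocabulary: one discrete action index per word,
# so adding words means growing the action space.
WORDS = ["zero", "one", "two"]

image = encode_samples_as_pixels([-1.0, 0.0, 1.0, 0.5, -0.5], width=3)
reward = reward_for_action(action=2, target=WORDS.index("two"))
```

In this sketch the size of the discrete action branch equals `len(WORDS)`, which is why a larger vocabulary needs a larger action space.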

barleymalt commented 2 years ago

Hi, thanks for the answer! I did some more research over the past few days, downloaded the original project from your repository, and tried to build for iOS. To my surprise, the same error that showed up on iOS also appears in the editor on macOS when iOS is set as the target platform.

(I couldn't find this out before because I was using the cloud build)

The project is vanilla, so it shouldn't have to do with the model we are using. Any ideas about what could be the cause of the error?

mbaske commented 2 years ago

Hi, I can't reproduce the error. Here's what I tried: I cloned the repo and opened the project in Unity 2020.3.23f1 Personal on macOS 11.6.2, then started the SpeechRecognition scene in the editor. Inference appears to work fine with the supplied model.

barleymalt commented 2 years ago

Did you also try to set iOS as the target?