tue-robotics / tue_robocup

RoboCup challenge implementations
https://github.com/orgs/tue-robotics/projects/2

[Story] Speech Recognition #1012

Open ar13pit opened 4 years ago

ar13pit commented 4 years ago

Stage 1

Stage 2 (Enhancements to Yapykaldi)

Low priority but important

LoyVanBeek commented 4 years ago

I'll take up the "Create a ros HMI client wrapper for yapykaldi (yapykaldi_ros)" task.

ar13pit commented 4 years ago

> I'll take up the "Create a ros HMI client wrapper for yapykaldi (yapykaldi_ros)" task.

Great. One feature is needed there, though: caching of the grammar string. If the same grammar was sent in the previous HMI query, the already compiled speech model should be reused, because compiling the speech model on every query would be very expensive.
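A minimal sketch of that caching idea, assuming a hypothetical `build_model` callable that stands in for the expensive model compilation. The names `GrammarModelCache` and `fake_build_model` are made up for illustration and are not part of yapykaldi or yapykaldi_ros:

```python
import hashlib


class GrammarModelCache(object):
    """Reuse the compiled model when the same grammar string is sent again."""

    def __init__(self, build_model):
        # build_model: hypothetical callable, grammar string -> compiled model
        self._build_model = build_model
        self._last_key = None
        self._last_model = None
        self.builds = 0  # counts how often the expensive step actually ran

    def get(self, grammar):
        # Hash the grammar so only a short key has to be kept and compared
        key = hashlib.sha256(grammar.encode("utf-8")).hexdigest()
        if key != self._last_key:
            # Grammar differs from the previous query: rebuild the model
            self._last_model = self._build_model(grammar)
            self._last_key = key
            self.builds += 1
        return self._last_model


def fake_build_model(grammar):
    # Placeholder for the expensive compilation step (assumption)
    return {"grammar": grammar}


cache = GrammarModelCache(fake_build_model)
first = cache.get("T -> yes | no")
second = cache.get("T -> yes | no")      # same grammar: cached model is reused
third = cache.get("T -> left | right")   # different grammar: model is rebuilt
```

This only remembers the most recent grammar, which matches the "same grammar as in the previous HMI query" case. If the robot alternates between a handful of grammars, a small dictionary keyed on the hash might be a better fit.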

LoyVanBeek commented 4 years ago

All right. I've spent most of my time setting up the dependencies, and I think I need to be on 18.04, so I'll continue next week.

LarsJanssenTUe commented 4 years ago

> All right. I've spent most of my time setting up the dependencies, and I think I need to be on 18.04, so I'll continue next week.

Needing to be on 18.04 would be undesirable unless it is strictly necessary. Why is this needed, @ar13pit?

ar13pit commented 4 years ago

No, it's not needed. I wrote that in the README because I explicitly tested it on 18.04, but it will work on 16.04 as well, as long as CUDA is not being used and gcc7 is used.

LoyVanBeek commented 4 years ago

Hmm, Kaldi complained about needing CMake 3.12 or something, while my 16.04 box has 3.5, and I could not get a higher version installed yet.

ar13pit commented 4 years ago

Did you try running the command tue-get install python-yapykaldi? It should install all the dependencies and build Kaldi and yapykaldi.

However, if you don't want to do that, go with pip2 install --user cmake. You won't have to spend time building CMake from source.
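To confirm that the pip-installed CMake is the one being picked up, the first line of `cmake --version` can be checked against the required minimum. A small sketch, assuming the minimum is 3.12 (the exact number was only remembered roughly above) and using hypothetical helper names:

```python
import re


def parse_cmake_version(version_output):
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("could not find a CMake version in the given text")
    # A missing patch number is treated as 0
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(version_output, minimum=(3, 12, 0)):
    # minimum=(3, 12, 0) is an assumption based on the error described above
    return parse_cmake_version(version_output) >= minimum


old_box = meets_minimum("cmake version 3.5.1")    # stock 16.04 CMake
new_box = meets_minimum("cmake version 3.12.0")
```

The version text can be obtained by running `cmake --version` in the same shell that is used for the build, so that a `~/.local/bin` entry on the PATH is taken into account.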

LoyVanBeek commented 4 years ago

@ar13pit is https://github.com/gooofy/zamia-speech/#model-adaptation what you meant by "Zamia speech JSGF"?

There is also https://pypi.org/project/pyjsgf/

ar13pit commented 4 years ago

Yup and that section is a huge pile of shit.

LoyVanBeek commented 4 years ago

@ar13pit https://github.com/tue-robotics/yapykaldi_ros/pull/1 and https://github.com/tue-robotics/yapykaldi/pull/1 can be reviewed

LoyVanBeek commented 4 years ago

Decide where parsing of the output of yapykaldi into semantics must be done: in yapykaldi or in the ROS wrapper. I currently have this in the ROS wrapper, and I think that makes the most sense too.

Same for the conversion to JSGF format: I'd say that happens in the ROS wrapper, and the JSGF grammar is passed to Asr.recognize or Asr.start.

ar13pit commented 4 years ago

I'm actually getting rid of JSGF completely, as it is also just an intermediate format. Instead, I'll keep the NLTK grammar object, as that is pretty close to our grammar parser. So this object is passed on to Asr.recognize and a new ASR model is created.