cmusphinx / pocketsphinx

A small speech recognizer

make the documentation? #277

Open petercwallis opened 2 years ago

petercwallis commented 2 years ago

I would like to look at the documentation for the C API. I am guessing I need to type 'make', or possibly 'cmake', somewhere so that Doxygen produces the HTML files. I know enough to be dangerous, so it has taken close on half a day to not figure this out. Could "make docs" be a thing in the top-level directory? Or possibly { > make C-docs } ...

dhdaines commented 2 years ago

Hello, for the "5prealpha" release of a few years ago, the documentation is available here: https://cmusphinx.github.io/doc/pocketsphinx/

It will be updated for the first release candidate. To create it, you need Doxygen (https://doxygen.nl/) installed, and you can simply run "make docs" in your build directory, exactly as you say :)

dhdaines commented 2 years ago

(of course, this is not documented so I will leave this issue open until it is!)

petercwallis commented 2 years ago

Hi David, okay, I took a look and tried writing some documentation as I'd aspire to write it (attached). It does not reflect what is in the existing API, but for the life of me I cannot see how one would connect a microphone or open a file from that. Please take a look, and if you like it perhaps we could have a conversation about me writing and maintaining the documentation for pocketsphinx5. Note that it is full of inaccuracies, as my C is quite old and I only did minimal cpp before switching to Java, and I certainly have an OO mindset. The use-case approach is, however, the key, I think, to producing something people can use when they don't know how it works.

Best wishes and thanks again for taking an interest in resurrecting this, P.

dhdaines commented 2 years ago

Yes, because opening files and attaching to microphones is outside the scope of what a speech recognition engine should do; if you look at the others out there, they generally don't do it either. It is extremely platform dependent, and there are a variety of libraries which work way better than what was in PocketSphinx, which didn't work for most people.

I'm quite firm on this point as I don't have time to maintain yet another audio library (likewise, anything Java or really any language except C, Python, and JavaScript). I am working on a post for the CMUSphinx website which explains the rationale and provides alternatives.

petercwallis commented 2 years ago

Okay, it was just an offer of help. Drawing a line around the speech recognition engine is exactly what an API should do. The doxygen output lists the functions in an attempt to say how it works; an API should specify how to use it.

Providing one and only one "test harness" (in C) is what I'm suggesting - let others create their own platform/language dependent interfaces. For that to work however, the API needs to be crystal clear - and fit with the variety of libraries out there.

The only thing I'm quite firm on is that use-cases are the way to structure the manual.

let me know if I can help, P

ps Have you played with raspberry pis? At 35 quid it might not be unreasonable to suggest people set up a pi as a dedicated speech processor and communicate with that via serial connection.

petercwallis commented 2 years ago

... Gstreamer perhaps as a framework? https://gstreamer.freedesktop.org/documentation/application-development/index.html?gi-language=c

dhdaines commented 2 years ago

Yes, GStreamer is nice... the current pocketsphinx code is missing a plugin for it, but it is simply a matter of updating the build system.

If you would like to contribute documentation that would be fantastic - sorry if I suggested otherwise, in fact this is the best kind of contribution anyone could make! The main page for documentation is in the pocketsphinx.h header file, and I think this is a good place for it as it means that people can read it without necessarily having to run Doxygen.

lenzo-ka commented 2 years ago

pyaudio also works if you want live mic audio

petercwallis commented 2 years ago

That's great; let's see how it works out. I am not the easiest person to get on with so if it doesn't work out, well that is to be expected.

I didn't look too closely at GStreamer, but I got the impression that it would be the other way round, with ps being a (set of) plugin(s) for the GStreamer framework. See https://gstreamer.freedesktop.org/documentation/application-development/introduction/gstreamer.html?gi-language=c

Does this make sense?

dhdaines commented 2 years ago

Here is a post explaining the removal of audio, with the simplest possible example code: https://cmusphinx.github.io/2022/08/pocketsphinx-continuous/

I'm writing a second one to explain doing VAD once I complete the integration of the VAD code.
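
As a rough illustration of the split described in this thread (audio capture lives outside the recognizer, which only ever sees raw samples), here is a minimal sketch in C. It is not the code from the linked post and it does not use the PocketSphinx API: `pump_pcm`, `pcm_sink_fn`, `stats_sink`, and `struct pcm_stats` are hypothetical names invented for this example, and the 2048-sample chunk size is an arbitrary assumption. The loop reads 16-bit PCM from any `FILE *` and hands each chunk to a callback, which is where a real program would pass the samples on to the recognizer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callback type: receives one chunk of 16-bit PCM samples.
 * In a real program this is where the samples would be handed to the
 * recognizer; nothing here is part of the PocketSphinx API. */
typedef void (*pcm_sink_fn)(const int16_t *samples, size_t n_samples, void *user);

/* Read raw 16-bit PCM from `in` in fixed-size chunks until end of input,
 * passing each chunk to `sink`. Returns the total number of samples read. */
size_t pump_pcm(FILE *in, pcm_sink_fn sink, void *user)
{
    int16_t buf[2048]; /* chunk size is an arbitrary choice for this sketch */
    size_t total = 0, n;

    assert(in != NULL && sink != NULL);
    while ((n = fread(buf, sizeof buf[0], sizeof buf / sizeof buf[0], in)) > 0) {
        sink(buf, n, user);
        total += n;
    }
    return total;
}

/* Example sink: counts samples and tracks the peak absolute amplitude. */
struct pcm_stats { size_t n_samples; int peak; };

void stats_sink(const int16_t *samples, size_t n_samples, void *user)
{
    struct pcm_stats *st = user;
    for (size_t i = 0; i < n_samples; i++) {
        int v = samples[i] < 0 ? -(int)samples[i] : samples[i];
        if (v > st->peak)
            st->peak = v;
    }
    st->n_samples += n_samples;
}
```

A program built around this would call something like `pump_pcm(stdin, stats_sink, &stats)` and leave it to an external capture tool, or to a file redirected to standard input, to supply the samples. The code itself then stays free of platform-specific audio calls.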

petercwallis commented 2 years ago

Looking good! I installed sox easily enough on this Raspberry Pi, and tomorrow (between other things) I will try to get "simple.c" running. As a typical "user" (as opposed to an ASR researcher), one thing I can offer is to document my experience as a first-time user. I will do that and get back to you. p

dhdaines commented 2 years ago

Note that the simple example is ... very simple :) It will seem to do nothing until you hit Control-C, which maybe isn't optimal, so I will fix that in a second. It also requires the -hmm, -lm, and -dict arguments; this too should get fixed in a second.

petercwallis commented 2 years ago

Sorry I have been distracted the last few days and not done anything other than look at the code. Hope the VAD development is going smoothly; just wanted to let you know I am still very interested and will get more active soon. p

dhdaines commented 2 years ago

No problem at all! Thanks for your offer to help. I am hoping to make a first release candidate today - I will be on vacation for a couple of weeks starting Monday, so that will be a good time to find the problems and write some documentation.

petercwallis commented 2 years ago

... except I'm on vacation from Monday as well :-) (one week only though)

petercwallis commented 2 years ago

Hi David, I see you have got back and got busy. I have had a bit of a fiddle with some suggestions on how I'd do the documentation and got bogged down with sox. I need to get back to another project for a bit, so I thought I'd show you where I am heading in the hope you approve; perhaps you can fix my sox scripting and style, and make a few corrections with explanations that I should incorporate. I think my code is self-explanatory :-) but the gist is that, in order to draw a line around pocketsphinx vs the environment, the methods in this would be part of ps, and the main(..) would be test code to ensure the environment was set up and working: audio file to speaker; audio file to audio file; mic to audio file... and in the near future internet streamy thing to file, and microphone to web server. Just one format.

best, p

petercwallis commented 2 years ago

Okay! I am back on it now and have learnt how to use sox. I will continue to work on it this week (between things) and send you a draft text with sample code plus some stuff you would need to put in the pocket sphinx library. p
