Speech.jl? - Githubissues

jfsantos commented 10 years ago

I would like to start drafting a new package for speech signal processing, focused mainly on speech feature extraction (MFCCs, LPCs, fundamental frequency, etc.). @davidavdav has done a lot of work on MFCCs at MFCC.jl but, as of the last time we talked, it needed some updates. @davidavdav, would you mind chiming in with your comments and suggestions? Thanks!

davidavdav commented 10 years ago

MFCC is now in METADATA, and apart from default parameter settings that should mimic the HTK defaults (we could use some testing there), it has some other parameter sets (:rasta should mimic the default parameters of the rastamat package).

It further has various forms of feature normalization (mean/variance: znorm() and short time Gaussianization: warp()), derivatives (delta()) and shifted-delta-cepstra (sdc's, used in language recognition).
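To make the normalization and derivative steps above concrete, here is a minimal, language-neutral sketch in Python. It is not MFCC.jl's implementation: the names `znorm_sketch` and `delta_sketch` are hypothetical, the feature matrix is assumed to be a list of frames (one list of coefficients per frame), and the delta uses a simple two-point central difference with the edge frames repeated, which may differ from what the package does.

```python
from statistics import fmean, pstdev


def znorm_sketch(frames):
    """Scale each coefficient (column) to zero mean and unit variance over time."""
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in frames]


def delta_sketch(frames):
    """Central difference over time; the first and last frames are repeated at the edges."""
    n = len(frames)
    out = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2 for a, b in zip(prev_frame, next_frame)])
    return out


# Three frames with two coefficients each (made-up numbers, for illustration only).
features = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
normalized = znorm_sketch(features)
deltas = delta_sketch(features)
```

Applying `delta_sketch` to its own output would give second-order ("delta-delta") features under the same assumptions.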

We could use some additional code to compute PLP (perceptual linear prediction) coefficients and RASTA processing. Others might be interested in LPC estimation, pitch extraction, etc. For recognition this is not too useful, but for (re)synthesis it may be.

I have higher level code in https://github.com/davidavdav/Feacalc.jl.git which can read .wav files and has some trivial energy-based speech activity detection, and save/load routines in HDF5 (for compatibility with non-julia software).
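For readers unfamiliar with energy-based speech activity detection, the idea can be sketched roughly as follows. This is an illustration in Python, not the method used in Feacalc.jl: the function names, the frame length, the hop size, and the 30 dB margin below the loudest frame are all assumptions chosen for the example.

```python
import math


def frame_energies_db(samples, frame_len, hop):
    """Mean power of each frame in dB; a small floor avoids log(0) on silence."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        energies.append(10 * math.log10(power + 1e-12))
    return energies


def speech_frames_sketch(samples, frame_len=400, hop=160, margin_db=30.0):
    """Mark a frame as speech when its energy is within margin_db of the loudest frame."""
    energies = frame_energies_db(samples, frame_len, hop)
    if not energies:
        return []
    threshold = max(energies) - margin_db
    return [e > threshold for e in energies]


# 800 samples of silence followed by 800 samples of a loud alternating signal.
signal = [0.0] * 800 + [0.5, -0.5] * 400
decisions = speech_frames_sketch(signal, frame_len=400, hop=400)
```

A threshold relative to the loudest frame keeps the sketch independent of recording level, but it will mark something as speech even in a file that contains only noise.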

jfsantos commented 10 years ago

That's great! So since MFCC is in METADATA, maybe we should aim for a higher degree of specialization in speech-related packages rather than having a single mega-package.

davidavdav commented 6 years ago

I suppose it would be better if MFCC were moved into JuliaDSP, for tighter integration with DSP and longer-term maintenance. What do you guys think about this, and how would we proceed to do that?

ssfrr commented 6 years ago

I think moving it into JuliaDSP seems pretty reasonable, MFCCs are pretty important (even outside of speech, e.g. music processing), so it's nice to have that stuff in an org rather than a personal repo.

ssfrr commented 6 years ago

If other folks are onboard I think the process is roughly:

1. Transfer ownership of the repo to the org through the repo settings.
2. Update the repo URL in METADATA (GitHub will redirect the original URL appropriately, but it's nice to have the right one there).

Incidentally, JuliaAudio could also be a reasonable org for this to live in, though that family of packages is somewhat more opinionated w.r.t. samplerate-aware buffer and stream types, so the package might need some minor refactoring to interoperate nicely with them.

davidavdav commented 6 years ago

Could you add me to JuliaDSP then? I tried to transfer ownership, but that didn't work as I was not allowed.

ssfrr commented 6 years ago

I'm actually not a JuliaDSP member either, so I can't add you

davidavdav commented 6 years ago

All right, MFCC.jl is now part of JuliaDSP.
