nodemcu / nodemcu-firmware

Lua based interactive firmware for ESP8266, ESP8285 and ESP32
https://nodemcu.readthedocs.io
MIT License

RFD - PCM audio support #1085

Closed devsaurus closed 8 years ago

devsaurus commented 8 years ago

The recent additions of sigma-delta modulation with #1000 and a precise µs timer in #1057 open up the potential for audio support. Apart from a Lua interface, just some glue code is required to combine both into a simple mono audio back-end.

What I envision is support for playing wav-like files over any of the GPIOs:

A few external filtering components convert the signal from the digital to the analog domain and connect to either a headphone jack or an active amplifier driving standard 4-8 Ω speakers. A quick 'n' dirty feasibility study is available in my pcm branch, complemented by some notes describing the external analog filter & amplifier and sample Lua code.

But before developing this into a PR, I'd like to check with the community whether such a module has a use case and is still within this project's scope. I'll follow up with API concept and architectural details once there's a strong indication that this functionality is considered to be useful.

devyte commented 8 years ago

If it were possible to add audio sampling (i.e. a microphone) in addition to this, then the ESP could be used for simple 2-way audio, which would be beyond awesome. If it's not possible, then this could still be useful. I have a friend who asked me if the ESP could play sounds; he wants to use it for fishing, of all things. I'm pretty sure there could be other uses.

mikewen commented 8 years ago

+1, I remember that the ESP8266 supports an I2S interface, but I could not find any code example.

jmattsson commented 8 years ago

What @devyte said!

I think it would be a challenge to make it run smoothly given the non-preemptive nature of the SDK (and shortage of RAM), but if it can be done it'd get a :+1: from me.

marcelstoer commented 8 years ago

Sounds exciting! How about a WeMos D1 mini with a stackable SD card shield that plays music autonomously when connected to speakers :smile:

devsaurus commented 8 years ago

Thanks for your inputs, guys.

Audio sampling: Amazing idea - I hadn't thought of it yet. Although analog-to-digital conversion with sigma-delta isn't as straightforward as generating analog audio, I'll look into it. Maybe the ADC would be useful here, let's see.

I2S: The ESP is said to have good hardware support for I2S, and there's the Mp3_Decode project by Espressif which can serve as a coding example. I haven't considered I2S so far for several reasons:

On the pro side, this solution offers the best audio quality and hardware streaming support.

Other audio solutions: Slightly out of scope, but there are nice mp3 players on the market, either standalone or controllable via UART. Pros: better audio quality and kind of cheap. Cons: low level of integration and still more expensive than some Rs & Cs plus an audio amplifier.

Real-time characteristics: Definitely the main challenge IMO - feeding file data in real time from SPIFFS to the audio back-end. I'm not yet sure whether this can be done at a 16 kHz sample rate, and I hope that the new tasking interface from #1061 can fill the gap. It was the major driver for opening the discussion before spending the effort to dig into this.

devsaurus commented 8 years ago

Had a deeper look into sourcing pcm data from files. This worked very well using the new task interface. The current implementation uses double buffering with two buffers of 1024 bytes each. Their size can probably be reduced further since the margins are quite big, as seen in the plots below.

Yellow channel traces audio signal. Green channel shows handshake timing between ISR and reader task.

[Images: overview, flash_reload]
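The handshake described above can be sketched as a plain-Lua simulation. This is only an illustration under assumptions: `isr_tick` stands in for the timer ISR, `reader_task` for the posted file reader task, and a generated string replaces the SPIFFS file. None of these names come from the pcm branch. Assuming 8-bit mono samples at 16 kHz, a 1024-byte buffer lasts 1024 * 1000 / 16000 = 64 ms, which is the time the reader task has to refill the idle buffer.

```lua
-- Simulation of double buffering between a consuming "ISR" and a
-- producing reader task. All names here are hypothetical.
local BUF_SIZE = 1024       -- bytes per buffer
local SAMPLE_RATE = 16000   -- samples per second, 8-bit mono assumed

-- How long one buffer lasts, i.e. the deadline for refilling the other.
local refill_deadline_ms = BUF_SIZE * 1000 / SAMPLE_RATE

-- Fake sample source holding 2.5 buffers worth of data.
local source = string.rep("\128", 2 * BUF_SIZE + 512)
local source_pos = 1

local function read_chunk()
  if source_pos > #source then return nil end
  local chunk = source:sub(source_pos, source_pos + BUF_SIZE - 1)
  source_pos = source_pos + #chunk
  return chunk
end

local bufs = { { data = nil, pos = 1 }, { data = nil, pos = 1 } }
local active = 1            -- index of the buffer the ISR reads from
local refill_requested = false
local refill_count = 0
local samples_played = 0
local drained = false

-- Reader task: refills the buffer that the ISR is not using.
local function reader_task()
  local idle = bufs[3 - active]
  idle.data = read_chunk()
  idle.pos = 1
  refill_requested = false
  refill_count = refill_count + 1
end

-- ISR: consumes one sample and swaps buffers when the active one is empty.
local function isr_tick()
  local buf = bufs[active]
  if buf.pos > #buf.data then
    buf.data = nil              -- mark the used buffer as empty
    active = 3 - active
    buf = bufs[active]
    if buf.data == nil then     -- the other buffer was never refilled
      drained = true
      return
    end
    refill_requested = true     -- handshake: ask the task for more data
  end
  -- A real ISR would write the sample to the sigma-delta output here.
  samples_played = samples_played + 1
  buf.pos = buf.pos + 1
end

-- Prime both buffers, then alternate between ISR ticks and the task.
bufs[1].data = read_chunk()
bufs[2].data = read_chunk()
while not drained do
  isr_tick()
  if refill_requested then reader_task() end
end
```

In this simulation the task runs immediately after the ISR requests data; on real hardware it would run some time later, and the plots above indicate how much of the 64 ms margin is actually used.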

TerryE commented 8 years ago

I am also interested in this sort of approach for other bit-banging drivers. Just an aside comment. But this is looking good :smile:

devsaurus commented 8 years ago

Yes, the concept of double buffering and filling them from a file reader task is quite generic. All specific logic is part of the ISR - feeding a DAC, or pushing patterns through GPIOs.

I don't yet have a final view on my implementation of the buffering stuff here, things are still moving. But the ingredients are clear: a data-producing task, a data-consuming ISR, and in between a FIFO/double-buffering scheme. Having a shared, generic solution for the latter should be feasible. But that's brainfood for a different issue.

devsaurus commented 8 years ago

Up to now I worked just on demonstrators to investigate certain feasibility aspects. The Lua API itself still needs to be settled. Your feedback was very helpful to rethink the overall structure.

pcm module

Play sounds through various back-ends. Supported hardware is sigma-delta (, I2S, XYZ).

pcm.new()

Initializes the audio driver.

Syntax

pcm.new(pcm.SD, pin)
pcm.new(pcm.I2S, arg1, arg2, ...)
pcm.new(pcm.XYZ, arg1, ...)

Parameters

pcm.SD use sigma-delta hardware

pcm.I2S use I2S hardware

Returns

Audio driver object.

Audio driver

Each audio driver provides the same control functions for playing sounds.

pcm.drv:close()

Stops playback and releases the audio hardware.

pcm.drv:on()

Register callback functions for events.

Syntax

pcm.drv:on(event[, cb_fn])

Parameters

Returns

nil

pcm.drv:play()

Starts playback.

Syntax

pcm.drv:play(rate)

Parameters

rate sample rate. Supported are pcm.RATE_1K, pcm.RATE_2K, pcm.RATE_4K, pcm.RATE_5K, pcm.RATE_8K, pcm.RATE_10K, pcm.RATE_12K, pcm.RATE_16K.

Returns

nil

pcm.drv:pause()

Pauses playback. A call to pcm.drv:play() will resume from the last position.

pcm.drv:stop()

Stops playback and releases buffered chunks.

dvv commented 8 years ago

I would vote pro:

devsaurus commented 8 years ago

Interesting input, thanks Vladimir!

introduce pcm.drv:play(chunk_as_string, rate, callback_fn(pcm.drv, event))

Will consider this for sure as it removes a lot of specific handling from the module. Adding a Lua call layer might slow down things, will check the timing impact later. I can think of the following events:

Why do you propose chunk_as_string? If the callback function returns the data as a string on the Lua stack then there'd be no need for chunk_as_string. Or am I missing a use case where this parameter is definitely required?

use driver type as second parameter to pcm.new(pin, typ)

My first sketch considered dedicated new() functions because each (future) driver might require different parameter sets. The sigma-delta needs to know the pin while I2S has a fixed pinning. Don't know which other info would be required to configure the I2S hardware.

pcm.drv:close() for pcm.drv_close()

Yes, that was a typo.

dvv commented 8 years ago

chunk_as_string meant we feed :play() with a string (not a table, e.g.), as it corresponds one-to-one to the unsigned byte stream accepted by the chosen format ("Raw, 8 bit unsigned format").

:new(pin, typ[[, specific], arguments]) would imho be consistent, with a dummy pin for I2S.

Callbacks: I would just report pause, stop and drain events, leaving it to the user to act on them. On drain one might want to feed more data, on pause accumulate/buffer/flush input, and on stop cease feeding the player and mark things for exit.

devsaurus commented 8 years ago

My current model requires the callback to feed data in time, before the internal buffers are drained. This is why I plan to distinguish between a data event and a drained event. The former is the request which has to be served as quickly as possible, while the latter is the indication that continuous streaming ceased due to a lack of data.

A simple example:

function pcm_cb(d, event)
    if event == "data" then
        return file.read()
    elseif event == "drained" then 
        print("file done")
        file.close()
    end
end

file.open("output_16k.u8", "r")

drv = pcm.new_sigmadelta(1)
drv:play(pcm.RATE_16K, pcm_cb)
devyte commented 8 years ago

@devsaurus I'm curious about your callback model, it's different than the other ones I've come across so far. E.g.: connections:

srv = net.createServer(blah)
srv:on("disconnection", onDisconnection)
srv:on("sent", onSent)

Applying that model to your interface, it would look like this:

function onData(d)
  return file.read()
end
function onDrained(d)
  print("file done")
  file.close()
end
drv = pcm.newsigmadelta(1)
drv:on("data", onData)
drv:on("drained", onDrained)
drv:play(pcm.RATE_16K)

Notice that the if-else logic for the event type in your callback is eliminated.
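To make the dispatch concrete, here is a minimal sketch of how an `on(event, cb)` registry could route the two events. `new_mock_driver` is a hypothetical pure-Lua stand-in, not the pcm module: it pulls data synchronously inside `play()`, whereas a real driver would request data from a timer-driven back-end.

```lua
-- Hypothetical stand-in for an audio driver, used only to show dispatch.
local function new_mock_driver()
  local drv = { callbacks = {}, bytes_played = 0 }

  -- Register (or clear, when cb is nil) the handler for an event.
  function drv:on(event, cb)
    self.callbacks[event] = cb
  end

  -- Ask the "data" handler for chunks until it returns nil,
  -- then notify the "drained" handler once.
  function drv:play()
    local on_data = self.callbacks["data"]
    while on_data do
      local chunk = on_data(self)
      if not chunk then break end
      self.bytes_played = self.bytes_played + #chunk
    end
    local on_drained = self.callbacks["drained"]
    if on_drained then on_drained(self) end
  end

  return drv
end

-- Usage: two chunks of fake samples, then end of data.
local chunks = { "abcd", "ef" }
local drained_seen = false

local mock = new_mock_driver()
mock:on("data", function(d) return table.remove(chunks, 1) end)
mock:on("drained", function(d) drained_seen = true end)
mock:play()
```

Each handler only deals with its own event, so the driver looks up the callback by name instead of the user code branching on an event argument.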

devyte commented 8 years ago

I just read somewhere that the onboard ADC could maybe do a 2.5 kHz sample rate. That would give a theoretical mic bandwidth of 1250 Hz, which I think is too narrow for a mic. I guess sampling with the onboard ADC could still be attempted to check whether that's true, but most likely an external ADC over I2C or something would be needed to make it viable. Still, an implementation could be pursued similar to the pcm proposal above: a frontend with different possible backend ADCs.
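The bandwidth figure follows from the Nyquist criterion: a sampled signal can only represent frequencies up to half the sample rate. Taking the quoted (and unverified) 2.5 kHz sample rate as given:

```latex
f_{\max} = \frac{f_s}{2} = \frac{2500\,\mathrm{Hz}}{2} = 1250\,\mathrm{Hz}
```

For comparison, telephone-quality speech is commonly sampled at 8 kHz, which gives about 4 kHz of bandwidth.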

devsaurus commented 8 years ago

@devyte Right, the example you gave for net is also used in mqtt and uart. A similar approach is found in wifi.sta.eventMonReg(), while other modules do callback registration with a single function like enduser_setup.start() and sntp.sync().

It appears that the on("event_name", cb_fn) pattern is the most common one. For sure this allows for a clearer separation between event handlers and eliminates the condition evaluation tree. Are there other pros? What would be the cons?

devsaurus commented 8 years ago

Regarding ADC I did a quick assessment of the obvious options in the meantime.

Up to now I don't see any promising approaches. My conclusion would change once there's an external solution which can be attached via an interrupt-driven or DMA-like interface.

devyte commented 8 years ago

@devsaurus Cons:

Pros:

On the other hand, a different approach could be used with one function per callback, i.e.:

drv:onDrained(onDrainedCallback)
drv:onData(onDataCallback)

or:

drv.onSent = onSentCallback
drv.onData = onDataCallback

This requires no string, but it does require one function per callback. I'm not sure what that means for lua under the hood, though, maybe strings are used to identify/lookup the function?

The first of the above is safer from a coding PoV, because the error checking is implicit and smaller than in your case.

The second is easier from an implementation PoV, because it doesn't require functions to be implemented, i.e.: the callbacks are just table entries. However, it's slightly more error prone (typos and such), with pretty much no diagnostics to detect them. Both are inconsistent with the rest of the callback setups elsewhere.
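The diagnostics trade-off between these two styles can be illustrated with a small sketch. Both objects below are hypothetical plain-Lua tables, not the pcm driver, and the misspelled name `onDraind` is deliberate.

```lua
-- Style A: one registration method per event.
local drv_a = { callbacks = {} }
function drv_a:onData(cb) self.callbacks.data = cb end
function drv_a:onDrained(cb) self.callbacks.drained = cb end

-- Style B: callbacks are plain table fields, nothing is validated.
local drv_b = {}

local function noop() end

-- A typo in style A calls a method that does not exist, so Lua raises
-- an error right at the registration site.
local ok_a = pcall(function() drv_a:onDraind(noop) end)

-- The same typo in style B just creates an unused field; the intended
-- slot stays empty and nothing reports it.
local ok_b = pcall(function() drv_b.onDraind = noop end)
local slot_still_empty = (drv_b.onDrained == nil)
```

Here `ok_a` is false while `ok_b` is true, which is the "implicit error checking" difference in practice.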

devyte commented 8 years ago

@devsaurus about a device on I2C, I've seen ESP projects with external I2C ADCs doing sampling rates of 40 kHz. Didn't look at the details tho... To be honest, 20 kHz would be pretty good for a mic, and I think we could probably get away with as low as 8 kHz.

devyte commented 8 years ago

Also, how about an ADC with SPI interface? They seem to be cheaper vs. I2C, I see a 4-channel one at USD$2.2 with sample rate of up to 200KSps, which is kind of overkill, of course. The I2C ones seem to go 6-12 bucks a pop. Needs more pins, of course...

devsaurus commented 8 years ago

Thanks for the detailed feedback on callbacks! I'm tending to switch to the :on("event") pattern despite its potential cons. In the end it's in line with most of the modules dealing with callbacks. The API sketch is updated accordingly.

Regarding the ADCs - do you have any links for future reference? I don't intend to rule out recording, but would leave this topic to a second iteration once audio generation is settled.

devyte commented 8 years ago

I found some cheaper I2C ones...

NCD9830: I2C, 8bit x 8ch, 2.5-70KSps, just over 3 bucks
ADC101C021: I2C, 10bit x 1ch, 189KSps, $2.52
MCP3004: SPI, 10bit x 4ch, 200KSps, $2.20

devyte commented 8 years ago

@devsaurus does this make any sense to you?

devyte commented 8 years ago

List of links to discussions of the ADC that I've come across (for future ref):
- Reliable audio adc timer-ing (same as link above)
- ADC??
- ADC is slow
- ADC sample rate
- Build in SAR ADC
- PHY_ADC_READ_FAST()

It seems some people claim it's possible to do fast sampling with the internal ADC, it's just that the wifi and task priorities make it unreliable. I also suspect that they're not using an efficient interrupt scheme. I was thinking along the lines of:
- Low-level fast ISR servicing the ADC. All it does is read the sample and stuff it in a buffer. Testing this by itself could provide a hint of what sampling rate could be accomplished, and what the impact would be for wifi in AP, STA, STATIONAP or NULL modes.
- High-level callback: once a buffer is full, a higher-level callback is called to take the buffer and propagate it upwards to Lua. This is not called directly, of course, but via a scheduled task or something.
- Double (triple?) buffers: once the ISR fills a buffer, it gets swapped with the other one (next one?) which is standing by empty, and the higher-level task for calling the callback gets created. That keeps the ISR lean and fast.
- What frequency would make sense for the higher-level callback? The lower the frequency, the bigger the buffers => heap.

Does that make any sense, or am I writing nonsense?

devsaurus commented 8 years ago

Time for a sign of life - most of the API sketch is implemented now in devsaurus/pcm. I'll continue more checks and clean-ups as time permits.

Phando commented 8 years ago

I have only ever used the online tool to make custom firmware. Is there a way to build devsaurus/pcm to include https support? I am very excited to play with some raw audio and the nodemcu.

TerryE commented 8 years ago

@Phando Joe, this isn't the right place to ask this sort of Q. Our support page gives you links to forums for this type of Q.

Phando commented 8 years ago

Thanks and sorry. Keep it up, the build service is great


guillermo22 commented 8 years ago

Hi everybody! I am working with an SPI ADC (MCP3201) to sample audio signals at 50 kHz. This works well when I send the data over WiFi via UDP using a tmr.alarm() timer. The problem is that this timer has a lower limit of about 1 ms per send, and as we know, audio signals need a much shorter interval. So I tried an infinite while loop instead, but then the data does not get sent. I saw #367, but I don't understand very well how to change the buffer size, I mean make it bigger, and finally send the audio data over WiFi.

nickandrew commented 8 years ago

@guillermo22, I think you should ask your question on the forum.

devsaurus commented 8 years ago

I'm closing this since the related PR is well into the review loop.

navin-bhaskar commented 7 years ago

Hi,

I know this is a closed issue but recently I was trying out the "play_file.lua" example code. But when I run the code, I get the following error message:

PANIC: unprotected error in call to Lua API (bad argument #1 to '?' (file.obj expected, got userdata))

I changed the line drv:on("data", file.read) to drv:on("data", file) after that, I do not see any error but nothing happens on the pin and also drained cb gets called real quick. Any reason as to why this might happen?

devsaurus commented 7 years ago

@navin-bhaskar see #1712 for the fix of this error. The example was updated on dev branch at https://github.com/nodemcu/nodemcu-firmware/blob/dev/lua_examples/pcm/play_file.lua.
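The error message suggests that the data callback is invoked with the driver object as its first argument, so a read function registered directly would receive the driver where it expects a file object. The following plain-Lua sketch reproduces that kind of mix-up with hypothetical mocks (`fake_file`, `fake_read`, `fake_drv`); it is an assumption about the failure mode, and the actual change made in #1712 is not reproduced here.

```lua
-- Hypothetical mocks, only to reproduce the argument mix-up in plain Lua.
local fake_file = { chunks = { "abc", "de" } }

-- Mimics a read function that accepts an optional file object as its
-- first argument and rejects anything else.
local function fake_read(fd)
  if fd ~= nil and fd ~= fake_file then
    error("file object expected")
  end
  return table.remove(fake_file.chunks, 1)
end

-- Mimics a driver that passes itself to the data callback.
local fake_drv = {}
function fake_drv:on(event, cb) self[event] = cb end
function fake_drv:request_data() return self.data(self) end

-- Registering the read function directly: it receives the driver object.
fake_drv:on("data", fake_read)
local ok_direct = pcall(fake_drv.request_data, fake_drv)

-- Wrapping it in a closure that ignores the driver argument.
fake_drv:on("data", function(_drv) return fake_read() end)
local ok_wrapped, chunk = pcall(fake_drv.request_data, fake_drv)
```

In the mock, the direct registration fails while the wrapped one returns the first chunk, which matches the symptom reported above.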

navin-bhaskar commented 7 years ago

Thanks! That worked.