Closed: devsaurus closed this issue 8 years ago
If it were possible to add audio sampling (i.e.: microphone) in addition to this, then the ESP could be used for simple 2-way audio, which would be beyond awesome. If it's not possible, then this could still be useful. I have a friend who asked me if the ESP could play sounds, he wants to use it for fishing of all things. I'm pretty sure there could be other uses.
+1, I remember that the ESP8266 supports an I2S interface, but I could not find any code example.
What @devyte said!
I think it would be a challenge to make it run smoothly given the non-preemptive nature of the SDK (and shortage of RAM), but if it can be done it'd get a :+1: from me.
Sounds exciting! How about a WeMos D1 mini with a stackable SD card shield that plays music autonomously when connected to speakers :smile:
Thanks for your inputs, guys.
Audio sampling: Amazing idea - I hadn't thought of it yet. Although analog-to-digital conversion with sigma-delta isn't as straightforward as generating analog audio, I'll look into it. Maybe the ADC would be useful here, let's see.
I2S: The ESP is said to have good hardware support for I2S, and there's the Mp3_Decode project by Espressif which can serve as a coding example. I haven't considered I2S so far for several reasons:
On the pro side this solution offers best audio quality and hardware streaming support.
Other audio solutions: Slightly out of scope, but there are nice mp3 players on the market, standalone ones or ones controllable via UART. Pros: better audio quality and kind of cheap. Cons: low level of integration and still more expensive than some Rs & Cs plus an audio amplifier.
Real-time characteristics: Definitely the main challenge IMO - feeding file data in real-time from SPIFFS to the audio back-end. I'm not yet sure if this can be done at a 16 kHz sample rate, and I hope that the new tasking interface from #1061 could fill the gap. It was the major driver for opening the discussion before spending the effort to dig into this.
Had a deeper look into sourcing PCM data from files. This worked very well using the new task interface. The current implementation uses double buffering with two buffers of 1024 bytes each. Their size can probably be reduced further since the margins are quite big, as seen in the plots below.
Yellow channel traces audio signal. Green channel shows handshake timing between ISR and reader task.
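As a rough cross-check of those margins, the time one buffer lasts follows from the buffer size and the sample rate. This is only a sketch: it assumes one byte per sample (8-bit mono), and the function name is made up for illustration.

```lua
-- Hypothetical helper: how long one buffer lasts during playback.
-- Assumes 1 byte per sample (8-bit mono PCM).
function buffer_duration_ms(buffer_bytes, sample_rate_hz)
  return buffer_bytes * 1000 / sample_rate_hz
end

-- A 1024-byte buffer at 16 kHz lasts 64 ms, so the reader task has
-- roughly that long to refill the idle buffer before an underrun.
print(buffer_duration_ms(1024, 16000))
```

Halving the buffer size halves this time budget, which is the trade-off behind reducing the buffers further.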
I am also interested in this sort of approach for other bit banging drivers. Just an aside comment. But this is looking good :smile:
Yes, the concept of double buffering and filling them from a file reader task is quite generic. All specific logic is part of the ISR - feeding a DAC, or pushing patterns through GPIOs.
I don't yet have a final view on my implementation of the buffering stuff here, things are still moving. But the ingredients are clear: a data producing task, a data consuming ISR, and in between a FIFO/double buffering scheme. Having a shared, generic solution for the latter should be feasible. But that's brainfood for a different issue.
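The producer/consumer handshake described here can be sketched as a plain-Lua simulation. This is not the module's implementation (the real ISR and reader task are C code in the firmware); every name below is hypothetical, and the "ISR" is just a function called in a loop.

```lua
-- Plain-Lua simulation of the double-buffering handshake (no NodeMCU APIs).
-- All names are hypothetical; the real ISR and reader task live in C.
function new_double_buffer(source, chunk_size)
  local db = {
    bufs = { "", "" },       -- two chunks of sample data ("" means empty)
    active = 1,              -- index of the buffer the consumer reads from
    pos = 1,                 -- read position inside the active buffer
    source = source,         -- function(n): up to n bytes, or nil at end of data
    chunk_size = chunk_size,
    drained = false,         -- set when the consumer ran out of data
  }

  -- Producer side ("reader task"): refill empty buffers, active one first
  -- so that data is always consumed in order.
  function db:fill()
    for _, i in ipairs({ self.active, 3 - self.active }) do
      if self.bufs[i] == "" then
        self.bufs[i] = self.source(self.chunk_size) or ""
      end
    end
  end

  -- Consumer side ("ISR"): return the next sample, or nil on underrun.
  function db:next_sample()
    if self.pos > #self.bufs[self.active] then
      -- Active buffer consumed: mark it empty and swap to the other one.
      self.bufs[self.active] = ""
      self.active = 3 - self.active
      self.pos = 1
      if self.bufs[self.active] == "" then
        self.drained = true  -- the producer did not refill in time
        return nil
      end
    end
    local sample = self.bufs[self.active]:byte(self.pos)
    self.pos = self.pos + 1
    return sample
  end

  return db
end

-- Usage: stream an 8-byte "file" through two 3-byte buffers.
local data, offset = "abcdefgh", 1
local function read_chunk(n)
  if offset > #data then return nil end
  local chunk = data:sub(offset, offset + n - 1)
  offset = offset + n
  return chunk
end

local db = new_double_buffer(read_chunk, 3)
local out = {}
db:fill()
while true do
  local s = db:next_sample()
  if not s then break end
  out[#out + 1] = string.char(s)
  db:fill()  -- in the firmware this would be a posted task, not an inline call
end
print(table.concat(out), db.drained)  -- abcdefgh  true
```

The consumer never blocks: it either gets a sample or reports an underrun, which is where a drained notification would originate.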
Up to now I worked just on demonstrators to investigate certain feasibility aspects. The Lua API itself still needs to be settled. Your feedback was very helpful to rethink the overall structure.
Play sounds through various back-ends. Supported hardware is sigma-delta (, I2S, XYZ).

pcm.new()
Initializes the audio driver.
Syntax:
pcm.new(pcm.SD, pin)
pcm.new(pcm.I2S, arg1, arg2, ...)
pcm.new(pcm.XYZ, arg1, ...)
Parameters:
- pcm.SD: use sigma-delta hardware
  - pin: 1~10, IO index
- pcm.I2S: use I2S hardware
  - arg1: ...
  - arg2: ...
Returns: Audio driver object.

Each audio driver provides the same control functions for playing sounds.

pcm.drv:close()
Stops playback and releases the audio hardware.

pcm.drv:on()
Register callback functions for events.
Syntax:
pcm.drv:on(event[, cb_fn])
Parameters:
- event: identifier, one of
  - data: callback function is supposed to return a string containing the next chunk of data.
  - drained: playback was stopped due to lack of data. The last 2 invocations of the data callback didn't provide new chunks in time (intentionally or unintentionally) and the internal buffers were fully consumed.
  - paused: playback was paused by pcm.drv:pause().
  - stopped: playback was stopped by pcm.drv:stop().
- cb_fn: callback function for the specified event. Unregisters previous function if omitted.
Returns: nil

pcm.drv:play()
Starts playback.
Syntax:
pcm.drv:play(rate)
Parameters:
- rate: sample rate. Supported are pcm.RATE_1K, pcm.RATE_2K, pcm.RATE_4K, pcm.RATE_5K, pcm.RATE_8K, pcm.RATE_10K, pcm.RATE_12K, pcm.RATE_16K.
Returns: nil

pcm.drv:pause()
Pauses playback. A call to pcm.drv:play() will resume from the last position.

pcm.drv:stop()
Stops playback and releases buffered chunks.
I would vote pro:
- pcm.drv:play(chunk_as_string, rate, callback_fn(pcm.drv, event)) instead of file/network/whatever flavors.
- pcm.new(pin, typ)
- pcm.drv:close() for pcm.drv_close()
Interesting input, thanks Vladimir!
> introduce pcm.drv:play(chunk_as_string, rate, callback_fn(pcm.drv, event))
Will consider this for sure as it removes a lot of specific handling from the module. Adding a Lua call layer might slow down things, will check the timing impact later. I can think of the following events:
- data: callback shall deliver further data
- stopped: by pcm.drv:stop()
- paused: by pcm.drv:pause()
- drained: stopped due to buffer underrun
Why do you propose chunk_as_string? If the callback function returns the data as a string on the Lua stack then there'd be no need for chunk_as_string. Or do I miss a use case where this parameter is definitely required?
> use driver type as second parameter to pcm.new(pin, typ)
My first sketch considered dedicated new() functions because each (future) driver might require different parameter sets. The sigma-delta needs to know the pin while I2S has a fixed pinning. Don't know which other info would be required to configure the I2S hardware.
> pcm.drv:close() for pcm.drv_close()
Yes, that was a typo.
chunk_as_string meant we feed :play() with a string (not a table, e.g.), as it corresponds one-to-one to the unsigned byte stream accepted by the chosen format ("Raw, 8 bit unsigned format").
:new(pin, typ[[, specific], arguments]) would imho be consistent, with a dummy pin for I2S.
Callbacks: I would just report pause, stop and drain events, leaving it to the user to act on them. In drain one might want to feed more data, in pause accumulate/buffer/flush input, in stop stop feeding the player and mark things for exit.
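To illustrate the one-to-one mapping between a Lua string and an unsigned 8-bit sample stream, here is a minimal plain-Lua sketch that packs sample values into a chunk string. Both function names are invented for this example, and the sine tone is just convenient test data.

```lua
-- Hypothetical helper: pack unsigned 8-bit samples (0..255) into a Lua string,
-- one byte per sample, matching a raw unsigned 8-bit stream.
function samples_to_chunk(samples)
  local bytes = {}
  for i, v in ipairs(samples) do
    bytes[i] = string.char(v)
  end
  return table.concat(bytes)
end

-- One period of a sine wave with n samples, centered on 128
-- (the midpoint, i.e. silence, in unsigned 8-bit PCM).
function sine_period(n, amplitude)
  local samples = {}
  for i = 1, n do
    local x = math.sin(2 * math.pi * (i - 1) / n)
    samples[i] = math.floor(128 + amplitude * x + 0.5)
  end
  return samples
end

local chunk = samples_to_chunk(sine_period(16, 100))
print(#chunk, chunk:byte(1))  -- 16 bytes; the first sample is the midpoint 128
```

A string built this way needs no further conversion before being handed to a player that expects raw bytes, which is the argument for strings over tables.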
My current model requires the callback to feed data in time, before the internal buffers are drained. This is why I plan to distinguish between a data event and a drained event. The former is the request which has to be served as quickly as possible, while the latter is the indication that continuous streaming ceased due to a lack of data.
A simple example:
    function pcm_cb(d, event)
      if event == "data" then
        return file.read()
      elseif event == "drained" then
        print("file done")
        file.close()
      end
    end
    file.open("output_16k.u8", "r")
    drv = pcm.new_sigmadelta(1)
    drv:play(pcm.RATE_16K, pcm_cb)
@devsaurus I'm curious about your callback model, it's different than the other ones I've come across so far. E.g.: connections:
srv = net.createServer(blah)
srv:on("disconnection", onDisconnection)
srv:on("sent", onSent)
Applying that model to your interface, it would look like this:
    function onData(d)
      return file.read()
    end
    function onDrained(d)
      print("file done")
      file.close()
    end
    drv = pcm.new_sigmadelta(1)
    drv:on("data", onData)
    drv:on("drained", onDrained)
    drv:play(pcm.RATE_16K)
Notice that the if-else logic for the event type in your callback is eliminated.
I just read somewhere that the onboard ADC could maybe do a 2.5 kHz sample rate. That would give a theoretical mic bandwidth of 1250 Hz, which I think is too narrow for a mic. I guess sampling with the onboard ADC could still be attempted to check whether that's true, but most likely an external ADC over I2C or something would be needed to make it viable. Still, an implementation could be pursued similar to the pcm proposal above: a frontend with different possible backend ADCs.
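The bandwidth figure follows from the Nyquist limit: a sampled signal can represent frequencies only up to half the sample rate. A one-line sketch (the function name is made up):

```lua
-- Nyquist limit: highest representable signal frequency for a sample rate.
function nyquist_hz(sample_rate_hz)
  return sample_rate_hz / 2
end

print(nyquist_hz(2500))  -- 1250 Hz, the figure quoted for the onboard ADC
```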
@devyte Right, the example you gave for net is also used in mqtt and uart. A similar approach is found in wifi.sta.eventMonReg(), while other modules do callback registration with a single function like enduser_setup.start() and sntp.sync().
It appears that the on("event_name", cb_fn) pattern is the most common one. For sure this allows for a clearer separation between event handlers and eliminates the condition evaluation tree. Are there other pros? What would be the cons?
Regarding ADC I did a quick assessment of the obvious options in the meantime.
system_adc_read() seems to be blocking until the next result is available (didn't check though), which would be a no-go for continuous sampling.
Up to now I don't see any promising approaches. My conclusion would change once there's an external solution which can be attached via an interrupt-driven or DMA-like interface.
@devsaurus Cons:
Pros:
On the other hand, a different approach could be used with one function per callback, i.e.:
drv:onDrained(onDrainedCallback)
drv:onData(onDataCallback)
or:
drv.onSent = onSentCallback
drv.onData = onDataCallback
This requires no string, but it does require one function per callback. I'm not sure what that means for Lua under the hood, though, maybe strings are used to identify/lookup the function?
The first of the above is safer from a coding PoV, because the error checking is implicit and smaller than in your case.
The second is easier from an implementation PoV, because it doesn't require functions to be implemented, i.e.: the callbacks are just table entries. However, it's slightly more error prone (typos and such), with pretty much no diagnostics to detect them. Both are inconsistent with the rest of the callback setups elsewhere.
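The typo concern with string-keyed registration can be reduced by validating event names at registration time. The following plain-Lua mock shows the idea; it is a hypothetical sketch and not how the module is implemented (the real pcm code is C), and the emit helper exists only to exercise the handlers.

```lua
-- Plain-Lua mock of an :on(event, cb) registry with event-name validation.
-- Hypothetical sketch; names and structure are assumptions for illustration.
local VALID_EVENTS = { data = true, drained = true, paused = true, stopped = true }

function new_driver()
  local drv = { callbacks = {} }

  -- Register cb for event, or unregister the handler when cb is omitted.
  function drv:on(event, cb)
    if not VALID_EVENTS[event] then
      error("unknown event: " .. tostring(event), 2)
    end
    self.callbacks[event] = cb
  end

  -- Invoke the handler for event, if one is registered, and return its result.
  function drv:emit(event)
    local cb = self.callbacks[event]
    if cb then return cb(self) end
  end

  return drv
end

local drv = new_driver()
drv:on("data", function() return "chunk" end)
print(drv:emit("data"))                      -- chunk
print(pcall(drv.on, drv, "drianed", print))  -- false, plus an error message
```

With this check a misspelled event name fails loudly when the handler is registered, whereas a misspelled table field would be accepted silently.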
@devsaurus about a device on I2C, I've seen ESP projects with external I2C ADCs doing sampling rates of 40 kHz. Didn't look at the details tho... to be honest, 20 kHz would be pretty good for a mic, and I think we could probably get away with as low as 8 kHz.
Also, how about an ADC with SPI interface? They seem to be cheaper vs. I2C, I see a 4-channel one at USD$2.2 with sample rate of up to 200KSps, which is kind of overkill, of course. The I2C ones seem to go 6-12 bucks a pop. Needs more pins, of course...
Thanks for the detailed feedback on callbacks! I'm tending to switch to the :on("event") pattern despite its potential cons. In the end it's in line with most of the modules dealing with callbacks. The API sketch is updated accordingly.
Regarding the ADCs - do you have any links for future reference? I don't intend to rule out recording, but would leave this topic to a second iteration once audio generation is settled.
I found some cheaper I2C ones...
- NCD9830: I2C, 8bit x 8ch, 2.5-70KSps @ just over 3 bucks
- ADC101C021: I2C, 10bit x 1ch, 189KSps, $2.52
- MCP3004: SPI, 10bit x 4ch, 200KSps, $2.20
List of links to discussions of the ADC that I've come across (for future ref):
- Reliable audio adc timer-ing (same as link above)
- ADC??
- ADC is slow
- ADC sample rate
- Build in SAR ADC
- PHY_ADC_READ_FAST()
It seems some people claim it's possible to do fast sampling with the internal ADC, it's just that the WiFi and task priorities make it unreliable. I also suspect that they're not using an efficient interrupt scheme. I was thinking along the lines of:
- Low-level fast ISR servicing the ADC. All it does is read the sample and stuff it in a buffer. Testing this by itself could provide a hint of what sampling rate could be accomplished, and what the impact would be for WiFi in AP, STA, STATIONAP or NULL modes.
- High-level callback: once a buffer is full, a higher-level callback is called to take the buffer and propagate it upwards to Lua. This is not called directly, of course, but via a scheduled task or something.
- Double (triple?) buffers: once the ISR fills a buffer, it gets swapped with the other one (next one?) which is standing by empty, and the higher-level task for calling the callback gets created. That keeps the ISR lean and fast.
- What frequency would make sense for the higher-level callback? The lower the frequency, the bigger the buffers => more heap.
Does that make any sense, or am I writing nonsense?
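The trade-off in the last point can be put into numbers. The helper below is a hypothetical sizing sketch: it assumes each ADC result is stored in a fixed number of bytes and that one buffer is handed to the callback per invocation.

```lua
-- Hypothetical sizing helper for the ISR/callback split described above.
-- Returns bytes per buffer and total bytes across all buffers.
function capture_buffer_bytes(sample_rate_hz, callback_rate_hz, bytes_per_sample, num_buffers)
  local samples_per_buffer = math.ceil(sample_rate_hz / callback_rate_hz)
  local per_buffer = samples_per_buffer * bytes_per_sample
  return per_buffer, per_buffer * num_buffers
end

-- 8 kHz sampling, 2 bytes per sample, double buffering:
print(capture_buffer_bytes(8000, 50, 2, 2))  -- 50 callbacks/s: 320 bytes each, 640 total
print(capture_buffer_bytes(8000, 10, 2, 2))  -- 10 callbacks/s: 1600 bytes each, 3200 total
```

Fewer callbacks per second mean less Lua overhead but proportionally more heap tied up in buffers, so the callback rate is bounded by the available RAM.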
Time for a sign of life - most of the API sketch is implemented now in devsaurus/pcm. I'll continue with more checks and clean-ups as time permits.
I have only ever used the online tool to make custom firmware. Is there a way to build devsaurus/pcm to include https support? I am very excited to play with some raw audio and the nodemcu.
@Phando Joe, this isn't the right place to ask this sort of Q. Our support page gives you links to forums for this type of Q.
Thanks and sorry. Keep it up, the build service is great
Hi everybody! I am working with an ADC that uses SPI communication (MCP3201) to process audio signals (50 kHz). This works well when I send the data over WiFi via UDP using a time.alarm(). The problem is that this timer has a lower limit of about 1 ms between sends, and as we know, audio signals need a much shorter interval. So I tried an infinite while loop, but then the sending does not happen. I saw #367, but I don't understand very well how to modify the buffer size, I mean how to make it bigger, and finally how to send the audio data over WiFi.
@guillermo22, I think you should ask your question on the forum.
I'm closing this since the related PR is well into the review loop.
Hi,
I know this is a closed issue but recently I was trying out the "play_file.lua" example code. But when I run the code, I get the following error message:
PANIC: unprotected error in call to Lua API (bad argument #1 to '?' (file.obj expected, got userdata))
I changed the line drv:on("data", file.read) to drv:on("data", file). After that, I do not see any error, but nothing happens on the pin and the drained callback gets called very quickly. Any reason as to why this might happen?
@navin-bhaskar see #1712 for the fix of this error.
The example was updated on the dev branch at https://github.com/nodemcu/nodemcu-firmware/blob/dev/lua_examples/pcm/play_file.lua.
Thanks! that worked.
The recent additions of sigma-delta modulation with #1000 and a precise µs timer in #1057 open up the potential for audio support. Apart from a Lua interface, just some glue code is required to combine both into a simple mono audio back-end.
What I envision is support for playing wav-like files over any of the GPIOs:
A few external components for filtering will convert from the digital to the analog domain and attach to either a headphone jack or an active amplifier driving standard 4-8 Ω speakers. A quick 'n' dirty feasibility study is available in my pcm branch, complemented by some notes describing the external analog filter & amplifier and sample Lua code.
But before developing this into a PR, I'd like to check with the community whether such a module has a use case and is still within this project's scope. I'll follow up with API concept and architectural details once there's a strong indication that this functionality is considered to be useful.