devongovett opened 8 years ago
Sounds good! Too bad whatwg/streams is not a very safe bet yet, but it might be a good idea to keep its design constraints in mind anyway, even as we're targeting node.js streams. WDYT about using flow for aurora.js(/av.js?)? What about distribution? i.e. do we distribute a separate package for untranspiled JS? A UMD package?
Where should we start? Maybe sketch out a draft of what the public API would look like and nail that down, then start implementing? Regarding that, maybe `WebAudioDevice` -> `WebAudioSink` instead, b/c device is somewhat misleading terminology we could get rid of while at it. ;)
Yeah I thought about whatwg streams, but I don't think they're ready yet, and node has a much larger ecosystem of compatible streams already available. My guess is that once whatwg streams are done, there will be compatibility layers bridging the two anyway since they aren't that different (at least in my brief reading).
I don't have a strong opinion on flow or another type system for that matter (e.g. typescript). Willing to be persuaded.
For distribution, maybe use browserify (or rollup?) for the default build, and `require('av/es6')` for the ES6 source files? Not sure. Rather not have two npm packages.
Agree on `WebAudioSink` instead of `WebAudioDevice`. Updated proposal.
I've started playing around with some of this already, actually :smile:. Currently working on extracting/updating the binary stream reading things. Have a proof of concept using node streams. Will publish to a branch soon. But yeah, let's work on a spec for the public API. Hopefully the high level stuff (e.g. player) doesn't have to change much.
> Yeah I thought about whatwg streams, but I don't think they're ready yet, and node has a much larger ecosystem of compatible streams already available.
Agreed.
> I don't have a strong opinion on flow or another type system for that matter (e.g. typescript). Willing to be persuaded.
So far every time I've used flow there's been a crucial (for me) feature missing, so I haven't used the type checker much, but I've found the annotations are a great form of documentation, both for reading the code and for generating documentation from it. With all that in mind, I think they're a low-cost (especially given how simple it is to configure Babel to strip them) / medium-value addition. If we were to actually use the type checker as well, I'd consider them high value even, and that might be doable if we otherwise constrain ourselves to pure ES2015, although I'm not sure how well flow plays together with emitting events and such.
> For distribution, maybe use browserify (or rollup?) for the default build, and `require('av/es6')` for the ES6 source files? Not sure. Rather not have two npm packages.
I personally like lodash's model of distribution: the `lodash` package contains everything, and the default entry point is also a big module that contains all the things, but you can also import the stuff in it directly, i.e. `lodash/trim`, or even from its own package, i.e. `lodash.trim`. I can help with building the tooling so we can do the same thing, if we want to. I can also help with other tooling like generating documentation, etc. :)
As for the separate package, it's a tradeoff I don't feel strongly about either way. However, it's worth noting that from the aurora/AV users' point of view, including the ES6 sources in the same npm package as the transpiled sources offers no benefit to users of either while increasing the package size for both.
I've started playing around with some of this already, actually 😄
💃
Extracted the binary stream reading stuff into `stream-reader`. Mostly exactly the same as the code in aurora, but converted to ES6 using decaffeinate. Docs etc. coming.
The main change is that I dropped the `AV.Buffer` wrapper, and `BufferList` is just a linked list of `Uint8Array`s. This is made possible by the use of ES6 symbols for the previous and next pointers, which make it possible for the buffers to be in more than one `BufferList`.
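A minimal sketch of that pointer trick (not the actual stream-reader code; names are illustrative):

```javascript
// Each BufferList allocates its own unique Symbol keys for the prev/next
// pointers, so the same Uint8Array can be linked into several lists at
// once without the pointer properties colliding.
class BufferList {
  constructor() {
    this.prevKey = Symbol('prev');
    this.nextKey = Symbol('next');
    this.first = null;
    this.last = null;
    this.length = 0; // total bytes appended
  }

  append(buffer) {
    buffer[this.prevKey] = this.last;
    buffer[this.nextKey] = null;
    if (this.last) this.last[this.nextKey] = buffer;
    else this.first = buffer;
    this.last = buffer;
    this.length += buffer.byteLength;
  }
}
```

Because the Symbols are per-list, appending a buffer to a second list never disturbs the links of the first.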
A couple features to propose:

- Make `BufferList` a writable node stream, so you can pipe to it. This will have the effect of managing back pressure automatically when you read from a `Stream` wrapping the `BufferList`.
- A `maxTailBytes` (not sure of the best name) option on `BufferList` to specify the number of bytes to keep in the list after advancing. Currently all buffers appended to the list remain in the list after you read them (to support rewinding). This can cause excessive memory use. There should be a limit to how far you can rewind by default, specified by the `maxTailBytes` option.

Looks like the `stream-reader` name is already taken on npm, though it's a pretty old, unmaintained library. So we'll either need a different name, or convince the owner to give us the package. :/
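The `maxTailBytes` trimming could be sketched like this (a toy array-based version; the real one would presumably operate on the linked list):

```javascript
// Toy sketch of the maxTailBytes idea: buffers behind the read position
// are kept around for rewinding, but only up to maxTailBytes of them.
class BufferList {
  constructor({ maxTailBytes = Infinity } = {}) {
    this.maxTailBytes = maxTailBytes;
    this.buffers = []; // a plain array instead of a linked list, for brevity
    this.index = 0;    // which buffer is currently being read
  }

  append(buffer) {
    this.buffers.push(buffer);
  }

  // total bytes behind the read position (i.e. available for rewind)
  get tailBytes() {
    let bytes = 0;
    for (let i = 0; i < this.index; i++) bytes += this.buffers[i].byteLength;
    return bytes;
  }

  // advance to the next buffer, then drop old buffers past the rewind limit
  advance() {
    this.index++;
    while (this.tailBytes > this.maxTailBytes) {
      this.buffers.shift();
      this.index--;
    }
  }
}
```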
We could also (try to, might already be reserved) use a namespace: `@av/stream-reader` - might be even better, as stream-reader is quite a generic name.
Well, it is a pretty generic module. That's why we're breaking it out. :smile:
Generic, yes, but the name could be more specific: it doesn't deal (directly) with the same streams that node users would expect, and it deals with a very specific type of stream (raw binary data).
Pushed an initial implementation of the streams stuff to the v2 branch in b0c69cd69a97ffda85b42867b4f02e47096b7e86. Still lots to do, but please feel free to leave comments. A few notes:

- I initially implemented `Decoder` as a `Transform` stream subclass, but it was less performant than I wanted. This was because transform streams expect all of the input data in each chunk to be decoded at once. Because compressed media formats output data a lot more quickly than they take it in, this resulted in a lot of packets being decoded at once and therefore a lot of buffering. In order to spread the decoding out better over time, I made a custom transform stream by inheriting from `Duplex` instead.
- Renamed `readChunk` in the decoder to `decodePacket`, which is more descriptive. Currently it calls `readChunk` by default for backward compatibility.
- The `sbr` branch works with the new framework.

Committed some more things. See here for a good comparison of everything so far (without the noise caused by removing all the existing code).
The main thing was refactoring `Demuxer` to move the common logic of dealing with streamed data to the base class, rather than individual demuxers. Most of the demuxers had a big `while (stream.available(1))` loop, which is silly. Most media formats are structured as a series of discrete chunks, so it makes sense for the `Demuxer#readChunk` implementation to read a single chunk, rather than having to read all of the data available in the stream at once. `Demuxer#readChunk` will be called as necessary by the base class to consume data in the stream. If an underflow occurs, it will seek back to the last good offset, and try again when more data is available. This is the same behavior that decoders already have.
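That base-class driving loop might look roughly like this (a hypothetical sketch over a plain buffer; the real version reads from a stream):

```javascript
class UnderflowError extends Error {}

// The base class repeatedly asks the subclass to read one chunk; on
// underflow it rewinds to the last good offset and waits for more data.
class Demuxer {
  constructor() {
    this.buffer = Buffer.alloc(0);
    this.offset = 0;
  }

  // called whenever new data arrives from the source
  append(data) {
    this.buffer = Buffer.concat([this.buffer, data]);
    this.advance();
  }

  advance() {
    while (this.offset < this.buffer.length) {
      const lastGoodOffset = this.offset;
      try {
        this.readChunk(); // subclass reads exactly one chunk
      } catch (err) {
        if (err instanceof UnderflowError) {
          this.offset = lastGoodOffset; // rewind, retry on next append
          return;
        }
        throw err;
      }
    }
  }
}

// Example subclass: a made-up format consisting of fixed 4-byte chunks.
class FourByteDemuxer extends Demuxer {
  constructor() {
    super();
    this.chunks = [];
  }

  readChunk() {
    if (this.buffer.length - this.offset < 4) throw new UnderflowError();
    this.chunks.push(this.buffer.slice(this.offset, this.offset + 4));
    this.offset += 4;
  }
}
```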
Moved the mp4 demuxer here: https://github.com/audiocogs/mp4.js.
Hi, just came across this thread and wanted to say it fits very closely with what I'm trying to do. I want to make a flexible multi-track audio component in the browser. As well as playing single files that contain multiple tracks, like the mp4 test in v2, I also want to handle the situation where we have one file per track, and be able to load and play them in a synchronised way. Do you have any plans for supporting this scenario?
I'm trying to decide whether to base my code on the master aurora branch, or this v2 one. I have some basic things working from the master branch, but the changes in v2 sound good. Has there been any progress since July, or any plans to work on it?
Am I correct in understanding that at this time, the v2 test only works in the node environment, primarily because you haven't settled on a stream implementation to use in the browser?
@lukebarlow yeah I want to finish this, but I'm super busy and have a lot of projects I'm working on at the moment. In the browser, we'll use node streams as provided by browserify. I had started on a WebAudioSink in 40d1c9299e3ef72a60013c335df8a83f67462d04.
As for synchronizing multiple files (or even multiple tracks), I hadn't started on anything yet. That would probably be done by the Player class, or perhaps an intermediary. It needs to do things like handle tracks of varying sample rates and media types (e.g. sync video with audio).
Okay, no problem. Thanks for the speedy reply. Do you have any kind of test code which shows the WebAudioSink in action?
+1 for ES6. I would contribute to this project if it wasn't for the CoffeeStuff.
do you guys have any timeline for this?
👍 Been following this project for a while and would be happy to contribute in any capacity to this effort.
I've ported most everything to ES6, but I also made several changes to the structure, so I'm not sure how useful it would be moving forward. 3 tests are still failing, but I'm working on those and on getting it cleaned up and back to working order.
Some other decaffeinated aurora efforts here - https://github.com/alexanderwallin/multitrack-audio-element/tree/master/.idea/decaffeinated-aurora
Everything is passing now (except one odd M4A chapter test) and I've begun expanding the code coverage at this point.
@MatthewCallis interesting. I'd like to do more of a refactor here rather than a straight port to ES6, but that's good for now. Not sure when I'll have time.
@devongovett cool! I did it to learn more about the code base and my own codec was not like anything I had seen tackled yet (tracked music in proprietary formats) and what was needed to facilitate that easier. Here to help if you need it!
Aurora.js has been around since 2011, and since then the JS and web audio ecosystems have improved quite a bit. We got ES6, much better build tools, the Web Audio API was implemented cross browser, Node.js streams were invented, etc. Given these changes, I think it's time to modernize Aurora a bit. This is a proposal for the changes I'd like to make. Many of these would be breaking changes, so we would need to bump the major version.
Here's an overview:
ES6
I'd like to switch the codebase away from CoffeeScript, and convert it to ES6. CoffeeScript served us well, and paved the way for many of the features in ES6, but it hasn't been updated in a while and ES6 has largely superseded it. And with tools like Babel, we can use it everywhere without compatibility problems. Also, many more people are familiar with ES6/plain JS, so it will encourage more outside contribution. Therefore, I think it is time to move on from CoffeeScript.
Streams
When we started, Node.js was in its infancy, and real streams were not invented yet. We basically had event emitters. So, Aurora ended up building its own sort-of streams, which have problems. There is no back-pressure, so the source just reads as fast as possible. For large files, this means we buffer a lot into memory before it is needed.
New Node.js streams (introduced in node v0.10 around 2013) have support for back-pressure built-in, so when you pipe one stream to another, the source will automatically slow down or speed up depending on how fast the downstream readers are. They are also a standardized interface that many projects have adopted, so you can compose streams from different authors together very easily.
I'd like for Aurora.js to adopt Node streams across the board. This should be transparent for the browser as well, thanks to browserify. This means that many of the custom source classes that we have (e.g. file and http) can be removed since they exist in other projects (e.g. `fs.createReadStream` in node).

The one problem with Node streams is that they do not support seeking out of the box. Once you start a stream, you cannot easily jump to another part of it. For our purposes, I think an extension to readable streams for sources to support seeking would work. When seeking, we would flush the internal buffers of the demuxers and decoders, and then seek the source.
Multitrack
The npm module is called `av` (maybe we should consider changing the github project name to match?) since aurora was taken, but it is perhaps a better name since we may want to support video one day. In preparation for that, I think the `Demuxer` classes should be refactored to support multi-track media, e.g. video, audio, subtitles, etc. Here's the interface I'm proposing:

- The demuxer would emit `track` events with Track objects when they become available. It would continue to emit `metadata` events as well, but everything else would become part of the track.
- Each track would be a readable stream of packet data (as the whole `Demuxer` was before). Tracks would have a `type` (audio, video, etc.), `format`, `duration`, seek points, etc. as demuxers did before.

Here's an example of how you might use the new interface to play an audio track:
Modularize
Aurora.js core is already pretty small, but it could be smaller, and pieces could be made reusable by other projects. Here is what I'm proposing:

- Split out the `BufferList`, `Stream`, and `Bitstream` classes. I think the `AV.Buffer` class can be removed, and we can just use `Uint8Array`s or node `Buffer`s maybe.
- Split out the audio output devices; `node-speaker` does much of what we need on Node.js. Remove the Mozilla audio device, since they've implemented the Web Audio API in Firefox a while ago now.

So what would be left in Aurora core?
Web Audio
Currently, Aurora.js uses the Web Audio API for playback in browsers, but it creates its own AudioContext, so it's hard to integrate with more complex setups where you want to do audio processing on the decoded output from Aurora. I'd like to make it possible to use Aurora as just another node in the Web Audio API graph. I propose splitting the current `WebAudioDevice` into two pieces; one of them would be a `WebAudioStream` that you can `connect` to other nodes in the graph. It would do the resampling and format conversions necessary to get raw data into web audio.

Here's an example showing how you might connect a decoder to a web audio graph and do some further processing:
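A browser-only sketch of the idea (`WebAudioStream` and the `decoder` wiring are hypothetical here, not a finished API):

```javascript
// assume `decoder` is the readable output of an Aurora decoding pipeline
const context = new AudioContext();

// a web audio node fed by the decoder's output (hypothetical class)
const source = new WebAudioStream(context, decoder);

// ...further processing in the graph, e.g. a lowpass filter
const filter = context.createBiquadFilter();
filter.type = 'lowpass';
filter.frequency.value = 1000;

source.connect(filter);
filter.connect(context.destination);
```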
WebAudioStream could live in a separate module, since it might be useful to other projects.
Backward Compatibility
I'd like to cause as few changes as possible to the existing demuxers and decoders out there in the world. The interface to stream/bitstream reading would remain the same, and that's the biggest surface area used by plugins. It's pretty easy to switch from emitting data to writing to a track in the demuxers. And the decoders should work exactly the same way. We would get rid of the `AV.Base` class, which was our class abstraction for plain JS before, so rather than `AV.Demuxer.extend`, we'd either need to switch to using ES6 classes (preferred), or just use prototypes.

Conclusion
Overall, I think the changes I described above would modernize the framework quite a bit, and make it easier to use, and contribute to. It would also make the core considerably smaller, and make our code more easily reusable by other projects. This is obviously a large project, and it wouldn't be done overnight, but I think it's a good direction to go in. Please let me know what you think!