Ravenstine / airsocket

Send/receive messages through an audio signal.
MIT License
17 stars 2 forks

duplex #1

Open mvayngrib opened 8 years ago

mvayngrib commented 8 years ago

very cool library!

is it possible to have full duplex communication if the two parties use different frequencies? Also, what's the maximum bytes/second you've achieved?

Ravenstine commented 8 years ago

@mvayngrib Theoretically, it is possible, though it is not something that I have yet tried. It would take some trial-and-error experimentation to find frequency pairs that clash the least. There's probably a way to determine this mathematically, but I don't know what it is; I arrived at the default frequencies because they gave the best accuracy for the least audibility (on my MacBook, anyway). Right now, I don't think a single AirSocket instance would support duplex, but creating your own instances of AirSocket.Encoder and AirSocket.Decoder could possibly allow you to do this.
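As a rough sketch of what doing it "mathematically" might look like (this isn't anything AirSocket does today, and all the numbers are made up): if the decoder analyzes blocks of N samples, frequencies that sit on distinct DFT bins (integer multiples of sampleRate/N) interfere with each other the least, so each party could take its own pair of bins:

```javascript
// Hypothetical sketch: pick two mark/space frequency pairs that fall on
// distinct DFT bins, so two simultaneous (duplex) signals clash minimally.
// Assumes each bit is analyzed over a block of `blockSize` samples.
function binAlignedPairs(sampleRate, blockSize, startBin) {
  const binWidth = sampleRate / blockSize; // Hz per DFT bin
  // Four consecutive bin-centered frequencies: two pairs, no shared bins.
  const freqs = [0, 1, 2, 3].map(i => (startBin + i) * binWidth);
  return {
    partyA: { space: freqs[0], mark: freqs[1] },
    partyB: { space: freqs[2], mark: freqs[3] },
  };
}

const pairs = binAlignedPairs(44100, 441, 180); // binWidth = 100 Hz
console.log(pairs.partyA); // { space: 18000, mark: 18100 }
console.log(pairs.partyB); // { space: 18200, mark: 18300 }
```

In practice you'd probably want the pairs spaced further apart than adjacent bins to survive real-world smearing, but bin alignment is the starting point.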

From memory, I think the fastest rate I was able to transmit at was around 150 bytes per second. But that is in no way definitive of what is possible, and my memory could be mistaken. Even when a higher speed is achievable, my experience is that far more errors occur over the air as the transfer rate increases, probably due to room acoustics and the like. I imagine that a closed system (like a telephone line) would mitigate that issue, and it's entirely possible that kilobytes per second could be achieved that way.

I'm on vacation at the moment, but would be interested in finding more definitive answers to these questions when I'm back.

mvayngrib commented 8 years ago

@Ravenstine thanks for responding. I just ran across quiet-js; have you seen it? Try out https://quiet.github.io/quiet-profile-lab/. With the ultrasonic-3600 setting it gets up to 3000 bps (375 bytes/second), which is pretty good. https://github.com/romanz/amodem claims to achieve 10 kB/s, which is amazing, but I haven't tested it, and it's Python; I'm interested in a JavaScript solution.

I haven't tested it with duplex yet, but it seems easy to try: "ultrasonic-3600" paired with "audible"

Ravenstine commented 8 years ago

@mvayngrib No, I haven't seen quiet-js, but that's pretty cool. I was actually considering moving some components over to C and compiling them with Emscripten.

I also haven't seen amodem (or at least I don't remember it). I only took a brief look at it, but my guess is that it's more complex than Airsocket; the messages (essentially packets) Airsocket sends are of a fixed size, and there's no concept of consecutive data packets. That's because I initially wrote this to send only SHA and MD5 hashes, which are always the same size, and it allowed me to keep things simple. I do want to build out Airsocket to send data of variable length in multiple packets, support a better checksum, possibly add error correction codes, etc. I even think the future would be to replace components like Goertzel.js, as I'm sure there are faster ways of demodulating signals.

Also, getting this to work on mobile devices is a primary goal, but it has been a major headache; I have barely been able to make it work on my Android phone.

mvayngrib commented 8 years ago

@Ravenstine we have the same goal: to get one of these solutions working on mobile. We're using React Native, which will let us use JavaScript projects on both iOS and Android, but we'll need access to sound production/recording APIs on each platform. (If you're not familiar with React Native: to access native functionality on iOS/Android, you need to create bridge modules for each platform.)

Ravenstine commented 8 years ago

@mvayngrib I'm not that familiar with React Native, though that sounds essentially like how Cordova plugins work. It's possible that will yield a better outcome than what I was able to achieve using the Web Audio API, as the latter provides limited access to the microphone input. There were only a handful of times that my phone (a Samsung Galaxy S5) was able to decode a signal, which suggests that either it's not using the correct sample rate or it's something to do with a microphone hardware setting. The audio analyzer app on my phone shows a clean spectrogram, so there's no good reason why AirSocket can't work on mobile.

If you are trying to achieve long-distance transmission, just be aware that reliably decoding a signal over an appreciable distance in a room has proven very difficult with the methodology I chose. From what I have read, modems look for shifts in a carrier frequency rather than assigning a specific frequency to each bit value; AirSocket does the latter because it was simple enough to achieve by analyzing small chunks of samples. I'm not sure whether this explains why distance and room acoustics so badly damage the integrity of a message, or whether it has more to do with how messages are sent (e.g. actual data packets, error correction techniques, etc.). On the other hand, if your goal is for two devices to be right next to each other, the problem I described is probably moot.
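For illustration, the assign-a-frequency-per-bit-value approach (binary FSK) looks roughly like this; the frequencies, bit duration, and sample rate here are made-up values, not AirSocket's actual defaults:

```javascript
// Illustrative FSK modulator: each bit becomes a short burst of either the
// "space" frequency (0) or the "mark" frequency (1). All parameter values
// here are placeholders, not AirSocket's real defaults.
function modulate(bits, { sampleRate = 44100, bitDuration = 0.01,
                          space = 18000, mark = 18500 } = {}) {
  const samplesPerBit = Math.round(sampleRate * bitDuration);
  const out = new Float32Array(bits.length * samplesPerBit);
  bits.forEach((bit, i) => {
    const freq = bit ? mark : space;
    for (let j = 0; j < samplesPerBit; j++) {
      const t = (i * samplesPerBit + j) / sampleRate; // time in seconds
      out[i * samplesPerBit + j] = Math.sin(2 * Math.PI * freq * t);
    }
  });
  return out;
}

const signal = modulate([1, 0, 1, 1]);
console.log(signal.length); // 4 bits * 441 samples = 1764
```

The decoder then only has to decide, for each window of 441 samples, which of the two frequencies carries more energy.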

Even if you don't end up using any parts of this project in your app, there's nothing about it that's completely married to browser APIs. The backbone of AirSocket consists of soundrive and goertzel.js, both of which are my projects as well; I wrote/used them for portability reasons. Soundrive is a good example of how to write your own oscillator with sweeps/eases, if you ever need to do that.

Ravenstine commented 8 years ago

Here are some early ideas I have on improving the protocol:

mvayngrib commented 8 years ago

@Ravenstine good ideas. Does quiet-js use the mark/space technique as well? The thing is, I think quiet-js has already solved most of these problems (not duplex comms, but maybe everything else). If you play around with that quiet-profile-lab, it shows frames as it receives them, and it definitely does error codes. Not to devalue this library, but I think if the goal is a proof of concept for duplex comms with decent bandwidth, quiet-js is our best bet. What do you think?

Ravenstine commented 8 years ago

@mvayngrib At this point, quite possibly. Especially if it currently works on mobile out of the box, whereas Airsocket still fails to decode messages on mobile Chrome 99% of the time. Libquiet is more complex than my library, which might mean faster and more consistent results, but not necessarily. ¯\_(ツ)_/¯

The frames shown in quiet-js aren't really indicative of whether frames are truly used in the decoding process; I think they are referring to the buffers that come in from the audio API. However, it appears that quiet may be using frames at some point: https://github.com/quiet/quiet/blob/master/src/demodulator.c#L16

They are also using a ring buffer, which might suggest that they are periodically iterating over a collection of samples to see whether a change from the carrier frequency is occurring, but I could be mistaken; my knowledge of C syntax isn't that good. Airsocket doesn't use a ring buffer, or at least not for that reason, though it's possible that using one when searching for a valid message (instead of shifting the array window) might yield a slight performance benefit.

Quiet looks for "mark" and "space" frequencies, but in a somewhat different sense than Airsocket does. My guess, from both reading about modems and scanning the Quiet code, is that there is a specified carrier frequency (analogous to the space frequency), and the mark frequency is a range of frequencies deviated from the carrier. I'm guessing that range of frequencies is represented by the "subcarriers" in this code, but that's really just a guess. It would make sense, as I suspect they are using an FFT to demodulate the audio signal, so you might want to look at a precise spread of frequencies once you have demodulated the whole spectrum.

The Goertzel algorithm, which Airsocket uses, works a bit differently in that you specify the frequencies of interest ahead of time, so the algorithm focuses on only those parts of the spectrum rather than the whole thing. However, it becomes more taxing to demodulate many frequencies, so I've stuck with watching two frequencies and checking which one has the higher energy value in a small window of time. What I would consider doing is specifying only the space (i.e. carrier) frequency and having the library pick two frequencies above or below it to represent the min-max frequency biases.

For example, if we had a space frequency of 500 Hz and a mark frequency of 600 Hz, but the signal we are receiving is 566 Hz, it should bias closer to the 600 Hz "bucket", so a signal should be decodable without needing to look at a wide range of frequencies. The combination of the Goertzel algorithm and running the decoder in a worker/thread seems to provide a reasonable balance between efficiency and code readability, at least IMO. Browsers do come with a native FFT now, but at the cost that I don't fully understand what's happening underneath, and feeding it raw sample values is less straightforward (since it's tightly coupled with the Web Audio APIs).
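To make that concrete, here's a rough standalone sketch of the Goertzel-based decision (the textbook algorithm, not the actual goertzel.js code), including the 566 Hz biasing example:

```javascript
// Standard Goertzel algorithm: energy of one target frequency in a block
// of samples, without computing a full FFT over the whole spectrum.
function goertzelPower(samples, sampleRate, targetFreq) {
  const k = Math.round(samples.length * targetFreq / sampleRate);
  const omega = (2 * Math.PI * k) / samples.length;
  const coeff = 2 * Math.cos(omega);
  let s1 = 0, s2 = 0;
  for (const x of samples) {
    const s0 = x + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

// Decide a bit by comparing energy at the space vs. mark frequency.
function demodulateBit(samples, sampleRate, space, mark) {
  return goertzelPower(samples, sampleRate, mark) >
         goertzelPower(samples, sampleRate, space) ? 1 : 0;
}

// A 566 Hz tone biases toward the 600 Hz "mark" bucket, as described above.
const sampleRate = 44100, n = 4410;
const tone = Float32Array.from({ length: n },
  (_, i) => Math.sin(2 * Math.PI * 566 * (i / sampleRate)));
console.log(demodulateBit(tone, sampleRate, 500, 600)); // 1
```

Since only two frequencies are evaluated per window, this stays cheap enough to run in a worker, which is the trade-off described above.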

However, you may want to consider the comprehensibility of each codebase. If what you want is duplex, and you want to accomplish that with quiet-js, it means you are going to have to understand the codebase for quiet and write the duplex code in C, since quiet-js is merely an Emscripten wrapper around that library. If that's not a problem, then totally go for it. Airsocket, in contrast, was written to be very minimalistic and simple to understand, and in JavaScript (OK, well, CoffeeScript, but that may change to ES2015). The only component of Airsocket I'd consider moving to C is the demodulator, as that's a very decoupled component and the decoding logic would be left alone. With quiet, most of the logic involved with encoding and decoding is in the C codebase. Again, if you have experience with or a passion for C, that probably isn't an issue.

Also, I just discovered one thing about Web Audio in quiet-js that I wasn't aware of: there are vendor-prefixed options for preprocessing input, like googNoiseSuppression. https://github.com/quiet/quiet-js/blob/master/quiet.js#L351-L385 Very interesting stuff.

Ravenstine commented 8 years ago

For some background, the "mark" and "space" terminology comes from modems that do look at specific frequencies: https://en.wikipedia.org/wiki/Bell_202_modem

It seemed appropriate for a library that also has two distinct frequencies.

derhuerst commented 7 years ago

Hey!

I knew about libquiet & quiet-js before starting my own pet project, ultrasonic-transport. I considered them to be too complex, as I want JS modules to be simple enough that you can read through the code. That's why I decided to just naively dive into the topic and build something.

Having come across airsocket, I think we could join our efforts to create a simple, portable sonic transport layer. Thanks for the work you've done! I came up with the following thoughts:

Ravenstine commented 6 years ago

@derhuerst I regret not coming across your comment until now. If you still have any interest around such a project, I'd love to discuss it.

I agree with an implementation that supports both the browser and Node.js. I'm in the early stages of rewriting AirSocket to not be so coupled with the Web Audio API. I would still want there to be a WebSocket-like library that's as easy to use and uses Web Audio, but the underlying demodulating/decoding implementation could be used on its own.

Audiojs sounds like a great idea! I hadn't seen that project before.

Streams are an interesting idea. I'll have to give that some thought.

Here's what I've been thinking about so far:

derhuerst commented 6 years ago

> I regret not coming across your comment until now. If you still have any interest around such a project, I'd love to discuss it.

Yeah, sure! I also haven't done anything in the past months on this topic, but I'm motivated.

> I'm in the early stages of rewriting AirSocket to not be so coupled with the Web Audio API.

Can you share your efforts?

> I would still want there to be a WebSocket-like library that's as easy to use and uses Web Audio, but the underlying demodulating/decoding implementation could be used on its own.

Yeah. The common way to achieve this is to split it into a low-level module and a high-level one (a good example might be leveldown and levelup).

> Audiojs sounds like a great idea! I hadn't seen that project before.

There are some interesting subprojects, although many of them are half-done. I'm currently watching audiojs/audio-speaker#44 and will have a look at audiojs/web-audio-write soon.

> It would be sweet if "sonic peers" could detect corrupted frames (from collisions or noise) and request that frames be resent. This also requires a more complicated frame format or a complicated underlying protocol sent as packets in the data portion of each frame.

Being able to detect and correct "flipped bits" would definitely belong into this lib, but don't you think re-requesting lost packets is a matter of another lib on top? Isn't this basically TCP?

Ravenstine commented 6 years ago

I'll work on making a new branch, although my code isn't exactly useful yet. But I've been kicking around some ideas about the frame format.

The simplest format I could think of is along this line:

| Preamble | Length | Data | CRC-24 |
| --- | --- | --- | --- |
| 3 bytes | 2 bytes | * | 3 bytes |

~9 microseconds

Length could be eliminated with the padding method I mentioned earlier, which seems worth doing. It would also provide an integrity check if those extra bits, when XOR'd with the rest of the byte, always equate to zero. Then again, I really don't know how to deal with a situation where the delimiter for the end of the frame is missed... the buffer could build up until memory runs out or another delimiter is sent.
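As a rough sketch of encoding that layout (the preamble bytes are arbitrary placeholders, and I've used the OpenPGP CRC-24 polynomial just as one possible choice, not a decided one):

```javascript
// Sketch of the proposed frame layout: Preamble | Length | Data | CRC-24.
// This CRC-24 is the OpenPGP variant (poly 0x864CFB, init 0xB704CE),
// picked here only as an example of a 3-byte checksum.
function crc24(bytes) {
  let crc = 0xb704ce;
  for (const b of bytes) {
    crc ^= b << 16;
    for (let i = 0; i < 8; i++) {
      crc <<= 1;
      if (crc & 0x1000000) crc ^= 0x1864cfb;
    }
  }
  return crc & 0xffffff;
}

function encodeFrame(data) {
  const frame = new Uint8Array(3 + 2 + data.length + 3);
  frame.set([0xaa, 0xaa, 0x7e], 0);     // 3-byte preamble (placeholder bytes)
  frame[3] = (data.length >> 8) & 0xff; // 2-byte big-endian length
  frame[4] = data.length & 0xff;
  frame.set(data, 5);
  const crc = crc24(data);              // 3-byte CRC over the data portion
  frame.set([(crc >> 16) & 0xff, (crc >> 8) & 0xff, crc & 0xff],
            5 + data.length);
  return frame;
}

const frame = encodeFrame(new Uint8Array([0x68, 0x69])); // "hi"
console.log(frame.length); // 3 + 2 + 2 + 3 = 10
```

Whether the CRC should also cover the length field is one of the open questions.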

I've been wondering if these frames need some form of addressing. Assuming transmission is going to be slow, adding 12 bytes for source/destination MAC addresses seems costly for minimalist transmission. Including addresses would at least make it easier for a listener to ignore frames that are of no interest to it. If a higher-level protocol like IPv6 provides sufficient addressing, then maybe it's not worth having addresses at the frame level.

If addressing were optional, then perhaps there could be a control bit signifying it.

| Preamble | Control Bit | Destination MAC | Source MAC | Data | CRC-24 |
| --- | --- | --- | --- | --- | --- |
| 23 bits | 1 bit | 6 bytes | 6 bytes | * | 3 bytes |
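Parsing that header might look something like this; I'm assuming the 23-bit preamble plus the 1-bit flag are packed into 3 bytes with the control bit as the lowest bit of the third byte, which is by no means settled:

```javascript
// Sketch of decoding the proposed header: a 23-bit preamble plus a 1-bit
// addressing flag packed into 3 bytes. Taking the control bit as the lowest
// bit of the third byte is an assumed packing, not a decided one.
function parseHeader(frame) {
  const hasAddresses = (frame[2] & 0x01) === 1;
  let offset = 3;
  let destination = null, source = null;
  if (hasAddresses) {
    destination = frame.slice(offset, offset + 6);  // 6-byte destination MAC
    source = frame.slice(offset + 6, offset + 12);  // 6-byte source MAC
    offset += 12;
  }
  return { hasAddresses, destination, source, dataOffset: offset };
}

// With the control bit clear, data begins right after the 3 header bytes.
const minimal = parseHeader(new Uint8Array([0xaa, 0xaa, 0x7e, 0x42]));
console.log(minimal.hasAddresses, minimal.dataOffset); // false 3
```

The nice property is that a minimalist transmitter never pays the 12-byte addressing cost unless it sets the bit.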

Totally open to any ideas on what direction to take.

> Yeah. The common way to achieve this is to split it into a low-level module and a high-level one (a good example might be leveldown and levelup).

Leveldown and Levelup are good analogies. I definitely agree there.

> Being able to detect and correct "flipped bits" would definitely belong into this lib, but don't you think re-requesting lost packets is a matter of another lib on top? Isn't this basically TCP?

It's mostly true that a layer like TCP would be solely responsible for packet loss. I was thinking about how an Ethernet device can tell whether there has been a collision and will rebroadcast the frame until a collision is not detected (though I was incorrect in assuming such a mechanism requires a form of request from the receiver). Then of course there are also things like jam signals, frequency hopping, etc. But you're right that such functionality isn't specific to the frame structure, and packet loss would be handled at a higher level. I suppose a retry mechanism isn't a requirement, but it would probably be a nice-to-have for reliability in the most minimal use case, where someone wants to send unstructured data in a frame.

By the way, I am by no means formally schooled in networking, so forgive me if my terminology is incorrect or if my basic understanding doesn't take me far enough. :)