Closed jessa0 closed 1 year ago
PTAL if ya wanna, @michaelkirk
To help me understand better, can you clarify what you expect the entry point into this module to be for integrators? Or is there an example integration somewhere already?
Ah, I meant to add an example to the documentation, but forgot. libsignal
is going to use this library via the Sanitizer
methods, and will look something like how the new mp4san::sanitize()
function is implemented, but in Java/Swift. Users who can just pass an io::Read + io::Seek
can just use mp4san::sanitize()
directly.
can you clarify what you expect the entry point into this module to be for integrators?
[ ...]
libsignal is going to use this library via the Sanitizer methods, and will look something like how the new mp4san::sanitize() function is implemented, but in Java/Swift.
Ok yes, that was my confusion - so mp4san::sanitize()
is sort of vestigial at this point, in that no client will use it directly, but probably it's helpful to keep it as an example for integrators and for testing.
A notable alternative to the
SanitizerState
API and having to track skipped bytes would be to makemp4san::sanitize()
async, but that would require us to heavily modify or re-implement themp4
crate, as it is currently only synchronous.
This thought might be worth revisiting, in light of #4. Though going async might end up making #5 more complicated?
I'm gonna go ahead and drop this PR, as dropping the mp4
crate (see issues #4 / #5) means we don't have to implement the complicated buffer and "skipped input tracking" logic. I also believe, after going through the experiment of writing this PR, that we'd be better off implementing the callbacks in libsignal than implementing a callback-less API.
This PR adds an API to the sanitizer which has no callbacks or blocking, to accommodate libsignal's FFI bridge.
Libsignal's FFI bridge uses a system of conventions and Rust macros to expose objects and methods on those objects to Swift, Java, and Node. There are are small number of calls back into Swift, Java, and Node from Rust, which are bridged by hand using function pointers, which seems like we want to avoid doing for
mp4san
if possible as it seems error-prone. The currentmp4san
API on master contains callbacks in the form of thestd::io::Read
andstd::io::Seek
traits, so this PR aims to eliminate those by instead 1) allowing the caller to feed data chunks in directly, and 2) requesting the caller to perform seek operations in its return value,SanitizerState
. This is only meant to serve as a low-level API for the FFI bridge. The highel-level API exposed from thelibsignal
Java lib, for example, would look more like:However, most the complexity anyway in this PR lies in tracking "skipped" input (i.e. input which we want to allow the caller to skip or seek over since we don't need it), which we would have to implement in any case in order to facilitate sanitizing a streaming input, which happens for example in some methods of sending media through Signal on Android. This is encapsulated in the
buffer
module. Currently onlymdat
andfree
boxes are signaled to the caller to be skipped over.A notable alternative to the
SanitizerState
API and having to track skipped bytes would be to makemp4san::sanitize()
async, but that would require us to heavily modify or re-implement themp4
crate, as it is currently only synchronous.