protobufjs / protobuf.js

Protocol Buffers for JavaScript & TypeScript.

feature request: asynchronous decode #345

Closed idupree closed 7 years ago

idupree commented 9 years ago

I'm sending large protobuf files[0] to a client with a so-so CPU[1]. Calling decode() on the arraybuffer received by AJAX takes around 1/3 second in my smaller test cases, which causes visible UI lag. It would help if there were an async version of decode() that broke its decoding into smaller pieces separated by setTimeout(..., 0) and called a callback when it finished. I don't need the parsing to be super fast, just the UI to stay responsive. I can provide more details if it'd be helpful.

[0] 0.1 to 1 MB [1] a Chromebook with an ARM CPU
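A minimal sketch of the kind of chunked decoding described above, assuming the work can be split per record. The names `decodeOne` and `chunkSize` are illustrative, not part of the protobuf.js API:

```javascript
// Process `items` in chunks of `chunkSize`, yielding to the event loop
// between chunks via setTimeout(..., 0) so the UI can repaint.
// `decodeOne` stands in for whatever per-record decoding is needed.
function decodeAsync(items, decodeOne, chunkSize, done) {
  const results = [];
  let i = 0;
  function step() {
    const end = Math.min(i + chunkSize, items.length);
    for (; i < end; i++) {
      results.push(decodeOne(items[i]));
    }
    if (i < items.length) {
      setTimeout(step, 0); // yield so pending UI work can run
    } else {
      done(results);
    }
  }
  step();
}
```

The trade-off is latency: each setTimeout boundary adds scheduling overhead, so the chunk size has to balance total decode time against per-frame responsiveness.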

dcodeIO commented 9 years ago

Is it possible for you to profile the decoding step on the client, finding out what parts of the library are the bottleneck in your specific setup? Before thinking about the addition of asynchronous methods for performance reasons, we should probably first try to make what we have as fast as possible.

Could, for example, be related to https://github.com/dcodeIO/bytebuffer.js/issues/60

idupree commented 9 years ago

I just ran the Chrome profiler on it. I'm not great at reading the profile but I think the Unicode decoding is only taking up about 10% of the decode time. (Which sounds reasonable - I'm decoding a GTFS-realtime vehicle location feed, which has a lot of both numbers and short strings (1 to 8 ASCII characters).) No function really stands out as having a lot of self CPU usage, although the nested function calls from ProtoBuf.Reflect.{Message,Field,Element}Prototype.decode collectively take up about half the self time. (About 2/3 of that is in ProtoBuf.Reflect.MessagePrototype.decode, adding up to 1/3 the total self time usage.)

I'm not seeing anything obvious that could give me the factor-of-10-to-100 performance improvement I'd ideally need to prevent UI latency. Though of course any improvement helps.

I looked into running decode() in a Web Worker, but apparently there's no way to send data from a web worker to the main thread besides postMessage(string). [EDIT: [1]] The string would then need to be parsed in the main thread, defeating the purpose of parsing in a web worker (I found lots of threads on the internet lamenting that JSON.parse() doesn't have an async API). I could post a zillion separate messages each with one record of data (sadly in string form), which seems kind of roundabout. The main thread needs to draw something on SVG and/or canvas for each record, which can't be done in a web worker, so the data does need to get to the main thread somehow. (Side note: Chrome is amazingly efficient at displaying SVGs with tens of thousands of elements in them.)

(Something that might be interesting with an async decode implementation -- though I have no idea if this even makes sense for protobufs -- is being able to decode data incrementally as it arrives, and being able to use that data before the complete payload has arrived.)

[1] MDN says postMessage can take structured data, as well as [in every browser but IE] transfer ArrayBuffers, but other internet sources don't think that's an option. Apparently postMessage with structured data is/was partially broken in many browsers. Nobody has done a thorough test and census of which browsers are broken in what ways. This is the best info I've found: https://github.com/Modernizr/Modernizr/issues/388#issuecomment-32247814

rektide commented 8 years ago

Really interesting notes on postMessage + ArrayBuffers. ArrayBuffers are definitely meant to be transferable (the other noted transferable type being MessagePort), so it'd be good to start collecting bug reports against the browsers where this functionality isn't in tip-top shape. Sounds like a super interesting project! Best of luck, and I hope web-worker offloading works out for you and makes it faster.

brianchirls commented 8 years ago

:+1: for incremental loading as data arrives as well as breaking the process up into asynchronous chunks. It would be useful to be able to defer any further parsing if either there isn't enough data buffered or if enough time has passed since the last frame render. That way you could process incoming data while maintaining a required frame rate for UI updates.

Here's an example of such an approach, using similar utf8 unpacking code (not proto buffers) for loading 3D models. This doesn't check for time elapsed, but it wouldn't be hard to add. https://github.com/mrdoob/three.js/blob/master/examples/js/loaders/UTF8Loader.js#L544
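A rough sketch of such a time-budgeted loop. The `parseNext` callback is a hypothetical stand-in for pulling one record out of whatever data has been buffered so far; it is not a protobuf.js function:

```javascript
// Parse records until `budgetMs` has elapsed, then yield and resume
// on the next timer tick, so UI rendering can keep up.
// `parseNext` returns one parsed record, or null when the buffer is drained.
function parseWithBudget(parseNext, budgetMs, onRecord, onDone) {
  function step() {
    const start = Date.now();
    let record;
    while ((record = parseNext()) !== null) {
      onRecord(record);
      if (Date.now() - start >= budgetMs) {
        setTimeout(step, 0); // budget spent; give the UI a frame
        return;
      }
    }
    onDone();
  }
  step();
}
```

A real implementation would also handle the "not enough data buffered yet" case by returning early from `parseNext` and resuming when more bytes arrive, as the comment above suggests.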

dcodeIO commented 7 years ago

Closing this for now.

protobuf.js 6.0.0

Feel free to send a pull request if this is still a requirement.