eclipse-theia / theia

Eclipse Theia is a cloud & desktop IDE framework implemented in TypeScript.
http://theia-ide.org
Eclipse Public License 2.0

Improve performance of JSON-RPC communication #10684

Closed tortmayr closed 2 years ago

tortmayr commented 2 years ago

Theia relies heavily on JSON-RPC for exchanging messages between frontend and backend processes. Messages are often routed multiple times through different processes (Theia frontend, Theia backend, plugin processes) therefore it is essential that the communication layer (i.e. JSON-RPC protocols) is implemented in an efficient and performant fashion.

Currently there is a lot of momentum on this topic to significantly improve the communication layer in both architecture and performance. We collected and analyzed the currently available information, which is distributed over numerous issues, pull requests and PoC branches. Based on that information we distilled a five-step development plan to coordinate the overall changes.

Analysis

Recently it has been reported that the performance of the Filesystem API for Theia plugins is significantly worse than the performance of its VS Code counterpart (https://github.com/eclipse-theia/theia/issues/9514). According to the issue creator, in this particular case the loading times increase by a factor of 100. After investigations led by @tsmaeder it was concluded that the performance issues are not directly related to the Filesystem API. The performance hit really occurs in the JSON-RPC communication layer when encoding large chunks of binary data (i.e. files).

Currently Theia has two different JSON-RPC message protocols in place. For Theia backend-frontend services vscode-ws-jsonrpc is used and for plugin communication a custom protocol implementation is used. Both encode the message data as strings and additionally wrap them into dedicated message objects. This is problematic because encoding binary data to strings is very inefficient, for example the message size alone already increases by a factor of 5. In addition, messages are often wrapped and encoded multiple times on the way to their destination.
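The cost of string-encoding binary data can be illustrated with a small sketch (illustrative only, not Theia's actual encoder; the factor of 5 cited above also includes the additional message-object wrapping):

```typescript
// Illustrative sketch (not Theia's actual encoder): serializing binary data
// as a JSON number array inflates the payload several times over, before any
// additional message-object wrapping is applied.
const binary = new Uint8Array(1024);
for (let i = 0; i < binary.length; i++) {
    binary[i] = i % 256;
}
const asJson = JSON.stringify(Array.from(binary));
// each byte becomes up to three decimal digits plus a comma in string form
const inflation = asJson.length / binary.byteLength;
```

With wrapping, escaping and repeated re-encoding along the message route, the effective overhead grows further.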

To increase performance, the JSON-RPC protocols need to be refactored.

The RPC protocols have been recently refactored to be independent of plugin concerns (https://github.com/eclipse-theia/theia/pull/8972) which means the code base is now in a good state for further improvements to the RPC protocol infrastructure.

In this process we should probably get rid of any usage of vscode-ws-jsonrpc. This library is a custom fork of vscode-jsonrpc to add web socket support and is no longer maintained. It has been concluded that the way to go forward is to replace it with an in-repo alternative instead of consuming an updated version or an alternative library (see also https://github.com/eclipse-theia/theia/pull/10265).

@tsmaeder has already worked on a first attempt to refactor the Theia Messaging RPC protocol to a write-through RPC framework (https://github.com/tsmaeder/message-rpc). This could serve as a base for any future efforts.

Note that another recent effort has been made to improve the communication layer by switching from simple websockets to socket.io (https://github.com/eclipse-theia/theia/pull/10514). However, strictly speaking this change only touches the transportation layer and does not directly affect the json-rpc communication layer on top.

Development plan

Based on the analysis we have identified the following tasks to increase the overall performance of messaging via JSON-RPC.

tsmaeder commented 2 years ago

I'm really looking forward to seeing this realized: while I looked at this problem for large messages, I suspect that everything might just get a little snappier ("Kleinvieh macht auch Mist" ;-)) with faster remote calls.

I would urge you to treat the stuff in https://github.com/tsmaeder/message-rpc with little respect: I believe the concepts are correct, but I would welcome a constructive tearing down of that code. In particular around connection lifecycle and error state handling.

tsmaeder commented 2 years ago

@tortmayr what's the way forward with this issue? Is there a plan to address it from your side?

MatthewKhouzam commented 2 years ago

This may be off topic, but this master's project may be interesting as a reference. https://pdfs.semanticscholar.org/2def/59816c77b7f73ffd47ea0c59a0fcea96e8d2.pdf

thegecko commented 2 years ago

I played with gRPC with a view to using it with Theia a few years ago

JonasHelming commented 2 years ago

> @tortmayr what's the way forward with this issue? Is there a plan to address it from your side?

Yes, we started to work on this

tortmayr commented 2 years ago

So after spending some time on this issue it’s time for the first update:

In the first step, we have set up a simple test infrastructure that allows us to measure the performance of RPC calls in the different deployment scenarios of Theia (Theia extension/plugin in browser or Electron applications). Since we are mostly interested in the performance when sending large messages, we are using the FileSystem API and measuring the time it takes to read files from the local file system. The test suite has been executed with the current master to establish a baseline and identify potential issues. In addition, it can be executed at any time during development to immediately see the impact of any improvements we make.

Second, we have integrated the message-RPC PoC provided by @tsmaeder into Theia's core extension messaging API. We are aware that the PoC is far from production ready and that a constructive code tear-down is needed. However, an early integration provides two advantages: first, we can see whether the proposed message protocol provides the expected performance improvements, and second, we now have a practical example which we can use for testing any further improvements made to the message-RPC PoC.

Performance Measurements

We have implemented a simple VS Code extension and a simple Theia extension that use the respective file system API (either vscode.workspace.fs or the FileService) to read files from the local filesystem. This allows us to measure and record the performance in the following execution contexts:

In addition, to identify whether there are issues in the FileService implementation of Theia itself, we added a complementary test for extensions that loads files by directly using the node fs API in the backend.

For testing 6 files of different sample sizes (1K, 5K, 50K, 500K, 5000K, 10000K) have been used. To avoid statistical errors each file loading test is executed multiple times (15) and the median of the measured reading time is calculated.
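The aggregation step described above can be sketched as follows (a sketch of the methodology, not the actual test-suite code):

```typescript
// Taking the median of repeated runs is robust against outliers such as
// JIT warm-up or GC pauses, which would skew a plain average.
function median(samples: number[]): number {
    const sorted = [...samples].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 === 1
        ? sorted[mid]
        : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

With 15 runs per file size, the median is simply the 8th-smallest measurement.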

Results

On the current master, executing the performance test suite (with an Intel® Core™ i7-10850H CPU) yields the following results:

Theia Plugin/VS Code Extension Performance:

| Test scenario | 1K | 5K | 50K | 500K | 5000K | 10000K |
|---------------|----|----|-----|------|-------|--------|
| VS Code  | 3 ms | 4 ms | 5 ms | 11 ms | 32 ms | 67 ms |
| Electron | 7 ms | 7 ms | 23 ms | 189 ms | 2034 ms | 4138 ms |
| Browser  | 12 ms | 9 ms | 27 ms | 226 ms | 2476 ms | 5025 ms |

| Browser vs. VS Code | 1K | 5K | 50K | 500K | 5000K | 10000K |
|---------------------|----|----|-----|------|-------|--------|
| Time   | 9 ms | 5 ms | 22 ms | 215 ms | 2444 ms | 4958 ms |
| Factor | 4.00 | 2.25 | 5.40 | 20.55 | 77.38 | 75.00 |

| Browser vs. Electron | 1K | 5K | 50K | 500K | 5000K | 10000K |
|----------------------|----|----|-----|------|-------|--------|
| Time   | 5 ms | 2 ms | 4 ms | 37 ms | 442 ms | 887 ms |
| Factor | 1.71 | 1.29 | 1.17 | 1.20 | 1.22 | 1.21 |


The results seem to be consistent with what has been reported in https://github.com/eclipse-theia/theia/issues/9514. The performance of reading binary files with Theia's file system implementation is significantly worse than what we experience in VS Code. The reading times seem to grow linearly with file size, so while the issue is negligible for smaller files, it becomes very prominent for files larger than 500K.

In the VS Code context the file content is sent once through a websocket connection, whereas in the Theia plugin context we are sending the file three times (backend → browser → backend → plugin host). So we'd expect that the lower bound for what we could reach in Theia is ~3x the VS Code times.

Theia Extension Performance:

| Test scenario | 1K | 5K | 50K | 500K | 5000K | 10000K |
|---------------|----|----|-----|------|-------|--------|
| Electron (FileService) | 5 ms | 3 ms | 8 ms | 71 ms | 793 ms | 1534 ms |
| Browser (FileService)  | 4 ms | 4 ms | 9 ms | 78 ms | 806 ms | 1687 ms |
| Electron (Node FS)     | 5 ms | 3 ms | 8 ms | 66 ms | 684 ms | 1431 ms |
| Browser (Node FS)      | 5 ms | 3 ms | 8 ms | 63 ms | 650 ms | 1380 ms |

| Browser vs. VS Code | 1K | 5K | 50K | 500K | 5000K | 10000K |
|---------------------|----|----|-----|------|-------|--------|
| Time (ms) | 1.00 | 0.00 | 4.00 | 67.00 | 774.00 | 1620.00 |
| Factor    | 1.67 | 0.75 | 1.60 | 6.45 | 24.78 | 22.90 |

| Browser vs. Electron (FileService) | 1K | 5K | 50K | 500K | 5000K | 10000K |
|------------------------------------|----|----|-----|------|-------|--------|
| Time (ms) | -1 | 1 | 1 | 7 | 13 | 153 |
| Factor    | 0.80 | 1.33 | 1.13 | 1.10 | 1.02 | 1.10 |

| Browser vs. Node FS | 1K | 5K | 50K | 500K | 5000K | 10000K |
|---------------------|----|----|-----|------|-------|--------|
| Time (ms) | -1 | 1 | 1 | 15 | 156 | 307 |
| Factor    | 0.80 | 1.33 | 1.13 | 1.24 | 1.24 | 1.22 |


The performance degradation can also be observed in a pure Theia extension context. Here the performance should be fairly close to what we get in VS Code: in both cases the file content is sent once through a websocket.

We can also see that the performance in Electron is slightly better, which is probably because Electron's IPC protocol is used instead of a websocket for frontend-backend communication.

The complementary test that directly uses the Node FS API in the backend shows that there seems to be no performance-related issue in Theia's filesystem implementation itself. As expected, the file system implementation has a little overhead compared to using pure Node (~0.24x), but this factor does not change for larger files.

As investigated by @tsmaeder, most of the performance issues can be traced back to the way Theia currently handles binary data with the vscode-ws-jsonrpc protocol, so major improvements are expected once the switch to the new message-RPC protocol is completed.

Message-RPC PoC integration

We have integrated the message-RPC PoC into Theia’s core messaging API and the new protocol is used for all frontend-backend communication. A WIP PR showing the current integration state can be found here: https://github.com/eclipse-theia/theia/pull/10809.

Please note that this code is not yet ready to be reviewed and should not be used in production. At the moment we are mainly focused on the integration into browser applications and have not tested the electron use cases yet.

An initial run of the browser-extension Node-FS tests with this message-RPC integration yields promising results: we are able to read a 10MB file in 56 ms. Compared to the current master this is a 25x performance gain. If we add the measured 0.24x overhead for Theia's FileService, we can assume that loading with the FileService API would take ~69 ms, which is in line with what we get in VS Code.

While the new protocol works well for binary buffers, we have discovered some major deficits when encoding pure (buffer-free) JSON objects. Compared to encoding with JSON.stringify(), the resulting binary buffers are much larger. To test this we used a simple, randomly generated JSON test object:

```js
{
    'curve': 'yes',
    'successful': false,
    'does': [
        [
            'tool',
            'strange',
            'declared',
            false,
            'if',
            false,
            false,
            true,
            true,
            196639994
        ],
        -1697924638.043861,
        1921422646,
        'hide',
        false,
        true,
        true,
        -400170969,
        550424783,
        -2118374202.4598904
    ],
    'fish': 664495385.6069336,
    'eat': -1205575089,
    'boat': 1495629676,
    'arm': 'nation',
    'height': false,
    'underline': 'have',
    'satellites': -20686813.87966633
};
```

and measured the encoding/decoding time as well as the final buffer size. Additionally, we tested with an object array that contains 100 test object entries and a simple string array containing 1000 string entries:

| Stringify Encode | Test object | Test object array (size 100) | Simple string array (size 1000) |
|------------------|-------------|------------------------------|---------------------------------|
| Encoding Time | 1 | 1 | 1 |
| Decoding Time | 1 | 1 | 1 |
| Byte length   | 352 | 35301 | 5891 |

| MessageEncoder | Test object | Test object array (size 100) | Simple string array (size 1000) |
|----------------|-------------|------------------------------|---------------------------------|
| Encoding Time | 1 | 14 | 5 |
| Decoding Time | 1 | 28 | 4 |
| Byte length   | 577 | 57708 | 12898 |
| Overhead      | 225 | 22407 | 7007 |
| Factor        | 1.64 | 1.63 | 2.19 |

As we can see, the new encoder produces significantly larger buffers for the same object, and the overall time it takes to encode/decode simple JSON objects also increases. This has direct consequences for Theia RPC calls that send large JSON object arrays, e.g. the HostedPluginServer.getDeployedPlugins() call.

While we highly appreciate the increased performance for binary buffers it should not come with any major performance degradation for plain JSON messages. Probably an alternative encoding approach combining JSON.stringify() with a special handling for buffers might be better here. It seems like VS Code is also doing something similar (https://github.com/microsoft/vscode/blob/435f8a4cae52fc9850766af92d5df3c492f59341/src/vs/workbench/services/extensions/common/rpcProtocol.ts#L43). @tsmaeder we would highly appreciate your input here.

What’s next

Based on these initial findings we plan to move forward as follows:

thegecko commented 2 years ago

Were the VS Code tests using the desktop version or web (codespaces/vscode.dev) version?

tortmayr commented 2 years ago

Hi @thegecko, we did the VS Code tests using the desktop version. I could also try to set up the test suite in Codespaces, but I'd expect similar results (maybe slightly longer reading times because the performance in the cloud container is limited compared to my local machine).

thegecko commented 2 years ago

Thanks, if you expect similar results I wouldn't want you to spend much more time on it :)

tsmaeder commented 2 years ago

> While we highly appreciate the increased performance for binary buffers it should not come with any major performance degradation for plain JSON messages. Probably an alternative encoding approach combining JSON.stringify() with a special handling for buffers might be better here. It seems like VS Code is also doing something similar (https://github.com/microsoft/vscode/blob/435f8a4cae52fc9850766af92d5df3c492f59341/src/vs/workbench/services/extensions/common/rpcProtocol.ts#L43). @tsmaeder we would highly appreciate your input here.

It should not be the case that the encoded size is larger for the new implementation; in fact, I would expect the inverse. In my experience, it all depends on what you put into the string: did you try with mostly ASCII content? Anyway, the only thing in the design of the encoder that might add something longer than JSON.stringify() is the need for type tags. It's like:

```
<arraytagnumber>,length,<stringtagnumber>bytes1.... ,<stringtagnumber>bytes2
```

vs

```json
[
   "string1",
   "string2"
]
```

Unless we're screwing up in string encoding or it's a degenerate example (1-letter strings, etc.) this should not add significant overhead. Therefore I suspect there is a bug in the implementation. Do you have info on what makes the encoded output so much bigger?

Also, since you need a byte array for sending over the socket in the end, even if you JSON.stringify(), you still need to encode the resulting string. How are you doing that?

tortmayr commented 2 years ago

> Also, since you need a byte array for sending over the socket in the end, even if you JSON.stringify(), you still need to encode the resulting string. How are you doing that?

I used Node's Buffer.from() method to encode the string to a binary buffer.

Using your example from above I get the following results:

```ts
const toEncode = ['string1', 'string2'];

// Encode with stringify
const stringified = JSON.stringify(toEncode);
const encoded1 = Buffer.from(stringified);
// then encoded1.byteLength === 21

// Encode with MessageEncoder
const encoder = new MessageEncoder();
const writer = new ArrayBufferWriteBuffer();
encoder.writeTypedValue(writer, toEncode);
const encoded2 = writer.getCurrentContents();
// then encoded2.byteLength === 42
// so factor 2 in this case
```

Trying to reconstruct the output of the MessageEncoder manually I would expect a byte-length of 34:

| Write | Length (bytes) | Value |
|-------|----------------|-------|
| `<encoderId>` | 4 | 2 (ObjectArray) |
| `<length>`    | 4 | 2 |
| `<encoderId>` | 4 | 0 (JSON) |
| `<bytes>`     | 9 | "string1" |
| `<encoderId>` | 4 | 0 (JSON) |
| `<bytes>`     | 9 | "string2" |
| Total         | 34 | |

So I have to investigate where the additional 8 bytes are coming from. However, looking at this manual deconstruction, I think we could already significantly reduce the overhead if we serialize the encoder-id as a Uint8 instead of a Uint32. This would save 3 bytes per encoded id, and I highly doubt that we need the possibility to register more than 255 different value encoders.

tsmaeder commented 2 years ago

> if we serialize the encoder-id as Uint8 instead of Uint32

@tortmayr Absolutely, I was going to suggest that. We could even dynamically use 1 or more bytes to encode the tags depending on the number of encoders: since both sides of the channel need to have the same encoders, they would come to the same decision.

We could get further improvements by registering encoders for numbers and boolean values: right now we're converting them to strings, which is, for example, 5 bytes for 'false' where one byte would be enough. We can stretch that concept to small numbers: if it's an integer less than 256, use 'byte' as a tag, then 'short' and 'long'. Many numbers are probably relatively small integers. The same goes for strings: if we have a custom encoder, we can save two bytes for the quotes.

Applying those to the string array example, I end up with 21 bytes, which is exactly the same as JSON-encoding without formatting.

> I used Node's Buffer.from() method to encode the string to a binary buffer.

I'm using TextEncoder to encode strings to bytes in message-rpc. We should use the same method to get comparable results.
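As a side note, both approaches produce the same UTF-8 bytes for a given string, so aligning on one of them mainly makes the timing measurements comparable (a quick check, not from the thread):

```typescript
// TextEncoder (standard, browser + Node) and Node's Buffer.from both emit
// UTF-8, so the byte output is identical; only encoding speed may differ.
const input = 'string1';
const viaTextEncoder = new TextEncoder().encode(input);
const viaBufferFrom = Buffer.from(input, 'utf8');
const sameBytes =
    viaTextEncoder.length === viaBufferFrom.length &&
    viaTextEncoder.every((byte, i) => byte === viaBufferFrom[i]);
```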

tsmaeder commented 2 years ago

In reality, you get 29 bytes instead of 21, the length field of the strings is missing in the table above, adding 8 bytes for two strings.

tsmaeder commented 2 years ago

But we can play more games: since probably most length fields are less than 32k, we could use 15 bits for the length field, with the 16th bit indicating that this is a "long size" that uses 4 bytes for the length instead of 2. That would allow lengths of up to 2 billion entries (which is likely going to crash and burn our browser anyway).

tsmaeder commented 2 years ago

More generally, let's think of arrays and objects: Arrays first:

in JSON, we have an overhead of two brackets plus a comma between entries, thus the overhead for n entries is 2 + (n − 1). If we encode the length, we have tag + length, which is currently 4 + 4 = 8. So any array larger than 7 entries is going to be shorter using length-encoding. If we use the above optimizations, the break-even drops to 1 + 2 = 3 (Uint8 tag + Uint16 length), so any array larger than one element would profit.

The same argument applies, IMO to objects: two braces and a separator between fields.

The same argument applies to Strings: if we use 2 bytes for length, the space we're using is the same as using quotes, but we might have to escape quotes inside the string if not length-encoding.

tortmayr commented 2 years ago

@tsmaeder Thanks for your fast feedback. Your suggested improvements should significantly reduce the final buffer size.

> I'm using TextEncoder to encode strings to bytes in message-rpc. We should use the same method to get comparable results.

I have updated my test cases to also use TextEncoder. It doesn't have much impact on the results (apart from slightly longer encoding times for the Stringify encoding) but aligning the encoding strategy definitely makes sense.

> In reality, you get 29 bytes instead of 21, the length field of the strings is missing in the table above, adding 8 bytes for two strings.

You are right. I completely missed the string length encoding in the example table.

tsmaeder commented 2 years ago

While I'm confident that we can get the encoded size down, I'm a bit disappointed about the encoding/decoding time. I would not have guessed those to be significantly larger, but I guess you can't compete with native code.

tortmayr commented 2 years ago

Yes I agree, I'm also a little bit concerned about the increased encoding/decoding times. FYI: I have implemented an "enhanced" message-encoder based on your suggestions and also implemented an alternative approach using JSON.stringify() with a special replacer for buffers to encode objects. Today I'm going to run some tests and measurements to compare the performance of the different approaches ("pure-stringify", "stringify-with-buffer-handling", "custom message encoder/decoder"). I guess I will be able to present the results tomorrow and we then can decide how to best move forward with this issue.

tsmaeder commented 2 years ago

One thing we should keep in mind is that whatever solution we end up with needs to be safe for tunneling an RPC call through an RPC call: we tunnel messages from the frontend to the plugin host through a backend service. So if we go with "JSON with buffers on the side", we need to be able to keep the buffers in the tunneled messages and not stringify them.

Besides the micro-benchmarks, we should also look at the behaviour in Theia itself: unless we have a good statistical model of what messages occur in real life, benchmarks can be misleading.

paul-marechal commented 2 years ago

> we tunnel messages from the front end to the plugin host through a back end service

Instead of going frontend(rpc1) --> (rpc1)backend(rpc2) --> (rpc2)plugin-host could we instead open a passthrough connection? Having frontend(rpc3) --> (pipe)backend(pipe) --> (rpc3)plugin-host? Unless there is middleware logic required on the backend?

tsmaeder commented 2 years ago

We need to route the packets to a particular plugin host (can be different in Che). Anyway...I like stuff that is composable better: one less restriction to remember.

paul-marechal commented 2 years ago

Routing params can be passed upon connection creation to know which plugin-host to route to.

> I like stuff that is composable better: one less restriction to remember.

You mean having a RPC protocol that supports nested RPC payloads without too much overhead?

tsmaeder commented 2 years ago

> You mean having a RPC protocol that supports nested RPC payloads without too much overhead?

Yes: I don't want to have have to take care what I send as a message parameter: it's one less thing to get right.

tortmayr commented 2 years ago

As we are nearing completion of Step 3 of our initial development plan (integrate the new RPC framework with Theia's messaging API and replace vscode-ws-jsonrpc), it is time for another update.

The new message protocol is now fully integrated into Theia's core messaging API. All messaging connections (websocket, IPC, custom connections like the channel between the TerminalFrontendWidget and the underlying terminal process) now communicate exclusively over the new protocol.

Following the discussion regarding performance issues with plain JSON messages, we mainly focused on refactoring and improving the encoding and decoding performance of the new RPC protocol. We came up with a solution that prefers string encoding for JSON objects whenever possible (i.e. when the object doesn't contain binary data). This means the encoding performance for plain JSON messages is more or less equal to what we are currently experiencing in Theia, while encoding and decoding messages that contain binary data has become significantly faster. In addition, the message protocol has been hardened against issues that can occur with different runtime-specific buffer implementations. For instance, a Node.js Buffer might be using a shared buffer pool which is not transferable. The refactored protocol has utilities in place to catch such problematic buffers and convert them into a transferable format before encoding.

Anyone interested in the current state can checkout the updated WIP PR (#10809). Please note that this code is not yet ready to be reviewed and should not be used in production.

The WIP PR does not yet contain the migration of the Electron IPC channels to the new RPC protocol. We still have to do some testing to ensure that the new RPC protocol works as well in Electron apps as it already does in browser apps. Once this is done we will provide a ready-to-review PR.

Performance Benchmarks

Benchmarks with the previously established performance test suite yield impressive results. The test suite has been executed with both the current master and the PR state to provide comparable results:

File loading performance comparison (Browser)

| Test scenario | 1K | 5K | 50K | 500K | 5000K | 10000K |
|---------------|----|----|-----|------|-------|--------|
| Extension FileService (Master) | 4 ms | 5 ms | 15 ms | 71 ms | 767 ms | 1752 ms |
| Extension FileService (PR)     | 4 ms | 5 ms | 9 ms | 17 ms | 41 ms | 56 ms |
| Extension Node FS (Master)     | 3 ms | 4 ms | 13 ms | 60 ms | 697 ms | 1239 ms |
| Extension Node FS (PR)         | 2 ms | 5 ms | 7 ms | 15 ms | 35 ms | 58 ms |
| Plugin VS Code FS (Master)     | 8 ms | 9 ms | 27 ms | 225 ms | 2476 ms | 5025 ms |
| Plugin VS Code FS (PR)         | 5 ms | 7 ms | 28 ms | 133 ms | 1495 ms | 3089 ms |
| Directly in VS Code            | 3 ms | 4 ms | 5 ms | 9 ms | 37 ms | 74 ms |

As we can see, the performance for file loading within a Theia extension is now on par with or even better than what VS Code achieves. The performance for file loading from a plugin has also improved by quite a bit. Further improvements are expected once the plugin RPC protocol is aligned with the message-RPC protocol and we can directly tunnel messages without having to encode/decode them in between.

The performance tests use the FileService.readFile method, which by default reads files unbuffered (i.e. not as a stream). We also did an additional test to compare the performance between unbuffered and streamed reading. Using a file stream, reading the 10000K file takes 280 ms. This means that with the new encoding approach, unbuffered reading is definitely the better option and should be used whenever possible.

To measure the performance for plain JSON messages, we reused the startup performance script. This serves as a good indicator because a lot of RPC messages are dispatched on application start. Luckily, the measured startup performance in the master and the PR state is consistently the same. This means we managed to heavily improve the performance for handling binary messages without any major drawbacks on the handling of normal JSON messages.

What’s next

Based on these findings we plan to move forward as follows: