orientechnologies / orientdb

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.
https://orientdb.dev
Apache License 2.0

Server uses extreme amounts of memory for the size of data #7112

Closed · kived closed this 3 years ago

kived commented 7 years ago

OrientDB Version, operating system, or hardware.

v.2.1.25 (build 2.1.x@r2e5a34b5d6b0cce75cd5771f3d79ae42c60ba266; 2016-11-02 07:09:21+0000)

Operating System

Expected behavior and actual behavior

I have a simple schemaless cluster whose records contain two implicit properties: one detected as STRING and the other as EMBEDDED. The EMBEDDED portion of each record is quite large (~1300 lines when pretty-printed). With 13016 records in the cluster, it is not possible to SELECT FROM the cluster with the available memory (tried from both Studio and Console).

What stands out is that the memory used by the Server is far more than the actual data. I can export the data using the Console over plocal, and the resulting JSON file is 461MB uncompressed. However, the Server OOMs with a heap size of 3072MB, more than six times the memory required to hold the serialized result. This is after a fresh restart, with no other appreciable load on the Server.

I understand this may be somewhat resolved in 3.0 (#2425), assuming the Studio and Console clients are updated to use cursors as well, but it is still concerning that the Server requires so much more memory than the actual data involved.
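
For reference, a minimal sketch of the kind of access pattern that triggers this, assuming the 2.1.x Java document API (the connection URL, credentials, and cluster name are placeholders; the actual queries were run from Studio and the Console):

```java
import com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx;
import com.orientechnologies.orient.core.record.impl.ODocument;
import com.orientechnologies.orient.core.sql.query.OSQLSynchQuery;

import java.util.List;

public class OomRepro {
    public static void main(String[] args) {
        // Placeholder URL and credentials.
        ODatabaseDocumentTx db =
                new ODatabaseDocumentTx("remote:localhost/mydb").open("admin", "admin");
        try {
            // A synchronous query materializes the entire result set at once,
            // on the server and then on the client, so ~13k large EMBEDDED
            // records exhaust the heap even though the serialized data is
            // far smaller.
            List<ODocument> all =
                    db.query(new OSQLSynchQuery<ODocument>("SELECT FROM cluster:mycluster"));
            System.out.println("records: " + all.size());
        } finally {
            db.close();
        }
    }
}
```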

luigidellaquila commented 7 years ago

Hi @kived

Thank you for reporting. As you pointed out, we are working on this in v3.0. The Console will definitely be updated to use cursors. Studio is a bit more complicated: it uses JSON, so the whole result has to be serialized and sent to the client, and even if we manage to stream the JSON we will still have problems rendering the full result on the web page.
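
For illustration, a minimal sketch of how the cursor-based query API in 3.0 consumes results incrementally (class and method names as in the 3.0 Java API; database name, credentials, and cluster name are placeholders):

```java
import com.orientechnologies.orient.core.db.ODatabaseSession;
import com.orientechnologies.orient.core.db.OrientDB;
import com.orientechnologies.orient.core.db.OrientDBConfig;
import com.orientechnologies.orient.core.sql.executor.OResult;
import com.orientechnologies.orient.core.sql.executor.OResultSet;

public class CursorSketch {
    public static void main(String[] args) {
        OrientDB orient = new OrientDB("remote:localhost", OrientDBConfig.defaultConfig());
        // query() returns a lazy OResultSet: records are fetched in batches
        // rather than the whole result being materialized up front.
        try (ODatabaseSession db = orient.open("mydb", "admin", "admin");
             OResultSet rs = db.query("SELECT FROM cluster:mycluster")) {
            while (rs.hasNext()) {
                OResult row = rs.next();
                // process one record at a time
            }
        }
        orient.close();
    }
}
```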

What I can suggest as an immediate workaround is to use SKIP/LIMIT to paginate the query.
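
A minimal sketch of that workaround against the 2.1.x Java document API (cluster name, page size, and connection details are placeholder assumptions; the same SKIP/LIMIT clauses work typed directly into Studio or the Console):

```java
import com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx;
import com.orientechnologies.orient.core.record.impl.ODocument;
import com.orientechnologies.orient.core.sql.query.OSQLSynchQuery;

import java.util.List;

public class PaginationSketch {
    public static void main(String[] args) {
        ODatabaseDocumentTx db =
                new ODatabaseDocumentTx("remote:localhost/mydb").open("admin", "admin");
        try {
            final int pageSize = 1000;
            int skip = 0;
            while (true) {
                // Only one page of records is held in memory at a time.
                List<ODocument> page = db.query(new OSQLSynchQuery<ODocument>(
                        "SELECT FROM cluster:mycluster SKIP " + skip + " LIMIT " + pageSize));
                if (page.isEmpty())
                    break;
                // process the page...
                skip += pageSize;
            }
        } finally {
            db.close();
        }
    }
}
```

One caveat: SKIP rescans the skipped records on every page, so deep pages get progressively slower; for large result sets the OrientDB pagination documentation suggests filtering on @rid instead (WHERE @rid > <last rid of the previous page> LIMIT n).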

About memory consumption in general, we are also working to reduce it, especially in result sets, so you can expect some improvements there as well.

Thanks

Luigi