KonradHoeffner / hdt

Library for the Header Dictionary Triples (HDT) compression file format for RDF data.
https://crates.io/crates/hdt
MIT License

measure and optimize RAM consumption of the different components #11

Closed: KonradHoeffner closed this issue 1 year ago

KonradHoeffner commented 1 year ago

heaptrack lscomplete20143.hdt

[screenshot: heaptrack memory profile]

without wavelet matrix:

[screenshot: heaptrack memory profile without wavelet matrix]

KonradHoeffner commented 1 year ago

qbench2

without wavelet matrix:

[two screenshots: qbench2 memory profiles]

KonradHoeffner commented 1 year ago

HDT size in memory > 2.4 GB, details:
Hdt {
    dict: FourSectDict {
        shared: total size 17.5 MB, sequence 217.6 KB with 69619 entries, 25 bits per entry, packed data 17.2 MB,
        subjects: total size 229.9 MB, sequence 1.2 MB with 333197 entries, 28 bits per entry, packed data 228.7 MB,
        predicates: total size 19.7 KB, sequence 192 B with 102 entries, 15 bits per entry, packed data 19.5 KB,
        objects: total size 443.2 MB, sequence 4.2 MB with 1147322 entries, 29 bits per entry, packed data 439.0 MB,
    },
    triple_sect: Bitmap(
        total size 1.7 GB
        adjlist_y AdjList {
            sequence: 156.6 MB with 113873721 entries, 11 bits per entry,
            bitmap: Bitmap size unknown,
        }
        adjlist_z AdjList {
            sequence: 407.1 MB with 120635813 entries, 27 bits per entry,
            bitmap: Bitmap size unknown,
        }
        op_index AdjList {
            sequence: 965.1 MB with 120635813 entries, 64 bits per entry,
            bitmap: Bitmap size unknown,
        }
        wavelet_y 205.5 MB,
    ),
}

Why are 102 predicates encoded with 15 bits? They should only need 7. Objects should likewise take around 21 bits, not 29. Are 8 bits too many used everywhere? But this is how the data is loaded from the hdt-cpp output...
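
For reference, the minimal width is just the bit length of the largest ID, i.e. floor(log2(n)) + 1; a standalone sketch of that arithmetic (not library code):

```rust
// Bit width needed to store the largest ID n, i.e. floor(log2(n)) + 1.
fn bits_needed(n: u64) -> u32 {
    64 - n.leading_zeros()
}

fn main() {
    assert_eq!(bits_needed(102), 7); // 102 predicates: 7 bits, not 15
    assert_eq!(bits_needed(1_147_322), 21); // 1147322 objects: 21 bits, not 29
}
```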

KonradHoeffner commented 1 year ago

Optimization Candidates

KonradHoeffner commented 1 year ago

32 bit OP index sequence

HDT size in memory > 1.9 GB, details:
Hdt {
    dict: FourSectDict {
        shared: total size 17.5 MB, sequence 217.6 KB with 69619 entries, 25 bits per entry, packed data 17.2 MB,
        subjects: total size 229.9 MB, sequence 1.2 MB with 333197 entries, 28 bits per entry, packed data 228.7 MB,
        predicates: total size 19.7 KB, sequence 192 B with 102 entries, 15 bits per entry, packed data 19.5 KB,
        objects: total size 443.2 MB, sequence 4.2 MB with 1147322 entries, 29 bits per entry, packed data 439.0 MB,
    },
    triple_sect: Bitmap(
        total size 1.3 GB
        adjlist_y AdjList {
            sequence: 156.6 MB with 113873721 entries, 11 bits per entry,
            bitmap: Bitmap size unknown,
        }
        adjlist_z AdjList {
            sequence: 407.1 MB with 120635813 entries, 27 bits per entry,
            bitmap: Bitmap size unknown,
        }
        op_index total size 482.5 MB
            sequence 482.5 MB
            bitmap: Bitmap size unknown
        wavelet_y 205.5 MB,
    ),
}

KonradHoeffner commented 1 year ago

variable bit OP index sequence

HDT size in memory > 1.9 GB, details:
Hdt {
    dict: FourSectDict {
        shared: total size 17.5 MB, sequence 217.6 KB with 69619 entries, 25 bits per entry, packed data 17.2 MB,
        subjects: total size 229.9 MB, sequence 1.2 MB with 333197 entries, 28 bits per entry, packed data 228.7 MB,
        predicates: total size 19.7 KB, sequence 192 B with 102 entries, 15 bits per entry, packed data 19.5 KB,
        objects: total size 443.2 MB, sequence 4.2 MB with 1147322 entries, 29 bits per entry, packed data 439.0 MB,
    },
    triple_sect: Bitmap(
        total size 1.2 GB
        adjlist_y AdjList {
            sequence: 156.6 MB with 113873721 entries, 11 bits per entry,
            bitmap: Bitmap size unknown,
        }
        adjlist_z AdjList {
            sequence: 407.1 MB with 120635813 entries, 27 bits per entry,
            bitmap: Bitmap size unknown,
        }
        op_index total size 407.1 MB
            sequence 3.3 GB with 27 bits
            bitmap: Bitmap size unknown
        wavelet_y 205.5 MB,
    ),
}

[screenshot: heaptrack memory profile]
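
As a sanity check on the three op_index variants, the packed sequence size is just entries × bits per entry / 8; a quick standalone calculation (sketch only) reproducing the numbers above:

```rust
// Packed sequence size in MB: entries * bits_per_entry bits, 8 bits per byte.
fn sequence_mb(entries: u64, bits_per_entry: u64) -> f64 {
    (entries * bits_per_entry) as f64 / 8.0 / 1e6
}

fn main() {
    let entries = 120_635_813; // op_index entries from the dumps above
    println!("{:.1} MB", sequence_mb(entries, 64)); // 965.1 MB (original)
    println!("{:.1} MB", sequence_mb(entries, 32)); // 482.5 MB (32 bit)
    println!("{:.1} MB", sequence_mb(entries, 27)); // 407.1 MB (variable bit)
}
```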

KonradHoeffner commented 1 year ago

With RAM usage below 2 GB including indexes for a 1.3 GB HDT file, I'm now at a point I consider acceptable for the moment. However, the big spike in temporary memory allocation during loading may crash a server, so that peak should be brought down to 4 GB or less.

KonradHoeffner commented 1 year ago

Without the indexes it's much less: [screenshot: heaptrack memory profile without indexes]

KonradHoeffner commented 1 year ago

Both CompactVectors are built from slices; would it be possible to save space by using a consuming iterator instead? Is that supported?
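
For illustration, a hypothetical CompactBuilder (not the actual sucds API) showing the difference: with a slice, the full uncompressed input must be alive next to the compact structure, while a consuming iterator lets the input be produced and dropped element by element.

```rust
// Hypothetical builder, for illustration only; not the actual sucds API.
struct CompactBuilder {
    bits_per_entry: u32,
    packed: Vec<u64>, // bit-packed entries
    len: usize,
}

impl CompactBuilder {
    fn new(bits_per_entry: u32) -> Self {
        Self { bits_per_entry, packed: Vec::new(), len: 0 }
    }

    fn push(&mut self, _value: u64) {
        // Bit packing elided; a real implementation would append
        // `bits_per_entry` bits of `_value` to `packed`.
        self.len += 1;
    }

    // Slice-based: the whole uncompressed input exists in memory at once.
    fn from_slice(bits: u32, input: &[u64]) -> Self {
        let mut b = Self::new(bits);
        input.iter().for_each(|&v| b.push(v));
        b
    }

    // Iterator-based: entries can be generated lazily, so the peak is
    // roughly the size of the compact structure alone.
    fn from_iter(bits: u32, input: impl IntoIterator<Item = u64>) -> Self {
        let mut b = Self::new(bits);
        input.into_iter().for_each(|v| b.push(v));
        b
    }
}

fn main() {
    let slice_based = CompactBuilder::from_slice(27, &[1, 2, 3]);
    let iter_based = CompactBuilder::from_iter(27, (0..3).map(|i| i * 2));
    assert_eq!(slice_based.len, iter_based.len);
}
```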

KonradHoeffner commented 1 year ago

Down to 5.5 GB peak, more next week. [screenshot: heaptrack memory profile]

KonradHoeffner commented 1 year ago

Test whether a HashMap needs less space than a BTreeMap. Update: won't work, Vecs aren't hashable. Another update: the key space is dense, so just use a Vec of Vecs instead of a map.
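
A minimal sketch of that replacement (names and types are illustrative, not the library's): since the IDs are dense, the map key becomes the outer Vec index, with no per-entry tree or hash overhead.

```rust
use std::collections::BTreeMap;

// Illustrative: map from ID to positions, with per-node tree overhead.
fn build_map(pairs: &[(usize, u32)]) -> BTreeMap<usize, Vec<u32>> {
    let mut map: BTreeMap<usize, Vec<u32>> = BTreeMap::new();
    for &(id, pos) in pairs {
        map.entry(id).or_default().push(pos);
    }
    map
}

// Dense IDs 0..n: the ID is the index, no map structure needed at all.
fn build_vecs(pairs: &[(usize, u32)], n: usize) -> Vec<Vec<u32>> {
    let mut index = vec![Vec::new(); n];
    for &(id, pos) in pairs {
        index[id].push(pos);
    }
    index
}

fn main() {
    let pairs = [(0, 5), (2, 7), (0, 9)];
    assert_eq!(build_map(&pairs)[&0], vec![5, 9]);
    assert_eq!(build_vecs(&pairs, 3)[0], vec![5, 9]);
}
```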

KonradHoeffner commented 1 year ago

We could reorder the generation statements so that the wavelet matrix is generated as early as possible.
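
A sketch of the idea (hypothetical load order, not the actual code): peak RSS is the sum of everything alive at the worst moment, so the build with the biggest temporary spike should run while the fewest other structures exist.

```rust
// Hypothetical sketch of the scheduling idea; not the actual loading code.
// Building the wavelet matrix needs a large temporary buffer that is
// dropped afterwards, so peak memory depends on what else is alive then.

fn build_wavelet_matrix() -> Vec<u64> {
    let temp: Vec<u64> = (0..1 << 20).collect(); // big temporary allocation
    vec![temp.iter().sum()] // small persistent result; `temp` dropped here
}

fn build_other_indexes() -> Vec<u64> {
    vec![0; 1 << 20] // stays alive for the lifetime of the Hdt instance
}

fn main() {
    // Matrix first: its temporary spike overlaps with almost nothing.
    let wavelet = build_wavelet_matrix();
    // Indexes afterwards; had they come first, the spike would stack on top.
    let indexes = build_other_indexes();
    println!("{} {}", wavelet.len(), indexes.len());
}
```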

KonradHoeffner commented 1 year ago

Much lower peak, only around 800 MB to go! [screenshot: heaptrack memory profile]

KonradHoeffner commented 1 year ago

The big spike comes from sucds::wavelet_matrix::WaveletMatrix::from_ints; I opened an issue at https://github.com/kampersanda/sucds/issues/44. I will close this issue and continue when there is a response there, or otherwise investigate another library later.

KonradHoeffner commented 1 year ago

Brought significantly below 4 GB in #16.