andyferris / Dictionaries.jl

An alternative interface for dictionaries in Julia, for improved productivity and performance

pre-allocate storage before index() #32

Closed by jdonnerstag 3 years ago

jdonnerstag commented 3 years ago

Hi,

Use case: we create an index/dict over the contents of a file. The file has a header containing the number of records, which could be 1,000, 1 million, or 100 million. Currently we use index() to scan the records one by one, determine each key, and create a key->record entry in the dict. For large files with many records, that probably means the Dictionary's internal data structures have to grow on demand. In our use case, I know the size (number of keys) up front. Can I somehow use index() with pre-allocated storage?

thanks a lot

andyferris commented 3 years ago

Hi @jdonnerstag - sorry for the delay, I missed your message.

Yes, you can construct an empty Dictionary with the sizehint keyword argument, then iterate over the records in the file and insert! each one into the dictionary.

dict = Dictionary{String, Record}(; sizehint = 1_000_000)
for (key, record) in file
    insert!(dict, key, record)
end
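For anyone who wants to try this without a real file, here is a minimal self-contained sketch of the same pattern. The Record struct, its fields, and the in-memory records vector are hypothetical stand-ins for the file reader, and n stands in for the record count read from the file header.

```julia
using Dictionaries

# Hypothetical record type; a stand-in for whatever the file reader yields.
struct Record
    id::String
    payload::Int
end

# Hypothetical in-memory "file"; in the real use case these come from disk.
records = [Record("a", 1), Record("b", 2), Record("c", 3)]

# In the real use case this count would be read from the file header.
n = length(records)

# Reserve storage up front so the dictionary does not need to grow on demand.
dict = Dictionary{String, Record}(; sizehint = n)

for r in records
    # insert! throws if the key is already present;
    # set! could be used instead if duplicate keys should overwrite.
    insert!(dict, r.id, r)
end
```

After the loop, dict["b"] returns the record that was inserted under the key "b".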

Does that make sense?

jdonnerstag commented 3 years ago

Yes, thanks a lot