cmuparlay / parlaylib

A Toolkit for Programming Parallel Algorithms on Shared-Memory Multicore Machines
MIT License
320 stars 60 forks source link

Hash map from `group_by_key()` #74

Closed WhateverLiu closed 4 months ago

WhateverLiu commented 4 months ago

The group_by_key() function is so awesome. I am wondering if there's a similar function that will return a hash map alike structure rather than a sequence of pairs. I am in a situation where I need to query the output from group_by_keys() using new keys. I've implemented a speedy transformation, but am wondering if I can intercept the library function at some point, and extract the hash table before it gets flattened to sequence?

gblelloch commented 4 months ago

It would be hard to "extract the hash table before it gets flattened". It is implemented as a two level hash approach. The first buckets into about sqrt(n) buckets, in paralle, and the second uses a sequential hash table for each bucket. At the end it is all flattened.

If you want a concurrent hash table you might consider https://github.com/cmuparlay/parlayhash It is not as fast as just doing a group by to insert all entries, but I guess after doing the group-by you could use it on each key-value pairs created by group_by. This would be quite effective if the number of keys is noticeably smaller than the number of elements.

Guy

On Thu, Jul 11, 2024 at 2:27 AM CharlieWL @.***> wrote:

The group_by_key() function is so awesome. I am wondering if there's a similar function that will return a hash map alike structure rather than a sequence of pairs. I am in a situation where I need to query the output from group_by_keys() using new keys. I've implemented a speedy transformation, but am wondering if I can intercept the library function at some point, and extract the hash table before it gets flattened to sequence?

— Reply to this email directly, view it on GitHub https://github.com/cmuparlay/parlaylib/issues/74, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABTIVAHFKRH4Y26KJ7ULW5LZLYQVBAVCNFSM6AAAAABKWJF5QGVHI2DSMVQWIX3LMV43ASLTON2WKOZSGQYDEMZVHA2DQNA . You are receiving this because you are subscribed to this thread.Message ID: @.***>