greg7mdp / parallel-hashmap

A family of header-only, very fast and memory-friendly hashmap and btree containers.
https://greg7mdp.github.io/parallel-hashmap/
Apache License 2.0
2.53k stars 239 forks source link

issue under mac? #239

Open gf777 opened 5 months ago

gf777 commented 5 months ago

Hi @greg7mdp You helped me a few months ago implementing parallel joining of maps: https://github.com/greg7mdp/parallel-hashmap/issues/234 This was working fine, but recently I started having trouble under Mac. As a matter of fact the process seems to be working fine, but I get some warning and test failures so I am wondering if this could be highlighting something deeper. Basically on my mac (Sonoma 14.2) I pass the test but in github actions it fails: https://github.com/vgl-hub/kreeq/actions/runs/9065660799/job/24906743274#step:4:62 If I revert to macos-13 in github action it passes the test. Locally I get this warning when the test is run though:

gformenti@macbook-pro-25:~/Documents/GitHub/kreeq$ kreeq union -d testFiles/test1.kreeq testFiles/test2.kreeq 
submap count(0) != N(8)
submap count(0) != N(8)
DBG Summary statistics:
Total kmers: 1572
Unique kmers: 13
Distinct kmers: 115
Missing kmers: 4398046510989
Total edges: 196

Do you have any clue as what might be causing this? Maybe it is just a bug in my implementation, which is here: https://github.com/vgl-hub/kreeq/blob/int32/src/graph-builder.cpp#L860 Many thanks and all the best Giulio

greg7mdp commented 5 months ago

Hi @gf777 , I'm away from home and I can't have a good look at your code right now. I believe there should not be any problem accessing map1 and map2 like you do. I wonder about map32 though. Just from a quick look, this looks strange, shouldn't it be (*maps32)[m]?

I'll look at your code in more detail when I get home.

gf777 commented 5 months ago

Hi Gregory Thank you so much for looking into this! So, initially I also thought it had to do with new commits, particularly the new map32 that you are referring to (I spent quite some time trying to debug it!). For context, map32 is a larger version of the standard map used to store counts that exceed 255. However, I realized that the issue was affecting the main branch as well, which wasn't updated in the past two months. When I ran a new check there it started to give me the same issue, suggesting me that a change in the macos version used for github actions was the culprit. See https://github.com/vgl-hub/kreeq/commits/main/ commits 16c61db to 1b3369d This seems to overlap with github actions transitioning from macos 12 to 14 in the last two months: https://github.com/actions/runner-images/issues/9255

gf777 commented 5 months ago

ps. dereferencing the individual map in the maps vector should be ok because this is how the vector is defined https://github.com/vgl-hub/kreeq/blob/int32/include/kreeq.h#L68

greg7mdp commented 5 months ago

ps. dereferencing the individual map in the maps vector should be ok because this is how the vector is defined https://github.com/vgl-hub/kreeq/blob/int32/include/kreeq.h#L68

Makes sense. I'll do a more careful code review when I'm back from vacation.

greg7mdp commented 5 months ago

Don't you access in read/write the same parallelMap32 (maps32[m]) from multiple threads without a mutex protection?

gf777 commented 5 months ago

Hi @greg7mdp That shouldn't be the case. Do you have a particular line in mind? The logic is the following: I create a vector of m maps (maps32). Then I have a job scheduler that runs m jobs (potentially concurrently, one per thread). So maps are always updated by different jobs. Afaik counts are always correct (so there appears to be no data race). This is true even with the submap count(0) != N(8) warning under MacOS. The warning is only under Mac 14 (Mac13, linux and win are fine).