HDFGroup / hcl

Hermes Container Library
Other
5 stars 1 forks source link

Fix for Issue 5 #7

Closed hariharan-devarajan closed 3 years ago

hariharan-devarajan commented 3 years ago

Hi Chris,

This is the push contains fix for issue 5. It contains the changes required on map for allowing custom container allocation for dynamic or variable data types.

Please check and let me know.

ChristopherHogan commented 3 years ago

When I build this branch and run the tests on Ares, test 1 (unordered_map) fails and test 2 (map) hangs. I'm running the tests as follows:

Are you getting passing tests by following those steps?

hariharan-devarajan commented 3 years ago

What was the configuration you ran it with? # of processes and #clients per server?

ChristopherHogan commented 3 years ago

Whatever ctest defaults to.

ChristopherHogan commented 3 years ago

Looks like the default is home/chogan/local/bin/mpirun "-np" "4" "-f" "/home/chogan/hcl/build/test/hostfile" "/home/chogan/hcl/build/test/unordered_map_test" "2" "500" "1000" "1" "0"

ChristopherHogan commented 3 years ago

And hostfile is

ares-comp-29:2
ares-comp-30:2
hariharan-devarajan commented 3 years ago

Ok currently. Thallium in HCL has other issues (due to the upgrade of thallium to 0.8.3). I create a test case unordered_map such as that it would run locally on 1 process on shared memory. I verified the memory addresses as you had recommended.

ChristopherHogan commented 3 years ago

Do you want to fix the thallium issue in another PR (as well as #8), and then rebase this PR?

hariharan-devarajan commented 3 years ago

Yes, I would prefer that. There are a lot of issues with the new Thallium 0.8.3. Maybe create an integration issue with thallium 0.8.3. If u can verify if u agree with the fix for shared memory and if u have any recommendations we do the changes and close this issue.

ChristopherHogan commented 3 years ago

My preference would be the following steps:

  1. (optional but ideal) Fix #9 so we don't have to wait forever for CI to run (you can just copy the existing hermes setup)
  2. Fix #8 so that our CI is building with the latest version of thallium
  3. Make sure the test suite is run in CI (I think it only runs with rpclib at the moment)
  4. Add an automated test that verifies that this PR fixes issue #5.
  5. Rebase this PR so that CI will run and show the tests passing.

If u can verify if u agree with the fix for shared memory and if u have any recommendations we do the changes and close this issue.

In principle, the fix seems fine, but it doesn't make sense for me to do code review until I see passing tests in CI that show that the issue is fixed.