openucx / ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
http://www.openucx.org

Request support of HPE variant of XPMEM #5615

Open dkokron opened 4 years ago

dkokron commented 4 years ago

I'm working on the Pleiades system which is mostly SGI (now HPE) hardware. This system has a version of XPMEM that appears to be incompatible with UCX (1.9.0). The UCX configure fails because the HPE version of XPMEM does not expose a symbol called xpmem_init.

configure:29310: checking for xpmem_init in -lxpmem
configure:29335: icc -o conftest -I/usr/include/sn/include -L/usr/lib64 -lxpmem conftest.c -lxpmem -lpthread -lrt -lrt -ldl >&5
/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld: /var/tmp/pbs.9010.pbspl4.nas.nasa.gov/iccGp2op8.o: in function `main':
conftest.c:(.text+0x2d): undefined reference to `xpmem_init'

PBS r101i0n12 32> nm /usr/lib64/libxpmem.a | grep init
PBS r101i0n12 33> nm /usr/lib64/libxpmem.so | grep init
0000000000201dd0 t __frame_dummy_init_array_entry
0000000000000b48 T _init

Configuring against the hjelmn variant (https://github.com/hjelmn/xpmem) does work, and the resulting build appears to run on the system (based on performance comparisons).

configure:29309: checking for xpmem_init in -lxpmem
configure:29334: icc -o conftest -I/xxxx/XPMEM/hjelmn/xpmem/install/include -L/xxxx/XPMEM/hjelmn/xpmem/install/lib -lxpmem conftest.c -lxpmem -lpthread -lrt -lrt -ldl >&5
configure:29334: $? = 0
configure:29343: result: yes
configure:29346: checking xpmem.h usability
configure:29346: icc -c -I/xxxx/XPMEM/hjelmn/xpmem/install/include conftest.c >&5
configure:29346: $? = 0
configure:29346: result: yes
configure:29346: checking xpmem.h presence

nm /xxxx/XPMEM/hjelmn/xpmem/install/lib/libxpmem.a | grep init
0000000000000000 T xpmem_init

nm /xxxx/XPMEM/hjelmn/xpmem/install/lib/libxpmem.so | grep init
0000000000201de0 t __frame_dummy_init_array_entry
00000000000007d8 T _init
0000000000000980 T xpmem_init
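
For anyone who wants to repeat the comparison above without nm, the same question (does the shared library export xpmem_init?) can be asked at run time. The following is only a minimal sketch using Python's ctypes module; it is not part of UCX or of its configure script, and the helper name probe_symbol is hypothetical. It only covers shared libraries, not the static libxpmem.a.

```python
import ctypes
import ctypes.util


def probe_symbol(library, symbol):
    """Report whether a shared library exports a symbol.

    Returns True if the library loads and exports `symbol`,
    False if it loads but the symbol is missing,
    None if the library itself cannot be loaded.
    probe_symbol is a hypothetical helper, not a UCX or XPMEM API.
    """
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        # dlopen failed: library not found or not loadable
        return None
    # ctypes raises AttributeError when dlsym cannot find the symbol,
    # so hasattr() maps directly to "symbol present / absent"
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Assumption: the library is on the default search path under the
    # name "xpmem"; pass a full path instead to test a specific build.
    path = ctypes.util.find_library("xpmem") or "libxpmem.so"
    print(path, "exports xpmem_init:", probe_symbol(path, "xpmem_init"))
```

Going by the nm output above, the HPE /usr/lib64/libxpmem.so would be expected to report False and the hjelmn build True, which matches what the UCX configure check sees.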

Are there plans to support XPMEM from HPE? Dan

yosefe commented 4 years ago

@dkokron Thanks for opening an issue! Currently the community has no specific plan to support additional variants of XPMEM. However, if anyone wishes to contribute code to UCX to support HPE XPMEM, it would be welcome :)

dkokron commented 4 years ago

Which variant of XPMEM is supported by UCX? Can you point me to the XPMEM code you are using?

yosefe commented 4 years ago

> Which variant of XPMEM is supported by UCX? Can you point me to the XPMEM code you are using?

https://github.com/hjelmn/xpmem

dkokron commented 4 years ago

Thanks, that's the one I'm building UCX against. It doesn't seem to complain when I run on our system, which has a different XPMEM kernel module. Dan

shamisp commented 4 years ago

@dkokron part of the problem is that the development team does not have access to proprietary versions of XPMEM, so it is challenging to test and develop without platform access. As @yosefe mentioned, code contributions are more than welcome :-)

dkokron commented 4 years ago

Pavel, I have confirmed with HPE that their version of the XPMEM package is open-source with a GPL/LGPL license. Would you like a copy? Dan

shamisp commented 4 years ago

If they can publish it on GitHub, that would be helpful. The best thing they can do is to open a PR against https://github.com/hjelmn/xpmem. Once it is in the open, we can give it a try.