microsoft / superbenchmark

A validation and profiling tool for AI infrastructure
https://aka.ms/superbench
MIT License

Bug Fix - Bug fix for cuda 12.2 dockerfile LD_LIBRARY_PATH issue #614

Closed RyoYang closed 5 months ago

RyoYang commented 5 months ago

Description:

The CUDA 12.2 image reports an undefined symbol error because LD_LIBRARY_PATH is incomplete.

How to reproduce:

  1. Deploy sb with the CUDA 12.2 image:
    sb deploy -f local.ini -i superbench/superbench:v0.10.0-cuda12.2
  2. Enter the container:
    sudo docker exec -it sb-workspace bash
  3. Execute mpirun:
    root@sb-container:~# mpirun
    mpirun: symbol lookup error: mpirun: undefined symbol: opal_libevent2022_event_base_loop
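
The symbol lookup error in step 3 suggests that the dynamic loader is not picking up the intended Open MPI libraries. As a minimal sketch, the check below tests whether a given directory is an entry of LD_LIBRARY_PATH. The helper name `path_contains` and the directory `/opt/hpcx/ompi/lib` are assumptions made for illustration and are not taken from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (for example, the value of LD_LIBRARY_PATH).
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed library directory; adjust to wherever HPC-X lives in the image.
mpi_lib_dir="/opt/hpcx/ompi/lib"

if path_contains "$mpi_lib_dir" "${LD_LIBRARY_PATH:-}"; then
    echo "found: $mpi_lib_dir"
else
    echo "missing: $mpi_lib_dir"
fi
```

Wrapping both strings in colons lets the match cover the first and last entries without partial matches on longer paths. Running `ldd "$(command -v mpirun)"` inside the container additionally shows which shared libraries the loader actually resolves.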

How to fix:

  • Append hpcx_load to /etc/bash.bashrc so that LD_LIBRARY_PATH is updated each time a shell starts.
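
The fix described above amounts to adding one line to a shell startup file. The sketch below shows one way to do that idempotently, so repeated image builds or reruns do not duplicate the line. The helper `append_once` is hypothetical, and the example writes to a scratch file rather than the real /etc/bash.bashrc. It also assumes that `hpcx_load` is already defined as a shell function by the time the startup file is read. Whether the PR's Dockerfile change takes this exact form is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append LINE to FILE only if an identical line
# is not already present.
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Demonstrate against a scratch file instead of /etc/bash.bashrc.
rc_file=$(mktemp)
append_once "$rc_file" "hpcx_load"
append_once "$rc_file" "hpcx_load"   # second call is a no-op
grep -c -x -F "hpcx_load" "$rc_file"  # counts matching lines in the file
rm -f "$rc_file"
```

Inside the container, the equivalent call would target the real file, for example `append_once /etc/bash.bashrc hpcx_load`, run as root.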

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 85.79%. Comparing base (2c88db9) to head (763a4a4).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #614   +/-   ##
=======================================
  Coverage   85.79%   85.79%
=======================================
  Files          97       97
  Lines        6912     6912
=======================================
  Hits         5930     5930
  Misses        982      982
```

| [Flag](https://app.codecov.io/gh/microsoft/superbenchmark/pull/614/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | Coverage Δ | |
|---|---|---|
| [cpu-python3.6-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/614/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `71.62% <ø> (ø)` | |
| [cpu-python3.7-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/614/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `71.62% <ø> (ø)` | |
| [cpu-python3.8-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/614/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `72.03% <ø> (ø)` | |
| [cuda-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/614/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `83.87% <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft#carryforward-flags-in-the-pull-request-comment) to find out more.
