nv-morpheus / Morpheus

Morpheus SDK
Apache License 2.0

[BUG]: DOCA Source Stage cleanup error #1559

Open e-ago opened 7 months ago

e-ago commented 7 months ago

Version

24.03

Which installation method(s) does this occur on?

Docker

Describe the bug.

When the DOCA Source Stage is killed while handling UDP traffic, a low-level mlx5 cleanup function fails with a segmentation fault. This needs to be debugged and fixed.

Minimum reproducible example

# python3 ./examples/doca/run.py --nic_addr ca:00.0 --gpu_addr 17:00.0 --traffic_type udp
...
Ctrl+C
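
The manual steps above could be automated roughly as follows. This is only a sketch, not part of Morpheus: `interrupt_after`, `died_from_segfault`, and the 10-second warm-up are hypothetical names and values chosen for illustration, and the PCIe addresses in `CMD` are the ones from this report. Note that Ctrl+C in a terminal delivers SIGINT to the whole foreground process group, while this helper signals only the child process, so the behavior may not match exactly.

```python
import signal
import subprocess
import time

# Command copied from the report; the NIC/GPU PCIe addresses are machine-specific.
CMD = ["python3", "./examples/doca/run.py",
       "--nic_addr", "ca:00.0", "--gpu_addr", "17:00.0",
       "--traffic_type", "udp"]


def interrupt_after(cmd, warmup_s=10.0, timeout_s=30.0):
    """Start cmd, wait warmup_s seconds, send SIGINT, and return the exit code."""
    proc = subprocess.Popen(cmd)
    try:
        time.sleep(warmup_s)                 # assumed time for the pipeline to start up
        proc.send_signal(signal.SIGINT)      # approximates pressing Ctrl+C
        return proc.wait(timeout=timeout_s)  # raises TimeoutExpired if it hangs
    finally:
        if proc.poll() is None:              # still running: clean up forcefully
            proc.kill()
            proc.wait()


def died_from_segfault(returncode):
    # On POSIX, subprocess reports death by signal N as a return code of -N.
    return returncode == -signal.SIGSEGV


# Possible usage (requires the DOCA hardware and environment from the report):
#   rc = interrupt_after(CMD)
#   print("exit code:", rc, "segfault:", died_from_segfault(rc))
```

If the cleanup crash is present, `died_from_segfault` should return True for the exit code, assuming the SIGSEGV terminates the process as the log below suggests.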

Relevant log output

*** Aborted at 1710164283 (unix time) try "date -d @1710164283" if you are using GNU date ***
PC: @                0x0 (unknown)
*** SIGSEGV (@0x7f30fffd9a30) received by PID 3509 (TID 0x7f38f17fe640) from PID 18446744073709394480; stack trace: ***
    @     0x7f3c228df197 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f3c25567520 (unknown)
    @     0x7f39159026d4 mlx5_hws_cnt_svc
    @     0x7f3c255b9ac3 (unknown)
    @     0x7f3c2564b850 (unknown)
    @                0x0 (unknown)
Segmentation fault (core dumped)

Full env printout

 [Paste the results of print_env.sh here, it will be hidden by default]

Other/Misc.

No response

jarmak-nv commented 7 months ago

Hi @e-ago!

Thanks for submitting this issue - our team has been notified and we'll get back to you as soon as we can! In the meantime, feel free to add any relevant information to this issue.

efajardo-nv commented 2 months ago

Hi @e-ago. Can this issue be closed? Not sure if it was resolved by PR #1475.