Safe-DS / Runner

Execute Safe-DS programs that were compiled to Python.
MIT License

CUDA tensors not released in time #78

Closed SmiteDeluxe closed 5 months ago

SmiteDeluxe commented 6 months ago

Describe the bug

Whenever an image is fetched through a call such as getHistogram(), not all CUDA shared tensors are released. When an image call is then executed on the same column again, the execution crashes with "RuntimeError: Attempted to send CUDA tensor received from another process; this is not currently supported. Consider cloning before sending." Two image calls on different columns work fine, as does running one of them in the first execution and the other in the second execution.

To Reproduce

  1. Import a table.
  2. Get the histogram of any column.
  3. Execute the pipeline.
  4. Execute the pipeline again; this second execution crashes.

Expected behavior

The tensors should probably be released after use, so that the second execution does not crash.

Screenshots (optional)

[Screenshot: first execution], and then the second execution: [Screenshot: WhatsApp Bild 2024-04-05 um 15 10 51_5c372a57]

Additional Context (optional)

No response

lars-reimann commented 5 months ago

@SmiteDeluxe Is this still happening? I could not reproduce this with current versions of the runner, and I've not seen it during the evaluation either.

WinPlay02 commented 5 months ago

> @SmiteDeluxe Is this still happening? I could not reproduce this with current versions of the runner, and I've not seen it during the evaluation either.

This could have been fixed by #87, as IPC with placeholders was changed to serialize values first, then send the shared-memory location to the manager process instead of directly sending the values to the manager process. (Maybe this previously also caused the diagrams to be sometimes empty?)
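
To illustrate the general idea of "serialize first, then send only the shared-memory location", here is a minimal sketch. It is not the runner's actual code from #87: the function names publish, fetch, and release are hypothetical, plain Python values stand in for tensors, and only the standard-library pickle and multiprocessing.shared_memory modules are used.

```python
import pickle
from multiprocessing import shared_memory


def publish(value):
    """Serialize `value` into a fresh shared-memory block.

    Returns the open block (the sender must keep it alive until the
    receiver has read it) and a small descriptor (name, size) that is
    cheap to send to another process, e.g. over a queue.
    """
    payload = pickle.dumps(value)
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    # The block may be rounded up to a page size, so the payload length
    # has to travel with the name.
    return shm, (shm.name, len(payload))


def fetch(descriptor):
    """Attach to the block named in `descriptor` and rebuild the value."""
    name, size = descriptor
    shm = shared_memory.SharedMemory(name=name)
    try:
        # bytes(...) copies the data, so the result does not depend on
        # the shared block staying alive afterwards.
        return pickle.loads(bytes(shm.buf[:size]))
    finally:
        shm.close()


def release(shm):
    """Drop the sender's mapping and free the block."""
    shm.close()
    shm.unlink()


if __name__ == "__main__":
    block, descriptor = publish({"histogram": [1, 2, 3]})
    try:
        # In a real setup, `descriptor` would be sent to the other process.
        print(fetch(descriptor))
    finally:
        release(block)
```

Because the receiver gets an independent copy of the data, nothing it later forwards is an object that was "received from another process", which is the situation the error message above complains about. Whether this matches what #87 does in detail is an assumption.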

SmiteDeluxe commented 5 months ago

> @SmiteDeluxe Is this still happening? I could not reproduce this with current versions of the runner, and I've not seen it during the evaluation either.

I don't think so. I tried to reproduce it by showing the same image twice, but everything seems to be working.