rapidsai / legate-boost

GBM implementation on Legate
https://rapidsai.github.io/legate-boost/
Apache License 2.0

legate runtime maps distributed data to single partition #69

Open mfoerste4 opened 1 year ago

mfoerste4 commented 1 year ago

When running legateboost on multiple GPUs, the legate runtime maps follow-up cunumeric tasks to individual instances even though the data is distributed. The issue is described and discussed in more detail here.

This degrades performance, because some portions of the code are executed sequentially instead of in parallel across GPUs.

The current workaround is to run distributed code with the environment variable LEGATE_MIN_GPU_CHUNK=1 or LEGATE_TEST=1 set.
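
For reference, a minimal sketch of applying the workaround from a Python wrapper. Only the two variable names come from this issue; the helper names `workaround_env` and `run_with_workaround`, and the launcher command in the comment, are hypothetical placeholders. It assumes the variables need to be present in the environment of the process that starts the runtime, so they are set on a child process rather than after startup.

```python
import os
import subprocess
import sys


def workaround_env(use_test_mode=False, base_env=None):
    """Return a copy of the environment with one workaround variable set.

    Hypothetical helper: only the variable names are taken from the issue.
    """
    env = dict(os.environ if base_env is None else base_env)
    if use_test_mode:
        env["LEGATE_TEST"] = "1"
    else:
        env["LEGATE_MIN_GPU_CHUNK"] = "1"
    return env


def run_with_workaround(cmd, use_test_mode=False):
    """Run cmd (a list of arguments) in a child process with the workaround applied."""
    return subprocess.run(cmd, env=workaround_env(use_test_mode), check=True)


if __name__ == "__main__":
    # Placeholder usage: python run_workaround.py <launcher> <args...> <script>
    # The launcher invocation and the script name depend on your setup.
    if len(sys.argv) > 1:
        run_with_workaround(sys.argv[1:])
```

Setting the variable inline on the command line that launches the distributed run should have the same effect; the wrapper is only convenient when the run is started from another script.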

RAMitchell commented 8 months ago

Is this potentially the same issue as the cunumeric repeat issue we just found?