oneapi-src / oneDAL

oneAPI Data Analytics Library (oneDAL)
https://software.intel.com/en-us/oneapi/onedal
Apache License 2.0
617 stars 214 forks

fix: avoid additional array allocation in host to device transfer #2966

Open Alexandr-Solovev opened 2 weeks ago

Alexandr-Solovev commented 2 weeks ago

Description

This PR fixes the memcpy function in oneDAL. The fix avoids an additional array allocation, which saves memory and improves the performance of transferring data from host to device and back, and makes it possible to use memcpy directly.
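For illustration only, here is a minimal sketch of the difference between a staged transfer and a direct one. It is plain C++ and not the code from this PR: `fake_device_buffer`, `copy_staged`, and `copy_direct` are hypothetical names, and the "device" is simulated with host memory so the sketch compiles without a SYCL toolchain. In real code the copies would presumably be `sycl::queue::memcpy` calls on USM pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a device allocation (simulated with host memory).
struct fake_device_buffer {
    std::vector<float> storage;
    std::size_t extra_allocations = 0; // temporaries created during transfers
};

// Staged transfer: the host data is first copied into a temporary array,
// and only then into the destination. This costs one extra allocation
// and one extra pass over the data.
inline void copy_staged(fake_device_buffer& dst, const float* src, std::size_t count) {
    std::vector<float> staging(src, src + count); // the additional allocation
    ++dst.extra_allocations;
    dst.storage.resize(count);
    if (count == 0) return;
    std::memcpy(dst.storage.data(), staging.data(), count * sizeof(float));
}

// Direct transfer: a single memcpy from the source pointer to the destination,
// with no intermediate array.
inline void copy_direct(fake_device_buffer& dst, const float* src, std::size_t count) {
    dst.storage.resize(count);
    if (count == 0) return;
    std::memcpy(dst.storage.data(), src, count * sizeof(float));
}
```

Both functions leave the destination with the same contents; the direct variant simply skips the temporary, which is the saving described above.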


A PR should start as a draft, then move to the ready-for-review state after CI has passed and all applicable checkboxes are closed. This approach ensures that reviewers don't spend extra time asking for regular requirements.

You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way. For example, a PR with a docs update doesn't require checkboxes for performance, while a PR with any change in actual code should have checkboxes and justify how the code change is expected to affect performance (or the justification should be self-evident).

Checklist to comply with before moving PR from draft:

PR completeness and readability

Testing

Performance

Alexandr-Solovev commented 2 weeks ago

/intelci: run

Alexandr-Solovev commented 2 weeks ago

Please don't merge until the additional checks on local machines are done.

Alexandr-Solovev commented 2 weeks ago

Looks like #2375 was just missing the wait_and_throw() :(

Not sure. It looks like every place where the function is called already has an additional wait_and_throw().
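To make the point under discussion concrete: with an asynchronous copy, the destination is only guaranteed to hold the data once the returned event has been waited on. Below is a minimal sketch in plain C++, not oneDAL code. `pending_copy` is a hypothetical stand-in for `sycl::event`, and `memcpy_async` is a hypothetical stand-in for an asynchronous memcpy. The copy is deliberately deferred until `wait_and_throw()` is called so that the effect is deterministic; a real queue may run the copy at any time before the wait.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for sycl::event: holds work that has been submitted
// but is not guaranteed to have run yet.
class pending_copy {
public:
    explicit pending_copy(std::function<void()> work) : work_(std::move(work)) {}

    // Runs the deferred work if it has not run yet. Any exception thrown by
    // the work propagates to the caller, loosely mirroring wait_and_throw().
    void wait_and_throw() {
        if (work_) {
            auto work = std::move(work_);
            work_ = nullptr; // make repeated waits a no-op
            work();
        }
    }

private:
    std::function<void()> work_;
};

// Hypothetical asynchronous memcpy: returns immediately with an event-like
// object; the bytes are only copied once the caller waits on it.
inline pending_copy memcpy_async(void* dst, const void* src, std::size_t bytes) {
    return pending_copy([=] {
        if (bytes == 0) return;
        if (dst == nullptr || src == nullptr)
            throw std::invalid_argument("null pointer passed to memcpy_async");
        std::memcpy(dst, src, bytes);
    });
}
```

If every caller already waits on the returned event, as suggested above, then adding a wait inside the function itself would only change where the synchronization happens, not whether it happens.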

Alexandr-Solovev commented 4 days ago

/intelci: run

Alexandr-Solovev commented 3 days ago

/intelci: run

Alexandr-Solovev commented 3 days ago

/intelci: run

Alexandr-Solovev commented 3 days ago

/intelci: run

ethanglaser commented 2 days ago

Are you still seeing seg faults when experimenting locally, or has this issue been resolved?