oneapi-src / unified-memory-framework

A library for constructing allocators and memory pools. It also contains broadly useful abstractions and utilities for memory management. UMF allows users to manage multiple memory pools characterized by different attributes, allowing certain allocation types to be isolated from others and allocated using different hardware resources as required.
https://oneapi-src.github.io/unified-memory-framework/
Other
41 stars 29 forks source link

Implement backward compatibility support in UMF #908

Open vinser52 opened 1 week ago

vinser52 commented 1 week ago

Environment Information

Please provide a reproduction of the bug:

  1. Install oneAPI 2025.0 base kit (the first release with UMF).
  2. Build some SYCL sample. I used the MonteCarloPi example from the oneAPI-samples repository.
  3. Make sure that you can run it without errors:
    
    svinogra@gkdse-dnp-01:~/oneAPI-samples/DirectProgramming/C++SYCL/MapReduce/MonteCarloPi/build$ make run
    Calculating estimated value of pi...

Running on Intel(R) Data Center GPU Max 1100 The estimated value of pi (N = 10000) is: 3.1652

Computation complete. The processing time was 0.663949 seconds. The simulation plot graph has been written to 'MonteCarloPi.bmp' Built target run

4. Now try to run with the development version of `libumf.so`. I compiled UMF from the source on my development machine. An tried to run the same example with `LD_LIBRARY_PATH` environment to use my development version of `libumf.so`.
```bash
svinogra@gkdse-dnp-01:~/oneAPI-samples/DirectProgramming/C++SYCL/MapReduce/MonteCarloPi/build$ LD_LIBRARY_PATH=~/unified-memory-framework/build/lib/:$LD_LIBRARY_PATH make run
Calculating estimated value of pi...
montecarlopi: /user/svinogra/unified-memory-framework/src/memory_provider.c:190: umfMemoryProviderCreate: Assertion `ops->version == UMF_VERSION_CURRENT' failed.
Aborted (core dumped)

How often bug is revealed:

always

Actual behavior:

Crashing happens because the UMF code has multiple places where we compare the current version with the version the client was built from.

Expected behavior:

The newer version of UMF should be backward compatible with the old one. E.g. oneAPI base kit 2025.0 should work with newer version of UMF.

Details

Today all our checks require an exact match, but we need to require that the current version of UMF is equal or greater that the version used by the client.

Additional information about Priority and Help Requested:

Requested priority: Showstopper

bratpiorka commented 1 week ago

But the "ops->version == UMF_VERSION_CURRENT" assert is fine. We want to bump "UMF_VERSION_CURRENT" on each minor oneAPI release. We plan to have a stable API (UMF 1.0) at 2026 oneAPI release. Only since then we had to keep the backward compatibility.

@pbalcer any comments on that?

vinser52 commented 1 week ago

I understand that we still have no 1.0 version and can ignore backward compatibility, but it is inconvenient for development purposes. Why not support it now? It would simplify our development and we will need to implement this logic anyway, why not do it now if it is possible?

vinser52 commented 1 week ago

Another thought:

Since oneAPI 2025.0 compiler now uses UMF we can extend our CI to test the compiler with the mainline version of UMF. Without backward compatibility it is not possible.

lplewa commented 6 days ago

First of all we need backward compatibility tests. Without such tests, we must increase API version on each release, as we cannot guarantee anything. With those tests we can bump versions in more conscious manner.(as test will fail)

Anyway - currently we have 0.X version. And based on last few weeks, we are not ready, to commit to stable API. I do not say that we can indicate api changes in more correct way, but as doing a lot of API changes ATM, probably this exercise is not huge priority, as we will bump the version on each release anyway.

But adding compatibility tests is something which we should have in our backlog.

vinser52 commented 6 days ago

First of all we need backward compatibility tests. Without such tests, we must increase API version on each release, as we cannot guarantee anything. With those tests we can bump versions in more conscious manner.(as test will fail)

Anyway - currently we have 0.X version. And based on last few weeks, we are not ready, to commit to stable API. I do not say that we can indicate api changes in more correct way, but as doing a lot of API changes ATM, probably this exercise is not huge priority, as we will bump the version on each release anyway.

But adding compatibility tests is something which we should have in our backlog.

I am not saying that we should commit to backward compatibility now and release the first stable version 1.0. The issue I see right now is simple: I built UMF from the source and tried to test it with the oneAPI 2025.0 compiler. And it does not work just because our code has a strict requirement that versions should exactly match. Even 0.10 version that we have in our main branch is not compatible with 0.9 version in 2025.0 release, just because of this strict check.

Why I think it is important, we are collaborating with MPI and oneCCL teams and it will be hard to make experiments. So, we will need to implement logic to check versions anyway. Why not start doing it soon? Of course, until 1.0 we do not guarantee backward compatibility, but if 0.10 is backward compatible with 0.9 why not to allow that?