bytedeco / javacpp-presets

The missing Java distribution of native C++ libraries

[PyTorch] Update to 2.4.0, add distributed #1510

Closed HGuillemet closed 2 months ago

HGuillemet commented 5 months ago

Included in this PR:

Remains to be done:

HGuillemet commented 4 months ago

@halfeng @sbrunk This PR can be tested using the artifact fr.apteryx:pytorch-platform-gpu:2.3.1-1.5.11-SNAPSHOT or fr.apteryx:pytorch-platform:2.3.1-1.5.11-SNAPSHOT from the Maven repository https://s01.oss.sonatype.org/content/repositories/snapshots.
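For anyone testing with Maven, the coordinates above might be wired into a `pom.xml` roughly as follows. This is a minimal sketch, not a verified build file: the repository `id` is an arbitrary placeholder, and only the CPU artifact is shown (swap the `artifactId` for `pytorch-platform-gpu` to try the GPU build).

```xml
<!-- Sketch only: the repository id below is a hypothetical placeholder. -->
<repositories>
  <repository>
    <id>sonatype-s01-snapshots</id>
    <url>https://s01.oss.sonatype.org/content/repositories/snapshots</url>
    <releases><enabled>false</enabled></releases>
    <snapshots><enabled>true</enabled></snapshots>
  </repository>
</repositories>

<dependencies>
  <!-- Snapshot artifact named in the comment above -->
  <dependency>
    <groupId>fr.apteryx</groupId>
    <artifactId>pytorch-platform</artifactId>
    <version>2.3.1-1.5.11-SNAPSHOT</version>
  </dependency>
</dependencies>
```

Snapshots have to be enabled explicitly for the repository, since Maven does not resolve `-SNAPSHOT` versions from a repository unless it is declared with snapshots enabled.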

saudet commented 4 months ago

@HGuillemet Doesn't your account already have permission to publish to org.bytedeco?

HGuillemet commented 4 months ago

I don't have the secrets. Anyway PR builds don't publish.

saudet commented 4 months ago

You should be able to publish to org.bytedeco using your own account. I'm pretty sure I gave you access a long time ago.

saudet commented 4 months ago

Or maybe not. What's the name of your OSSRH account?

HGuillemet commented 4 months ago

HGuillemet. But don't bother.

saudet commented 3 months ago

@HGuillemet Let's just merge this and fix anything broken with the upgrade to PyTorch 2.4?

HGuillemet commented 3 months ago

I'm about to add a commit for the 2.4.0 upgrade.

saudet commented 3 months ago

Ok, we can do that too

HGuillemet commented 3 months ago

Before merging, you must decide what to do with bytedeco/javacpp#766

HGuillemet commented 3 months ago

Also to decide: whether adding the cuda dependency is a good thing or not.

saudet commented 3 months ago

An optional dependency on the presets for CUDA that only torch_cuda needs, sure, that's fine.

HGuillemet commented 3 months ago

> An optional dependency on the presets for CUDA that only torch_cuda needs, sure, that's fine.

Consider my comment in the top post: the dependency will be mandatory when using CUDA. At present, I guess most users rely on their system CUDA; they will now have to use the CUDA presets.

saudet commented 3 months ago

Sure, that's fine, it's not a huge dependency

HGuillemet commented 3 months ago

@saudet please restart the jobs that failed

saudet commented 3 months ago

Please upgrade the builds to CUDA 12.6, see commit https://github.com/bytedeco/javacpp-presets/commit/6853dc41a91cb46bc6c19484cae8f2a333cd8d7e

HGuillemet commented 3 months ago

PyTorch uses CUPTI for profiling, but it's not provided by the CUDA presets. Could we add it?

saudet commented 3 months ago

Sure, feel free to give it a try

HGuillemet commented 3 months ago

Ready on my side. We can look at the MPI or UCC distributed backends later if needed. Artifacts for this PR are available on the usual snapshot repository as org.bytedeco:pytorch-platform-gpu:2.4.0-1.5.11-SNAPSHOT or org.bytedeco:pytorch-platform:2.4.0-1.5.11-SNAPSHOT.