conda-forge / cupy-feedstock

A conda-smithy repository for cupy.
BSD 3-Clause "New" or "Revised" License
5 stars 23 forks source link

cuGraph dependency #112

Closed jakirkham closed 3 years ago

jakirkham commented 3 years ago

There is some discussion going on about using RAPIDS cuGraph for some of the graph operations that will be add to CuPy (mirroring APIs of this in SciPy) ( https://github.com/cupy/cupy/issues/4219 ). One of the questions that comes out of this is how we handle the dependency for the cupy package. This depends a bit on how CuPy uses cuGraph (as such this is not totally flushed out). In any event am raising here for awareness/early discussion about how to handle this issue. Here are some thoughts on how this dependency gets managed:

  1. Reuse Conda package for cuGraph from RAPIDS
  2. Package cuGraph in conda-forge
  3. Use a vendored cuGraph from CuPy (maybe CuPy uses a git submodule? maybe it gets cloned and built in the recipe?)
  4. Make cuGraph an optional runtime dependency (using Python API, dlopen, or similar; this depends on how cuGraph is used by CuPy)
  5. ?

These are very rough ideas and likely subject to change, but hopefully this starts the discussion. Should add the typical response to adding packages from other channels in conda-forge is no this is not an option. IOW 1 is not likely feasible, but have listed it for completeness anyways. 2 likely has challenges as well as there is an open question of who would maintain this and may lead to duplicated effort (RAPIDS will likely still build this in its own channel anyways). 3 would increase the complexity of the build here (and possibly for CuPy wheels depending on how it is done). 4 may increase the complexity of code added to CuPy. If others have different suggestions on how to handle these issue, that would be welcome

cc @rlratzel @pentchev @anaruse @kmaehashi @leofang

leofang commented 3 years ago

2 likely has challenges as well as there is an open question of who would maintain this and may lead to duplicated effort (RAPIDS will likely still build this in its own channel anyways).

Can't we just copy & paste RAPIDS's recipe to CF, and make sure they're always in sync? I fail to see why this is complicated.

Another question: How large is libcugraph in size (for which I obviously refer to the shared library, not code)?

jakirkham commented 3 years ago

Yeah that is option 2

leofang commented 3 years ago

I know, I mean I don't see why it is difficult. Is there something special in the rapids channel?

jakirkham commented 3 years ago

It's more work for people to maintain it in 2 places

leofang commented 3 years ago

Another question: How large is libcugraph in size (for which I obviously refer to the shared library, not code)?

Answering myself: It's getting quite large, about 50~90MB, see https://anaconda.org/rapidsai/libcugraph/files. Doesn't seem to be a good idea to bundle it in CuPy...

leofang commented 3 years ago

I think your option 4 is not mutually exclusive to other options. In fact I think it's following the plan we have for other optional dependencies (cudnn, nccl, cutensor, see #109), so perhaps we can do the same for libcugraph?

Of course, I am speaking from the perspective of a recipe maintainer. Ultimately it depends on how it's used in upstream.

kmaehashi commented 3 years ago

How about mixing ideas 1 and 4?

The downsides is that version restrictions cannot be applied to libcugraph. Maybe upstream fix (check libcugraph version is compatible in cupy side) is needed.

pentschev commented 3 years ago

My understanding is that the problem is conda-forge doesn't allow packages from 3rd-party channels, maybe @jakirkham can confirm that. Otherwise, this is what I also had in mind in the beginning, so if it's possible, I'm +1 for that.

leofang commented 3 years ago

Mixing channels would open a can of worms. Over the years we learned in the hard way that it must be avoided, unfortunately. I am not willing to maintain such an error-prone approach.

leofang commented 3 years ago

It's more work for people to maintain it in 2 places

John might have a point on this. It takes some work to port this recipe to Conda-Forge: https://github.com/rapidsai/cugraph/blob/main/conda/recipes/libcugraph/meta.yaml Note that it depends on rmm, libcudf, etc that only live in the rapids channel...

rlratzel commented 3 years ago

Note that it depends on rmm, libcudf, etc that only live in the rapids channel...

FWIW, our current recipe no longer depends on libcudf, and the recipe for the librmm build dependency should not depend on any other RAPIDS packages.

So it sounds like for the approach of copying the recipe to CF, it would actually require copying both libcugraph and librmm recipes (since building libcugraph requires librmm, and librmm is also only currently available from rapidsai)?

leofang commented 3 years ago

So it sounds like for the approach of copying the recipe to CF, it would actually require copying both libcugraph and librmm recipes (since building libcugraph requires librmm, and librmm is also only currently available from rapidsai)?

Sounds like a viable path to me!

leofang commented 3 years ago

Unfortunately in addition to libcugraph we also need rmm to appear on Conda-Forge, see conda-forge/libcugraph-feedstock#10. This would take some time to resolve by someone other than me (maybe @rlratzel...?), but before that I'll release CuPy v9.0 without the cuGraph support. Once the issue is resolved we can rebuild CuPy to enable cuGraph.