rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[FEA] Request for Type Stubs Package for cuML to Enhance Developer Experience #5823

Open Apolsus opened 3 months ago

Apolsus commented 3 months ago

Is your feature request related to a problem? Please describe.

Yes, the problem revolves around the difficulty of installing cuML in various environments due to its heavy dependencies and specific hardware requirements (e.g., a CUDA-enabled GPU). This situation poses a challenge for developers working in environments where installing cuML is not feasible, yet they wish to develop and maintain code that depends on cuML. The absence of a lightweight solution for type checking and code completion hampers productivity and increases the likelihood of runtime errors.

Describe the solution you'd like

I propose the creation of a stub-cuml package, akin to typeshed for standard Python libraries, that contains .pyi type stubs for the cuML API. This package would not contain any functional code, only type annotations and signatures for cuML's public API. This would allow developers to enjoy improved static analysis, IntelliSense, and type checking in IDEs without the need to install cuML's full suite of dependencies.
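To make the request concrete, here is a rough sketch of what one stub file in such a package might contain. The file path, the choice of `KMeans`, and the parameters shown are illustrative assumptions, not verified against the current cuML API; `Any` stands in for whatever array types a real stub would name.

```python
# Hypothetical contents of stub-cuml/cuml/cluster/__init__.pyi.
# Signatures are illustrative assumptions, not copied from cuML.
from typing import Any


class KMeans:
    # Stub bodies are just "..."; no GPU code or CUDA dependency is involved.
    def __init__(self, n_clusters: int = ..., max_iter: int = ..., **kwargs: Any) -> None: ...
    def fit(self, X: Any, sample_weight: Any = ...) -> "KMeans": ...
    def predict(self, X: Any) -> Any: ...
```

A type checker or IDE that finds a file like this can offer completion and flag a call such as `KMeans(n_clusters="8")` without cuML being installed.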

Describe alternatives you've considered

Manually creating .pyi files for cuML interfaces used in projects, which is time-consuming and error-prone.

Utilizing dynamic typing without the benefits of static analysis, which increases the risk of type-related errors in production.

Attempting to use cuML in limited environments, leading to complex workarounds to manage installations and dependencies, which is not always possible or efficient.

Additional context

The presence of a stub-cuml package would significantly lower the barrier to entry for developers wishing to write cuML-dependent code in environments where installing the full library is impractical. This approach follows the precedent set by other complex libraries that have made their APIs more accessible and developer-friendly through the provision of type stubs. Not only would this enhance the development experience, but it would also foster a broader adoption of cuML by making it easier to integrate into a wider range of projects and applications.

The development of such a package could potentially be community-driven, with initial stubs generated using tools like MyPy's stubgen, and then iteratively refined by contributors. This could lead to a collaborative effort that benefits the entire cuML user base and beyond.
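As a rough illustration of what a first automated pass produces, the sketch below renders stub-style signatures from a live class using only the standard library. It is not stubgen and not a proposed implementation: `sketch_stub` and `DemoEstimator` are hypothetical names used for the example, and compiled Cython members may not be introspectable this way, which is one reason generated stubs would need manual refinement.

```python
import inspect


def sketch_stub(cls: type) -> str:
    """Render a crude .pyi-style stub for the public methods of `cls`."""
    lines = [f"class {cls.__name__}:"]
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        # Keep __init__ and public methods; skip other underscore names.
        if name.startswith("_") and name != "__init__":
            continue
        lines.append(f"    def {name}{inspect.signature(member)}: ...")
    if len(lines) == 1:
        lines.append("    ...")  # class with no introspectable methods
    return "\n".join(lines)


class DemoEstimator:
    """Stand-in class for the demo; it is not part of cuML."""

    def __init__(self, n_clusters: int = 8) -> None:
        self.n_clusters = n_clusters

    def fit(self, X: list) -> "DemoEstimator":
        return self

    def _helper(self) -> None:
        pass


stub_text = sketch_stub(DemoEstimator)
print(stub_text)
```

The printed text contains only signatures, which contributors could then tighten by hand (for example, replacing loose types with precise array types).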

Thank you for considering this feature request. I believe it would make a significant positive impact on the developer community and the accessibility of cuML's powerful machine learning capabilities.

Apolsus commented 3 months ago

In fact, in many enterprises that use cuML, the GPUs sit on a platform with strict data controls, so there is no way to develop directly on them. Jobs are run by submitting configuration files, and the platform automatically pulls the code from the repository.

dantegd commented 2 months ago

I love the idea of a stub-cuml package, hadn't really thought about it. It's something we will definitely consider as part of upcoming improvements and fixes.

vyasr commented 2 months ago

A similar discussion was started in the cudf repo: https://github.com/rapidsai/cudf/issues/15190. It would be great to standardize some approach to type stubs for Cython files across RAPIDS.