ml-explore / mlx

MLX: An array framework for Apple silicon
https://ml-explore.github.io/mlx/
MIT License

[BUG] Missing types #1240

Closed · marcospgp closed this 4 days ago

marcospgp commented 4 months ago

There are many missing types in the library; numerous functions have Unknown parameter or return types:

[screenshots: IDE hover showing functions with Unknown parameter and return types]

I think it would be a great improvement to have full typing information across the MLX API.

awni commented 3 months ago

I'll mark this as a feature request and leave it open. I think it's a good idea to have useful type hints for the stable Python API.

AbhinavMir commented 3 months ago

I ran a quick Python script to see which files are missing type annotations. Does this look right?

Report: https://pastebin.com/raw/PepZW2jM
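
For illustration, such a check might look roughly like this (a hypothetical sketch using the standard ast module, not the actual script):

import ast
import pathlib

def missing_annotations(root="."):
    # Parse every .py file under `root` and yield functions whose
    # parameters or return type lack annotations.
    for path in pathlib.Path(root).rglob("*.py"):
        tree = ast.parse(path.read_text())
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                args = node.args.args + node.args.kwonlyargs
                missing = [a.arg for a in args
                           if a.annotation is None and a.arg != "self"]
                if missing or node.returns is None:
                    yield path, node.name, missing

for path, name, missing in missing_annotations():
    print(f"{path}:{name} missing: {missing or 'return type'}")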

awni commented 3 months ago

I would only consider files in the subdirectory python/mlx for now. We don't need to add strict type hints to benchmarks and tests.

AbhinavMir commented 3 months ago

That makes sense. I've updated the report accordingly; I was previously crawling all .py files and have now restricted the scan to python/mlx.

https://github.com/AbhinavMir/mlx/blob/typeAnno/type_annotation_report.md

ManishAradwad commented 3 months ago

I have added type hints for a few of the functions listed in the file above by @AbhinavMir.

Please let me know if they look right.

AbhinavMir commented 3 months ago

One of my friends, @catplotlib, wants to get into open source; tagging her here so she can pick up a few of the type hints!

Jonathan-Dobson commented 2 months ago

I second the motion to add type hints. The pip distribution of mlx.core does not appear to ship them.

jbcoe commented 2 months ago

I added https://github.com/ml-explore/mlx/issues/1345 for missing type hints in pip.

Saanidhyavats commented 2 months ago

The issue is partially addressed by #1243. Since that PR has been inactive for more than a month, I will continue the work and open a new PR with the updated changes.

keli-wen commented 2 weeks ago

Description

✨ Hi @awni, I really appreciate the MLX community. As an Apple enthusiast I believe I can contribute to it; I've contributed to open-source projects before, and this looks like a good first issue to pick up.

I consider adding typing annotations to Python code necessary and valuable. After reviewing the issue's history, I noticed it hasn't been fully resolved. I also hope to familiarize myself with the MLX framework's implementation/functionality during this process.

Change Proposal

As a student with an AI background, I'm initially focusing on the implementations under python/mlx/nn/layers. I'll use Dropout as an example (code link):

class Dropout(Module):
    r"""Randomly zero a portion of the elements during training.

    The remaining elements are multiplied with :math:`\frac{1}{1-p}` where
    :math:`p` is the probability of zeroing an element. This is done so the
    expected value of a given element will remain the same.

    Args:
        p (float): The probability to zero an element
    """

    def __init__(self, p: float = 0.5):
        super().__init__()

        if p < 0 or p >= 1:
            raise ValueError(f"The dropout probability {p} is not in [0, 1)")

        self._p_1 = 1 - p

    def _extra_repr(self):
        return f"p={1-self._p_1}"

    def __call__(self, x):
        if self._p_1 == 1 or not self.training:
            return x

        mask = mx.random.bernoulli(self._p_1, x.shape)

        return (mask * x) * (1 / self._p_1)

To add typing annotations, a straightforward approach would be to reference PyTorch's implementation (code link):

from torch import Tensor

class Dropout(_DropoutNd):
    ...

    def forward(self, input: Tensor) -> Tensor:
        return F.dropout(input, self.p, self.training, self.inplace)

Aligning with PyTorch, we might do something like this:

class Dropout(Module):
    r"""Randomly zero a portion of the elements during training.

    The remaining elements are multiplied with :math:`\frac{1}{1-p}` where
    :math:`p` is the probability of zeroing an element. This is done so the
    expected value of a given element will remain the same.

    Args:
        p (float): The probability to zero an element
    """

    def __init__(self, p: float = 0.5):
        super().__init__()

        if p < 0 or p >= 1:
            raise ValueError(f"The dropout probability {p} is not in [0, 1)")

        self._p_1 = 1 - p

-   def _extra_repr(self):
+   def _extra_repr(self) -> str:
        return f"p={1-self._p_1}"

-   def __call__(self, x):
+   def __call__(self, x: mx.array) -> mx.array:
        if self._p_1 == 1 or not self.training:
            return x

        mask = mx.random.bernoulli(self._p_1, x.shape)

        return (mask * x) * (1 / self._p_1)
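
If annotations like these land, a type checker or IDE can resolve the call's return type. A small usage sketch, assuming the diff above is applied:

import mlx.core as mx
import mlx.nn as nn

# With __call__ annotated as above, `y` is inferred as mx.array
# instead of Unknown/Any.
drop = nn.Dropout(p=0.2)
x = mx.ones((4, 4))
y = drop(x)
print(y.shape)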

Issue Encountered

[!NOTE] This conclusion represents only my personal understanding. It may be incorrect.

Although this appears to complete the task, I've noticed that PyTorch wraps its C++ core in a Python-level Tensor class, while MLX exposes mx.array directly from the compiled extension. Consequently, despite these changes, IDEs (like VSCode) effectively see def __call__(self, x: Any) -> Any:.

Regarding this issue, I'd like to ask: how can we improve the readability and type hinting experience for MLX?

awni commented 1 week ago

We have stubs in the Python MLX distribution which should tell IDEs what the types are.

If you look in the path printed by python -c "import os; import mlx.core as mx; print(os.path.dirname(mx.__file__))" you should see the file core/__init__.pyi.
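
Equivalently, a quick check from Python (a small sketch of the same lookup):

import os
import mlx.core as mx

# The stub ships inside the installed mlx package, next to the compiled module.
pkg_dir = os.path.dirname(mx.__file__)
stub = os.path.join(pkg_dir, "core", "__init__.pyi")
print(stub, "exists:", os.path.exists(stub))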

keli-wen commented 1 week ago

> We have stubs in the Python MLX distribution which should tell IDEs what the types are.
>
> If you look in the path printed by python -c "import os; import mlx.core as mx; print(os.path.dirname(mx.__file__))" you should see the file core/__init__.pyi.

Since I build from source, I realized I had skipped the following command:

python setup.py generate_stubs

After running the command, the core/__init__.pyi file was generated. However, I encountered an issue: in random/__init__.pyi, certain types such as Union, array, and scalar were not imported.

import mlx.core

def bernoulli(p: Union[scalar, array] = 0.5, shape: Optional[Sequence[int]] = None, key: Optional[array] = None, stream: Union[None, Stream, Device] = None) -> array:
    ...
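
A header along these lines would make the stub self-contained (a sketch only; the exact aliases are guesses, and the real fix presumably belongs in the stub-generation step):

from typing import Optional, Sequence, Union

from mlx.core import Device, Stream, array

# Assumption: `scalar` mirrors the scalar alias used in mlx.core signatures.
scalar = Union[bool, int, float]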

Although I can navigate to the function definition, the type hints aren't showing up properly. For example, in VSCode:

(function) def bernoulli(
    p: Any = 0.5,
    shape: Any = None,
    key: Any = None,
    stream: Any = None
) -> Any

Does this indicate an issue with nanobind.stubgen, or should I create a new issue to discuss it further?

$ python --version
Python 3.10.14

awni commented 1 week ago

It could be a bug with the type hints.

awni commented 4 days ago

This should be fixed in the latest MLX. Please let us know if you still see a problem in the stubs.