modularml / mojo

The Mojo Programming Language
https://docs.modular.com/mojo/manual/

[stdlib] Use `normalize_index` throughout the standard library #2948

Open ConnorGray opened 4 months ago

ConnorGray commented 4 months ago

#2677 added a new normalize_index[..](..) helper function to the standard library, used to perform bounds checking on indexes into collection types. It replaces boilerplate bounds-checking and negative-index normalization logic like:

        debug_assert(-size <= int(index) < size, "Index must be within bounds.")
        var normalized_idx = int(index)
        if normalized_idx < 0:
            normalized_idx += size

with a function call:

        var normalized_index = normalize_index["InlineArray"](index, self[])

That PR updated only InlineArray to use the new helper function.
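
The logic that the helper consolidates can be sketched in Python, which the thread later uses as the reference point for indexing semantics. This is an illustration only, not the Mojo standard library code: the name `normalize_index_sketch` and its `(index, size)` signature are assumptions, and it raises `IndexError` where the Mojo snippet above uses `debug_assert`.

```python
import operator


def normalize_index_sketch(index, size):
    """Map `index` into [0, size), accepting negative indices.

    Hypothetical Python analogue of the boilerplate above; not the Mojo API.
    """
    # operator.index accepts only integer-like objects (those with __index__).
    i = operator.index(index)
    # Same bounds condition as the debug_assert in the Mojo snippet.
    if not (-size <= i < size):
        raise IndexError("Index must be within bounds.")
    # Negative indices count back from the end of the collection.
    if i < 0:
        i += size
    return i


items = ["a", "b", "c", "d"]
last = normalize_index_sketch(-1, len(items))
third = normalize_index_sketch(2, len(items))
print(items[last], items[third])
```

Keeping the check and the normalization in one function means each collection's `__getitem__` only needs a single call, which is the motivation stated above.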

We should update other collections in the standard library to use this function as well, including:

Please limit the number of call-sites updated to not more than 4-5 per PR, to make reviewing easier.

@gabrieldemarmiesse @laszlokindrat FYI

gabrieldemarmiesse commented 4 months ago

For anyone wanting to do this ticket, you can look at this diff for a good example of what to do: https://github.com/modularml/mojo/pull/2677/files (file stdlib/src/utils/static_tuple.mojo) and you can also use the Indexer trait which is the right trait to work with when using indexes.

Intable or Int are not recommended when indexing something because: 1) Int is very restrictive, since we might want to index with types like UInt32 or Int64; 2) Intable is too broad and allows floats to be used for indexing, which is incorrect since we can't do my_list[4.5].
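
The "too broad" point has a direct Python analogue, sketched here under the assumption that Intable corresponds roughly to "anything `int()` accepts" and Indexer to "anything `operator.index()` accepts". The helper `accepts_as_index` is a made-up name for illustration.

```python
import operator


def accepts_as_index(value):
    """Return True if `value` is usable as an index per operator.index."""
    try:
        operator.index(value)
        return True
    except TypeError:
        return False


# int() silently truncates a float, so an int()-based check lets 4.5 through.
truncated = int(4.5)

# operator.index() rejects the float but accepts a real integer.
float_ok = accepts_as_index(4.5)
int_ok = accepts_as_index(7)

print(truncated, float_ok, int_ok)
```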

StandinKP commented 4 months ago

I can take this up

laszlokindrat commented 4 months ago

> For anyone wanting to do this ticket, you can look at this diff for a good example of what to do: https://github.com/modularml/mojo/pull/2677/files (file stdlib/src/utils/static_tuple.mojo) and you can also use the Indexer trait which is the right trait to work with when using indexes.
>
> Intable or Int are not recommended when indexing something because

Actually, Int is implicitly convertible from Indexer, and it is currently the recommended way to define __getitem__ and other methods that take indices.

gabrieldemarmiesse commented 4 months ago

@laszlokindrat It may cause some confusion for readers of the code. One might assume that when implicitly converting to an Int, the method __int__ is used (because they have the same name, and it's used more often in Python than __index__).

Using Indexer explicitly might also make it easier to explore our options when the compiler drops the auto-conversion of types when a default constructor is available.

@JoeLoser might have a different take on this one since he merged https://github.com/modularml/mojo/pull/2800 recently, which made use of the Indexer trait.

Anyway I don't want to get into a big debate here. If we don't agree on the subject, we can always talk about this again after the compiler drops the auto-implicit-conversion nonsense.

laszlokindrat commented 4 months ago

> @laszlokindrat It may cause some confusion for readers of the code. One might assume that when implicitly converting to an Int, the method __int__ is used (because they have the same name, and it's used more often in Python than __index__).

We document the implicit conversion behavior in Indexer. Also, in Python, the distinction between __int__ and __index__ is clear and the use of __index__ for slice creation is well known (which is the assumed semantics for __getitem__). So I don't see much room for confusion here; you can only pass types implicitly as Int that are either directly convertible using an Int constructor (a few anointed types), or that implement Indexer (which implies that the type implementer understands the nuances). I do agree that implicit conversion can be confusing, but I don't think the Intable vs. Indexer distinction is the main source of that confusion.
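
For readers less familiar with the Python behavior referred to here, a small sketch: a type that defines only `__int__` cannot be used as a list index or slice bound, while one that defines `__index__` can. The classes `OnlyInt` and `AsIndex` are hypothetical and exist only for this illustration; they say nothing about how Mojo's Indexer is implemented.

```python
class OnlyInt:
    """Convertible with int(), but not usable as an index."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value


class AsIndex:
    """Usable wherever Python expects an integer index."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


data = [10, 20, 30, 40]

# __index__ is consulted for both element access and slice bounds.
element = data[AsIndex(1)]
window = data[AsIndex(1):AsIndex(3)]

# __int__ alone is not enough: list indexing raises TypeError.
try:
    data[OnlyInt(1)]
    only_int_indexes = True
except TypeError:
    only_int_indexes = False

print(element, window, only_int_indexes)
```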

gabrieldemarmiesse commented 4 months ago

Okay then, I won't push the matter further. So far I can only say that it confused me and I had to double-check for bugs, even though I have been in this codebase for a while. So I wouldn't be surprised if the same thing happened to new contributors. But maybe it's just me.

We'll see with time if it really becomes an issue or not. Let's go with Int instead of Indexer for now.

StandinKP commented 4 months ago

Why did we remove the mojo-stdlib label?

JoeLoser commented 3 months ago

It got auto-removed when the issue was moved internally into our Linear system. I'll add it back. FYI @ewa.

ematejska commented 3 months ago

I think this will not happen in the future anymore but flag me if you see anything strange.

vguerra commented 1 month ago

Hello @ConnorGray, I submitted a PR to address the migration within List methods (#3400). Also, as of today, the __setitem__ and __get_ref methods no longer exist, so I think you can remove them from the to-do list.

vguerra commented 1 month ago

> Hello @ConnorGray, I submitted a PR to address the migration within List methods (#3400). Also, as of today, the __setitem__ and __get_ref methods no longer exist, so I think you can remove them from the to-do list.

There were already PRs addressing this (https://github.com/modularml/mojo/pull/3400#issuecomment-2306919174)