A consistent way of constructing arrays and passing arguments is valuable. Users are not confused by many ways of calling the same function, nor forced to scan long lists of overloads to pick the right one. Developers expose fewer interfaces per function or struct, which potentially reduces the maintenance cost.
I am therefore raising this feature request to discuss the best practices and conventions we should follow. The core principle is "to provide only one interface for each function".
Please let me know your opinions, and I will update the text to reflect our agreements.
1) Overloads should not change the nature of the main argument. Here is an example:
# Not good practice
array(object: List[Int], shape: List[Int])
array(shape: List[Int], fill: Boolean)
array(*shape: Int, eye: Boolean)
This is not good practice. The purpose of the array function is "to construct an array from an object", so its main argument is an object. However, the second overload takes a shape instead, which changes the nature of the main argument.
As a result, the first argument can be either a shape or an object, which confuses users and invites mistakes (will the call return the array [2, 2] or a 2x2 array?).
A better way is to give the functions different names, so that each function deals with only one task. In other words, do not abuse function overloading.
# Good practice
full(shape: List[Int], fill: Boolean)
array(object: List[Int], shape: List[Int])
eye(*shape: Int)
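To make the split concrete, here is a minimal sketch in Python rather than Mojo. Nested lists stand in for NDArray, the sketch is limited to 2-D, and the function bodies are assumptions for illustration only, not NuMojo code. The point is that the first argument of each function has exactly one meaning.

```python
from typing import List, Optional


def array(obj: List[int], shape: Optional[List[int]] = None) -> list:
    """Build an array from data. The first argument is always the data."""
    if shape is None:
        return list(obj)
    rows, cols = shape  # sketch: 2-D shapes only
    if rows * cols != len(obj):
        raise ValueError("shape does not match the number of items")
    return [list(obj[r * cols:(r + 1) * cols]) for r in range(rows)]


def full(shape: List[int], fill: int) -> list:
    """Build an array of the given shape. The first argument is always a shape."""
    rows, cols = shape
    return [[fill] * cols for _ in range(rows)]


def eye(*shape: int) -> list:
    """Build an identity-like array. The arguments are always dimensions."""
    rows, cols = shape
    return [[1 if r == c else 0 for c in range(cols)] for r in range(rows)]
```

With this layout, `array([2, 2])` can only mean "the data [2, 2]", while a 2x2 array must be requested through `full([2, 2], fill)` or `eye(2, 2)`.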
2) Overloads can be used when the types have the same nature. Here is an example:
# Good practice
zeros(shape: List[Int])
zeros(shape: StaticTuple[Int])
zeros(shape: NDArrayShape[Int])
Though the types differ, the argument is the same. Moreover, the three types have the same nature, i.e., each is a shape-like container, so this does not cause confusion. A disadvantage is that we cannot enumerate every possible type, but these three are sufficient.
Note: these three overloads can be combined once Mojo has better support for traits.
# Best practice
zeros[T: ShapeLike](shape: T)
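The idea of one entry point that accepts any shape-like value can be sketched as follows. This is Python, not Mojo; `NDArrayShape` here is a hypothetical stand-in class, and `as_shape` is a helper name invented for this sketch. Every accepted type is normalized to one internal representation before use.

```python
from typing import Sequence, Tuple, Union


class NDArrayShape:
    """Hypothetical stand-in for a dedicated shape struct."""

    def __init__(self, *dims: int) -> None:
        self.dims = tuple(dims)


# Anything that describes a shape: a list, a tuple, or the shape struct.
ShapeLike = Union[Sequence[int], NDArrayShape]


def as_shape(shape: ShapeLike) -> Tuple[int, ...]:
    """Normalize every shape-like input to a tuple of non-negative ints."""
    raw = shape.dims if isinstance(shape, NDArrayShape) else shape
    dims = tuple(int(d) for d in raw)
    if any(d < 0 for d in dims):
        raise ValueError("dimensions must be non-negative")
    return dims


def zeros(shape: ShapeLike):
    """Single entry point: nested lists of zeros for any shape-like input."""
    dims = as_shape(shape)

    def build(level: int):
        if level == len(dims):
            return 0
        return [build(level + 1) for _ in range(dims[level])]

    return build(0)
```

All three call styles, `zeros([2, 3])`, `zeros((2, 3))`, and `zeros(NDArrayShape(2, 3))`, go through the same code path and give the same result.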
3) Use the same wrapper for the same argument. Here is an example:
# Not good practice
array(obj: PyObject, shape: List[Int])
NDArray(*shape: Int)
zeros(shape: StaticTuple[Int])
This is not good practice. The type of the shape argument is inconsistent within the NuMojo package, which significantly increases the learning cost for new users. We should try to use the same wrapper in all cases:
# Good practice
array(obj: PyObject, shape: Shape[Int])
NDArray(shape: Shape[Int])
zeros(shape: Shape[Int])
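One benefit of a single wrapper is that validation and derived quantities live in one place. The sketch below is Python, not Mojo; the `Shape` dataclass, its `size` property, and the flat-list storage are assumptions made for illustration and are not taken from NuMojo.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Tuple


@dataclass(frozen=True)
class Shape:
    """Hypothetical shape wrapper shared by every function below."""

    dims: Tuple[int, ...]

    def __post_init__(self) -> None:
        # Validation happens once, here, instead of in every function.
        if any(d < 0 for d in self.dims):
            raise ValueError("dimensions must be non-negative")

    @property
    def size(self) -> int:
        """Total number of items described by the shape."""
        return prod(self.dims)


def zeros(shape: Shape) -> List[float]:
    """Flat buffer of zeros with one item per element of the shape."""
    return [0.0] * shape.size


def array(obj: List[float], shape: Shape) -> List[float]:
    """Copy the data after checking that it fits the requested shape."""
    if len(obj) != shape.size:
        raise ValueError("data length does not match shape")
    return list(obj)
```

A user who has learned `Shape((2, 3))` once can pass it to every function without checking which container each one expects.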
Regarding item (3), we can already think about wrappers for some common arguments:
Data points as input: wrap them in a List (Iterable trait), e.g., nm.array(List[Int](1,2,3,4,5)).
Index of a specific item: always wrap it in Idx, e.g., A[Idx(1,2)]. We already have this.
Index/slice of a sub-array: express it as Slice or Int, e.g., A[1,2:3].
Indices selecting a series of sub-arrays: express them as List[DType.index].
Shape: wrap it in a ShapeLike type, e.g., List[Int], VariadicList[Int], NDArrayShape (Shape), or even Idx.
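The indexing conventions in this list can be sketched as a single `__getitem__` that dispatches on the wrapper type. This is a Python illustration, not NuMojo code: `Idx` is a hypothetical stand-in, `Matrix` is a toy 2-D container invented for the sketch, and integer indices keep their dimension here purely to keep the example short.

```python
from typing import List


class Idx:
    """Hypothetical stand-in for Idx: the coordinates of one item."""

    def __init__(self, *coords: int) -> None:
        self.coords = coords


class Matrix:
    """Toy 2-D container used only to illustrate the indexing convention."""

    def __init__(self, rows: List[List[int]]) -> None:
        self.rows = rows

    def __getitem__(self, key):
        if isinstance(key, Idx):
            # Idx always means "one specific item" and returns a scalar.
            r, c = key.coords
            return self.rows[r][c]
        if isinstance(key, list):
            # A list of indices selects a series of sub-arrays (rows here).
            return [self.rows[i] for i in key]
        if isinstance(key, tuple):
            # Ints and slices select a sub-array.
            r, c = key
            picked = self.rows[r] if isinstance(r, slice) else [self.rows[r]]
            return [row[c] if isinstance(c, slice) else [row[c]] for row in picked]
        raise TypeError("unsupported index type")
```

For example, with `A = Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])`, `A[Idx(1, 2)]` yields the single item, `A[1, 2:3]` yields a sub-array, and `A[[0, 2]]` yields two rows, so the wrapper type alone tells the reader what kind of result to expect.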
Related to #90 #110.