pyccel / pyccel

Python extension language using accelerators
MIT License

Type specification #1487

Open EmilyBourne opened 1 year ago

EmilyBourne commented 1 year ago

The current implementation of the type specification is a bit messy.

There are 3 possible methods.

  1. Headers in comments:

    #$ header function f2(int, int) results(int)

  2. Types decorator:

    @types(int, int, results = int)
    def f2(x, y):

  3. Type hinting:

    def f2(x : int, y : int) -> int:

The first method is the oldest, but it is the least familiar to the current devs. Most of its code hasn't been touched since it was written in 2018, and a lot of it contains "TODO" comments.

The methods are also handled in different ways. Method 1 is handled in the SemanticParser: https://github.com/pyccel/pyccel/blob/2cce6b0270aa312b0764ff7d7909a17cde992b53/pyccel/parser/semantic.py#L3197-L3199

Method 2 is converted in the SyntacticParser from a decorator to a header: https://github.com/pyccel/pyccel/blob/2cce6b0270aa312b0764ff7d7909a17cde992b53/pyccel/parser/syntactic.py#L749-L756

Method 3 is handled in the SyntacticParser by creating a @types decorator, which is then converted to a header: https://github.com/pyccel/pyccel/blob/2cce6b0270aa312b0764ff7d7909a17cde992b53/pyccel/parser/syntactic.py#L644-L653
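
The chain described above (type hints, then a @types decorator, then a header) can be illustrated with a small stand-alone sketch. This is not Pyccel's code: `header_from_hints` is a hypothetical helper that uses only the standard `ast` module to build the header text of method 1 from the annotations of method 3, and it assumes every argument is annotated.

```python
import ast

def header_from_hints(source: str) -> str:
    """Hypothetical helper: build a '#$ header' line from a type-hinted function."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a function definition")
    # ast.unparse (Python 3.9+) turns an annotation node back into source text.
    # Assumption: every argument carries an annotation.
    arg_types = [ast.unparse(arg.annotation) for arg in func.args.args]
    header = f"#$ header function {func.name}({', '.join(arg_types)})"
    if func.returns is not None:
        header += f" results({ast.unparse(func.returns)})"
    return header

hinted = "def f2(x : int, y : int) -> int:\n    return x + y\n"
print(header_from_hints(hinted))
```

Run on the type-hinted `f2` from the list above, this prints the same header line as the one shown for method 1.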

While it is good that all three methods rely on the same underlying code, it is not ideal that the underlying method is the least well known. Furthermore, this seems to be a missed opportunity to simplify the code, as type hints are already associated with the relevant symbols. In contrast, headers must be broken up into parts in order to associate them with the arguments.
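
The difference can be sketched in plain Python, independently of Pyccel's internals. In the sketch below, `split_header` is a hypothetical helper with a deliberately simplified regular expression (it would not cope with types that contain commas). It shows the extra step of matching header types to arguments by position, whereas annotations are already keyed by argument name.

```python
import inspect
import re

def f2(x: int, y: int) -> int:
    return x + y

# Type hints: each type is already attached to its symbol.
print(f2.__annotations__)

def split_header(header: str, func) -> dict:
    """Hypothetical helper: pair the types in a header with func's arguments by position."""
    pattern = r"#\$ header function (\w+)\((.*?)\)(?: results\((.*?)\))?"
    match = re.fullmatch(pattern, header.strip())
    if match is None or match.group(1) != func.__name__:
        raise ValueError("header does not describe this function")
    # Simplification: splitting on commas breaks for types such as 'float[:,:]'.
    arg_types = [t.strip() for t in match.group(2).split(",") if t.strip()]
    arg_names = list(inspect.signature(func).parameters)
    if len(arg_types) != len(arg_names):
        raise ValueError("header and function disagree on the number of arguments")
    typed = dict(zip(arg_names, arg_types))
    if match.group(3) is not None:
        typed["return"] = match.group(3)
    return typed

print(split_header("#$ header function f2(int, int) results(int)", f2))
```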

This complexity makes it difficult to handle issues such as #1336.

We would like to move towards having one simple way of specifying types. Type hinting is preferred as it is already part of standard Python.

Phase 1 Prepare for breaking changes

Phase 2 Ensure changes can be implemented

Phase 3 Breaking changes. This should begin once the deprecation warning has been present for at least one version.

Phase 4 Use the simpler code to add additional support
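
As an illustration of the deprecation step that Phase 3 depends on, here is a minimal sketch of a decorator that warns and otherwise leaves the function untouched. The `types` function below is a hypothetical stand-in written for this example; how Pyccel actually emits its deprecation warning is not described in this issue.

```python
import warnings

def types(*arg_types, results=None):
    """Hypothetical stand-in for a deprecated @types decorator."""
    def decorator(func):
        warnings.warn(
            "The @types decorator is deprecated; use type hints instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return func  # the function itself is returned unchanged
    return decorator

# Record warnings so the example shows them regardless of the active filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    @types(int, int, results=int)
    def f2(x, y):
        return x + y

print(f2(2, 3))
print([str(w.message) for w in caught])
```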

adam-urbanczyk commented 1 year ago

Two additional requests:

  1. Please support non-string syntax for arrays, float64[:], in addition to 'float64[:]'.
  2. Please support type aliases. Currently the compiler treats T='float64[:]' as a static string. If parsing is ambiguous, Python has a standard type to indicate the intent.

EmilyBourne commented 1 year ago

> Two additional requests:
>
>   1. Please support non-string syntax for arrays, float64[:], in addition to 'float64[:]'.
>   2. Please support type aliases. Currently the compiler treats T='float64[:]' as a static string. If parsing is ambiguous, Python has a standard type to indicate the intent.

Hi, thanks for your feedback. It is always useful to hear what users think is useful. Do you mind if I ask about why you want this?

For 1. We are generally aiming to allow Pyccel to translate Python files which would run without Pyccel being installed (this is one of the reasons for deprecating the @types decorator). Presumably non-string syntax would involve importing types from Pyccel? What do you feel is the advantage of this over strings? The only advantage I see is as a workaround for 2. Neither method would work with mypy as far as I am aware.
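
A minimal sketch of the point about running without Pyccel installed: Python stores a string annotation without evaluating it, so the function below works in a plain interpreter. As soon as the unquoted form is evaluated, `float64` has to resolve to a real object that accepts `[:]`, which means importing it from somewhere. The function name `scale` is illustrative only.

```python
def scale(x: 'float64[:]', a: float) -> None:
    """Scale x in place; the string annotation is stored, not evaluated."""
    for i in range(len(x)):
        x[i] *= a

values = [1.0, 2.0]
scale(values, 3.0)
print(values)
print(scale.__annotations__['x'])

# Evaluating the unquoted annotation requires a float64 object to exist.
try:
    eval("float64[:]", {})
except NameError as err:
    print("NameError:", err)
```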

For 2. Do you have a use case in mind for this? Is the aim to be able to change the types of multiple functions at the same time?

adam-urbanczyk commented 1 year ago

> Hi, thanks for your feedback. It is always useful to hear what users think is useful. Do you mind if I ask about why you want this?
>
> For 1. We are generally aiming to allow Pyccel to translate Python files which would run without Pyccel being installed (this is one of the reasons for deprecating the @types decorator). Presumably non-string syntax would involve importing types from Pyccel? What do you feel is the advantage of this over strings? The only advantage I see is as a workaround for 2. Neither method would work with mypy as far as I am aware.

The idea was to use a typing syntax similar to the one used in projects like numba or lpython. The advantage of this is that it is more user friendly for users switching from those projects, and it makes it possible to quickly switch the backend. Alternatively, you could consider supporting the native numpy type (np.ndarray[Tuple[:,:],float]), but it seems that the exact syntax is not yet defined: see https://github.com/numpy/numpy/issues/22506 and https://github.com/numpy/numpy/issues/16544

> For 2. Do you have a use case in mind for this? Is the aim to be able to change the types of multiple functions at the same time?

Indeed: quickly switching the types and making the code more readable.

All in all, I think (2) is much more important than (1). But if you are going to support non-string annotations, it would be nice to use a convention that is already in use.