Code style/check improvement ideas

BAMresearch / probeye

A general framework for setting up parameter estimation problems.

MIT License

5 stars 5 forks source link

Code style/check improvement ideas #40

Closed rozsasarpi closed 2 years ago

rozsasarpi commented 2 years ago

@aklawonn @TTitscher, I wanted to hear your thoughts on these suggestions:

1) Adding type hinting to probeye code, this would have the following advantages (the order reflect their importance to me):

PyCharm and comparably smart IDEs can do type checking and catch errors even without running the code.
Easier to understand what a function or class does compared to reading the types from the docstring.
One does not need to type the type in the documentation (documentation generators can use this information to generate good looking documentation).

2) Follow (and possible enforce) a coding style and apply code checks (related to #25):

black, isort, flake8, mypy, etc. (or their subset).
these could be added to the .pre-commit-config.yaml and run with the CI pipeline as well.

TTitscher commented 2 years ago

Generally, coming from C++, I actually prefer static type systems due to all the reasons you listed. It is just that python is not a statically typed language. And at some points, it is really useful: Why annotate some mathematical functions to take numpy.ndarray when they would equally work with a tf.Tensor? In C++, you would create an abstract vector class and could implement it for various math backends. So the big advantage of python for me was to get rid of this "boilerplate". And it turns out, if we wanted to support this "static duck typing", e.g. via protocols, we are back to the boilerplate. On the other hand, the duck typing cases in probeye may be limited. You decide.
I support that. It could "just" be CI pipeline check, that fails, if the code is formatted incorrectly. But maybe that discussion is better suited for #25

rozsasarpi commented 2 years ago

Why annotate some mathematical functions to take numpy.ndarray when they would equally work with a tf.Tensor

If you annotate them in the docstring, which is the case for now as far as I can tell, I think it is more convenient to use type annotations.

from typing import Union

matrix = Union[np.ndarray, tf.Tensor]

def myfun(x: matrix) -> matrix:
    return 2*x

TTitscher commented 2 years ago

And then torch.Tensor and scipy.sparse.csr_matrix comes along... :) With those protocols, you could actually formulate the real intention of

def myfun(x: multiplicable_by_scalar)
   return 2*x

But as I tried to express above: I generally like (the benefits of) static types and the duck typing cases within probeye may be limited or not existent.

aklawonn commented 2 years ago

For me, one small thing that bothers me about type hinting is that the information for one argument is visually split into two places. Especially in cases when several arguments are given, I feel like my eyes are always jumping back and forth, while with common docstrings, everything is visually in one place. Another annoying thing is that we'd have to put some work into reformulating all the function headers and doctrings. But if the majority prefers type hinting, I won't stand in the way.
I think it it's a good idea to do some code-style checking. Do you have a style you prefer?

TTitscher commented 2 years ago

A quick web search resulted in https://github.com/chadrik/doc484, a tool to automatically convert docstrings to type annotations. Maybe there are more popular ones out there. Or is it even a feature in pycharm?

rozsasarpi commented 2 years ago

information for one argument is visually split into two places

That's a good point. Still I have a mild preference for type hinting, but of course I can live without it as well.

Do you have a style you prefer?

I would go with the most commonly used one(s) to make it familiar to the largest number of people, so they are not hindered in contributing. I think black and flake8 are the most common ones.