felipeangelimvieira / prophetverse

A multiverse of Prophet models for timeseries
https://prophetverse.com
Apache License 2.0
33 stars 3 forks source link

[MNT] Lint and split effects module #72

Closed felipeffm closed 5 months ago

felipeffm commented 5 months ago

This PR increases the modularity of effects.py module, perform linting with pre-commit and update tests

fkiraly commented 5 months ago

Seeing this PR, I moved some relevant discussion from sktime into this repo:

https://github.com/felipeangelimvieira/prophetverse/issues/73

If you want nested set_params and tuning etc, you could inherit from BaseObject instead of abc.

On a side note, we had some odd interaction between 3.9 codestyle and abc which we still have not fully understood, likely related to the decorators: https://github.com/sktime/sktime/pull/6537, in case it might provide additional information.

felipeangelimvieira commented 5 months ago

Thank you @felipeffm !

Please check the function additive_effect, is it not missing the sum of trend?

Regarding the additive effect, it is not scaled by the trend, as observed in the multiplicative version. The outputs of the additive effects are already scaled to the timeseries. The trend is added to the mean in models.py, and then we add the other effects.

On a side note, we had some odd interaction between 3.9 codestyle and abc which we still have not fully understood

Thank you for the information Franz. @felipeffm was inquiring why some classes in Prophetverse use ABC, while others simply raise an Exception. In Prophetverse's sktime module, I followed the pattern used in the BaseForecaster class (that employs raise), and in the Effects module, I adopted ABC. However, for consistency, it would be better to stick with one approach or the other. If sktime decide to say goodbye to ABC I think it's worth following it.

fkiraly commented 5 months ago

random review comment, since I'm noticing the import * - I asked one of the AI systems: "is import * in python good or bad practice or neither?"

Answer:


Using import * in Python is generally considered bad practice for the following reasons:

  1. Namespace Pollution: It can lead to conflicts with existing names in the current namespace.
  2. Readability and Maintainability: It makes it unclear which names are being used from the module, making the code harder to read and maintain.
  3. Namespace Collisions: Identical names from different modules can cause unexpected behavior and bugs.
  4. Debugging Difficulties: It can be hard to trace the origin of specific functions or variables, complicating debugging and refactoring.

When import * Might Be Appropriate

  1. Interactive Sessions: In interactive Python sessions or Jupyter notebooks, where brevity is important.
  2. *Modules Designed for `import **: Some modules are designed withimport *` in mind, such as those defining numerous constants or configurations.

Best Practices

By following these best practices, you ensure that your code remains clean, readable, and maintainable.

fkiraly commented 5 months ago

(so, basically: fine and may be preferable in jupyter notebook analyses etc, but not recommended when building a package)

felipeangelimvieira commented 5 months ago

Thank you for pointing this out, Franz. I have an opinion about this but agree that it would be good to avoid that in this PR and remove from other modules in future PRs (contrary to what I suggested before).

I believe that * is not good practice, especially in notebooks, and whenever the imported content is used in the file since it makes it hard to figure out where each function and class comes from, and may import unexpected objects, creating namespace collisions, as you explained.

However, in __init__.py files, I find that it is useful when used with __all__ for modules where the majority of users are interested only in specific module content (for example, effects.py). The reason is that, on the development side, when a file in a module is updated with new classes, it isn't necessary to update __init__.py.

But the time needed to update an __init__.py should be some seconds, and actually, I agree that it would be good to avoid that pattern since it is not used by major libraries (sklearn, pandas, sktime, etc). Using explicit imports even in init files makes it more readable.

felipeangelimvieira commented 5 months ago

As always thank you for the help Franz!

fkiraly commented 5 months ago

hard to figure out where each function and class comes from, and may import unexpected objects, creating namespace collisions, as you explained.

Well, actually chatGPT 4, but yes.

The reason is that, on the development side, when a file in a module is updated with new classes, it isn't necessary to update __init__.py.

I also used to think that, yes. Though, as sktime got more complex, it turns out that it leads to combinatorial explosion of imports, and extremely hard to diagnose circular imports.

For instance, from module.module2 import sth imports anything that module.__init__ also imports. So any import that goes beyond module now imports everything from within it. That includes imports from external libraries.

At one point - not with import * but similar generous designs - we ended up in a situation where certain imports would take 10 seconds for the first time after environment init. Tracking it down, there were chains of dozens of imports from completely unexpected locations, that were not needed by most of the imports. Cutting this back was very unpleasant but necessary...

felipeffm commented 5 months ago

Adopted both, the declaration of public interface with __all__= and explicit imports.

codecov[bot] commented 5 months ago

Codecov Report

Attention: Patch coverage is 93.63636% with 7 lines in your changes missing coverage. Please review.

Project coverage is 93.19%. Comparing base (c2d4ec3) to head (58c4d10).

Files Patch % Lines
src/prophetverse/effects/base.py 75.00% 7 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #72 +/- ## ========================================== + Coverage 93.11% 93.19% +0.08% ========================================== Files 19 24 +5 Lines 1016 1029 +13 ========================================== + Hits 946 959 +13 Misses 70 70 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.