transientskp / pyse

Python Source Extractor
BSD 2-Clause "Simplified" License
11 stars 5 forks source link

Reimplement `DataAccessor` for easier testing & validation #53

Open suvayu opened 4 months ago

suvayu commented 4 months ago

From Mypy:

sourcefinder/accessors/dataaccessor.py:90: error: "DataAccessor" has no attribute "tau_time"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:91: error: "DataAccessor" has no attribute "freq_eff"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:92: error: "DataAccessor" has no attribute "freq_bw"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:93: error: "DataAccessor" has no attribute "taustart_ts"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:94: error: "DataAccessor" has no attribute "url"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:95: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:96: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:97: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:98: error: "DataAccessor" has no attribute "centre_ra"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:99: error: "DataAccessor" has no attribute "centre_decl"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:100: error: "DataAccessor" has no attribute "pixelsize"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:101: error: "DataAccessor" has no attribute "pixelsize"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:112: error: "DataAccessor" has no attribute "wcs"  [attr-defined]

This is because the base class uses a custom metaclass, and the base class doesn't really have these defined, but some methods use them (e.g. extract_metadata).

This kind of enforcement of API behaviour is much more cleanly achieved by using either of the following,

HannoSpreeuw commented 4 months ago

How can I reproduce those Mypy errors from bash?

E.g. hatch run lint:mypy will not show exactly these errors.

suvayu commented 4 months ago

Seems to work for me:

$ hatch run lint:mypy
sourcefinder/accessors/requiredatts.py:64: error: Need type annotation for "required_atts" (hint: "required_atts: set[<type>] = ...")  [var-annotated]
sourcefinder/utility/containers.py:55: error: Signatures of "__iadd__" and "__add__" are incompatible  [misc]
sourcefinder/accessors/dataaccessor.py:90: error: "DataAccessor" has no attribute "tau_time"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:91: error: "DataAccessor" has no attribute "freq_eff"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:92: error: "DataAccessor" has no attribute "freq_bw"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:93: error: "DataAccessor" has no attribute "taustart_ts"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:94: error: "DataAccessor" has no attribute "url"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:95: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:96: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:97: error: "DataAccessor" has no attribute "beam"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:98: error: "DataAccessor" has no attribute "centre_ra"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:99: error: "DataAccessor" has no attribute "centre_decl"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:100: error: "DataAccessor" has no attribute "pixelsize"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:101: error: "DataAccessor" has no attribute "pixelsize"  [attr-defined]
sourcefinder/accessors/dataaccessor.py:112: error: "DataAccessor" has no attribute "wcs"  [attr-defined]
sourcefinder/accessors/lofarhdf5image.py:62: error: "LofarHdf5Image" has no attribute "header"  [attr-defined]

What do you see exactly? Maybe you can explicitly specify the source directory: hatch run lint:mypy sourcefinder?

HannoSpreeuw commented 3 months ago

hatch run lint:mypy gives me

sourcefinder/accessors/requiredatts.py:64: error: Need type annotation for "required_atts" (hint: "required_atts: set[<type>] = ...")  [var-annotated]
sourcefinder/accessors/lofaraccessor.py:2: error: Library stubs not installed for "six"  [import-untyped]
sourcefinder/accessors/lofaraccessor.py:5: error: Unsupported dynamic base class "with_metaclass"  [misc]
sourcefinder/utility/coordinates.py:12: error: Library stubs not installed for "pytz"  [import-untyped]
sourcefinder/utility/containers.py:55: error: Signatures of "__iadd__" and "__add__" are incompatible  [misc]
sourcefinder/accessors/dataaccessor.py:3: error: Library stubs not installed for "six"  [import-untyped]
sourcefinder/accessors/dataaccessor.py:3: note: Hint: "python3 -m pip install types-six"
sourcefinder/accessors/dataaccessor.py:3: note: (or run "mypy --install-types" to install all missing stub packages)
sourcefinder/accessors/dataaccessor.py:3: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
sourcefinder/accessors/dataaccessor.py:9: error: Unsupported dynamic base class "with_metaclass"  [misc]
sourcefinder/accessors/fitsimage.py:6: error: Library stubs not installed for "dateutil.parser"  [import-untyped]
sourcefinder/accessors/fitsimage.py:6: note: Hint: "python3 -m pip install types-python-dateutil"
sourcefinder/accessors/fitsimage.py:6: error: Library stubs not installed for "dateutil"  [import-untyped]
sourcefinder/accessors/fitsimage.py:8: error: Library stubs not installed for "pytz"  [import-untyped]
sourcefinder/accessors/fitsimage.py:8: note: Hint: "python3 -m pip install types-pytz"
sourcefinder/accessors/fitsimageblob.py:45: error: Signature of "_get_header" incompatible with supertype "FitsImage"  [override]
sourcefinder/accessors/fitsimageblob.py:45: note:      Superclass:
sourcefinder/accessors/fitsimageblob.py:45: note:          def _get_header(self, hdu_index: Any) -> Any
sourcefinder/accessors/fitsimageblob.py:45: note:      Subclass:
sourcefinder/accessors/fitsimageblob.py:45: note:          def _get_header(self, hdulist: Any, hdu_index: Any) -> Any
sourcefinder/accessors/fitsimageblob.py:48: error: Signature of "read_data" incompatible with supertype "FitsImage"  [override]
sourcefinder/accessors/fitsimageblob.py:48: note:      Superclass:
sourcefinder/accessors/fitsimageblob.py:48: note:          def read_data(self, hdu_index: Any, plane: Any) -> Any
sourcefinder/accessors/fitsimageblob.py:48: note:      Subclass:
sourcefinder/accessors/fitsimageblob.py:48: note:          def read_data(self, hdulist: Any, hdu_index: Any, plane: Any) -> Any
Found 12 errors in 7 files (checked 30 source files)
HannoSpreeuw commented 3 months ago

And the same for hatch run lint:mypy sourcefinder, but I guess this is caused by an environment defined through a Pipfile in a top level directory that I use. This is activated through pipenv shell. The environment defined from that Pipfile will likely be different than the one from pyproject.toml.

I guess I should activate the virt env defined by pyproject.toml using hatch or poetry.

suvayu commented 3 months ago

You should do this outside of any other environments you have. hatch handles everything now. I'll write some instructions later, but essentially hatch run [env:]command so to run the scripts under scripts/, you can omit the environment (since that's the default), but need to specify for the rest.

hatch run scripts/pyse [--options]
hatch run test:pytest [tests/test_iwanttorun.py -k match_string]
hatch run lint:mypy [--options]
hatch run lint:flake8 [--options]

I think you need to install type stubs for the dependencies. Try this: hatch run lint:mypy --install-types --non-interactive, subsequently you can leave out the other flags.

suvayu commented 3 months ago
$ hatch env show 
             Standalone             
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Name    ┃ Type    ┃ Dependencies ┃
┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ default │ virtual │              │
├─────────┼─────────┼──────────────┤
│ test    │ virtual │ pytest       │
│         │         │ pytest-cov   │
├─────────┼─────────┼──────────────┤
│ lint    │ virtual │ black        │
│         │         │ flake8       │
│         │         │ mypy         │
│         │         │ ruff         │
└─────────┴─────────┴──────────────┘
HannoSpreeuw commented 3 months ago

Thanks. hatch run lint:mypy --install-types --non-interactive works and reproduces those errors (and five more).

However, hatch run test:pytest test gives these errors:

sourcefinder/__init__.py:1: in <module>
    from ._version import __version__
E   ModuleNotFoundError: No module named 'sourcefinder._version'
HannoSpreeuw commented 3 months ago

But works fine once part of the e711711 commit is reverted.

HannoSpreeuw commented 3 months ago

This is because the base class uses a custom metaclass, and the base class doesn't really have these defined, but some methods use them (e.g. extract_metadata).

I guess you are suggesting that a solution would involve removing RequiredAttributesMetaclass, since this kind of machinery is an overkill?

Removing it may not only affect unit tests in PySE, but also in TraP.

HannoSpreeuw commented 3 months ago

@suvayu How about a solution using the property decorator?

suvayu commented 3 months ago

It depends on the kind of validation you want to do. If they are just type based, I would use dataclasses. However it is a bit more complex, then maybe property is better. I would also recommend looking at attrs once.

suvayu commented 3 months ago

But works fine once part of the e711711 commit is reverted.

Please don't revert this commit. I could recommend the correct resolution if you post the error. My guess is, because this changes the build a bit, you just need to run the build step once. Try this hatch build --hooks-only

HannoSpreeuw commented 3 months ago

Indeed after issuing hatch build --hooks-only with the latest __init__.py, i.e. with from ._version import __version__ hatch run test:pytest yields 106 unit tests passed (1 skipped), i.e. the ModuleNotFoundError: No module named 'sourcefinder._version' error has vanished.

suvayu commented 3 months ago

(1 skipped)

If you want to fix this, clone https://github.com/transientskp/trap-test-data, and put the casatable directory under test/data/.

I'll set this up properly when I have more time.

HannoSpreeuw commented 3 months ago

That worked indeed!

HannoSpreeuw commented 3 months ago

The motivation for adding the metaclass=RequiredAttributesMetaclass argument to the DataAccessor class definition can be found here:

This results in much cleaner code than use of @abstractproperty

abstractproperty has been deprecated since 3.3:

It is now possible to use property, property.getter(), property.setter() and property.deleter() with abstractmethod(), making this decorator redundant.

I need to think about this longer, since this means that the precursor of property was used until nine years ago and replaced by this metaclass=RequiredAttributesMetaclass argument.

suvayu commented 3 months ago

Hi @HannoSpreeuw, I think what would help in the decision is to get an overview of how derived classes are being used, and more importantly, what functionality does this base class offer?

HannoSpreeuw commented 1 month ago

This was closed automatically, because of the line Should also fix #53 in PR #66, but after reconsideration closing of this issue was not supposed to happen, see the discussion in that PR.