polygon-io / client-python

The official Python client library for the Polygon REST and WebSocket API.
https://polygon-api-client.readthedocs.io/
MIT License
744 stars 204 forks source link

Replace modelclasses with dataclasses #686

Open SuperMaZingCoder opened 2 weeks ago

SuperMaZingCoder commented 2 weeks ago

Problem

modelclass is an uncessary abstraction for dataclass, and it makes the codebase harder to navigate.

All of the relevant behavior between modelclass and dataclass is the same. The difference is that modelclass allows its objects to take extra fields (more on this below). I think that this is a feature that should be removed.

Moreover, linters can't understand the functionality that modelclass implements the way they can with dataclass. Here's an example:

image

image

Equivalent Behavior Where It Matters (PDF)

Here's a gist that demonstrates this. The first half uses a dataclass rather than a modeclass, and the second half uses modelclass, copied from modelclass.py. Each class functions in the same way regardless of whether the @dataclass or @modelclass decorator is used.

Here's a pdf version that's probably easier to read.

Solution

Replacing all references to modelclasswith dataclass causes a couple tests to fail in regards to that extra field. One test that fails:

def test_extra_field(self):
        a = Agg(
            open=1.5032,
            high=1.5064,
            low=1.4489,
            close=1.4604,
            volume=642646396.0,
            vwap=1.469,
            timestamp=1112331600000,
            transactions=82132,
        )
        b = Agg(
            open=1.5032,
            high=1.5064,
            low=1.4489,
            close=1.4604,
            volume=642646396.0,
            vwap=1.469,
            timestamp=1112331600000,
            transactions=82132,
            extra_field=22,
        )
        self.assertEqual(a, b)

This causes:

======================================================================
ERROR: test_extra_field (test_modelclass.ModelclassTest.test_extra_field)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/super/Documents/Code/client-python/test_rest/test_modelclass.py", line 17, in test_extra_field
    b = Agg(
        ^^^^
TypeError: Agg.__init__() got an unexpected keyword argument 'extra_field'

======================================================================

a and b cannot be equivalent in this test because before their equivalence is even checked, Agg fails to be instantiated with an extra_field. I think that this is better functionality, though. I don't see why it would be useful to allow extra fields to be passed without raising errors. It seems like it could actually slow down development.

The linter wouldn't catch wvap being passed instead of vwap, for example. For dataclass, this isn't a problem:

image

Of course, as above, modelclass instances can't be linted at all:

image

Because I believe that an extra_field should make instantiation fail, I changed the test so that it passes when instantiating Agg with extra_field raises a TypeError:

def test_extra_field(self):
    with self.assertRaises(TypeError):
        b = Agg(
            open=1.5032,
            high=1.5064,
            low=1.4489,
            close=1.4604,
            volume=642646396.0,
            vwap=1.469,
            timestamp=1112331600000,
            transactions=82132,
            extra_field=22,
        )

Similarly,

  File "/Users/super/Documents/Code/client-python/polygon/rest/models/contracts.py", line 34, in from_dict
    return OptionsContract(
           ^^^^^^^^^^^^^^^^
TypeError: OptionsContract.__init__() got an unexpected keyword argument 'size'

======================================================================
ERROR: test_list_options_contracts (test_contracts.ContractsTest.test_list_options_contracts)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/super/Documents/Code/client-python/test_rest/test_contracts.py", line 25, in test_list_options_contracts
    contracts = [c for c in self.c.list_options_contracts()]
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/super/Documents/Code/client-python/polygon/rest/base.py", line 233, in _paginate_iter
    yield deserializer(t)
          ^^^^^^^^^^^^^^^
  File "/Users/super/Documents/Code/client-python/polygon/rest/models/contracts.py", line 34, in from_dict
    return OptionsContract(
           ^^^^^^^^^^^^^^^^
TypeError: OptionsContract.__init__() got an unexpected keyword argument 'size'

======================================================================

This occurs because all tests for OptionsContract implement an unused size attribute. size is not even a field that is returned by the Polygon API; it's useless.

The from_dict method of OptionsContract also implements this size attribute, despite it not being defined under the actual class:

@modelclass
class OptionsContract:
    "OptionsContract contains data for a specified ticker symbol."
    additional_underlyings: Optional[List[Underlying]] = None
    cfi: Optional[str] = None
    contract_type: Optional[str] = None
    correction: Optional[str] = None
    exercise_style: Optional[str] = None
    expiration_date: Optional[str] = None
    primary_exchange: Optional[str] = None
    shares_per_contract: Optional[float] = None
    strike_price: Optional[float] = None
    ticker: Optional[str] = None
    underlying_ticker: Optional[str] = None

To solve this, I removed all references to this size field.

Lastly,

======================================================================
ERROR: test_list_universal_snapshots (test_snapshots.SnapshotsTest.test_list_universal_snapshots)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/super/Documents/Code/client-python/test_rest/test_snapshots.py", line 39, in test_list_universal_snapshots
    details=UniversalSnapshotDetails(
            ^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: UniversalSnapshotDetails.__init__() got an unexpected keyword argument 'underlying_ticker'

----------------------------------------------------------------------

This happens for the same reason as the errors for size above. However, because the underlying_ticker field seems useful, I changed UniversalSnapshotDetails to take one:

@dataclass
class UniversalSnapshotDetails:
    """Contains details for an options contract."""

    contract_type: Optional[str] = None
    exercise_style: Optional[str] = None
    expiration_date: Optional[str] = None
    shares_per_contract: Optional[float] = None
    strike_price: Optional[float] = None
    underlying_ticker: Optional[str] = None # added

If this should be removed, let me know and I will instead remove underlying_ticker from the test.

Pull Request

I'm putting up a pullrequest that removes all references to modelclass and replaces them with the builtin dataclass. I also changed the tests in the ways outlined above. This removes the possibility for extra, unecessary, fields to be used in the instantiation of these models. This makes catching bugs easier, as any typos will be caught by the linter and will also cause the code to fail.

I think this is a good change for code quality.