Nasdaq / data-link-python

A Python library for Nasdaq Data Link's RESTful API
MIT License
428 stars 70 forks source link

add AuthorizedSession #22

Open runawaycoast opened 2 years ago

runawaycoast commented 2 years ago

This pr adds a new way to set ApiConfig and making requests.

    import nasdaqdatalink
    api_config = nasdaqdatalink.ApiConfig()
    api_config.verify_ssl = False
    api_config.api_key = 'somekey'
    data = nasdaqdatalink.get('table/abc', start_date='2022-05-10', end_date='2022-05-10', api_config=api_config)
    from nasdaqdatalink
    api_config = nasdaqdatalink.ApiConfig()
    api_config.verify_ssl = False
    authed_session = nasdaqdatalink.AuthorizedSession(api_config)
    data = authed_session.get('table/abc', start_date='2022-05-10', end_date='2022-05-10')
couture-ql commented 2 years ago

I believe this PR should replace #16.

couture-ql commented 2 years ago

Mind rebasing and force-pushing this branch instead? I'd like to avoid including the merge commit in this series. To keep a cleaner history when we do merge you have two options:

Technically a third option is to squash it all of this series into one commit, and improve the individual git log message. The majority of the commit log entries don't explain why you're making each change. What I'd prefer to see is atomic commits with a commit message explaining why a the change is being made. It doesn't need to be an essay. For examples of good commit messages, look no further than the linux kernel and git itself.

I can review these options with you next week, and get this merged in as I'd like to include this change in the next release.

It appears Eric has already bumped the version and change log, which I try to reserve for the commit tag only. Will change with you both and I suspect we'll put out two versions as I'd rather keep the git tag on the same commit that bumped the version and changelog. when cutting a release, I'd prefer to modify those two files separately in a commit and tag it.

couture-ql commented 2 years ago

The ApiConfig in this context is a bit of a nuisance, especially where read_key() is concerned. I'd like to avoid duplicating the logic.

The module method read_key() should be read as part of the ApiConfig constructor so that users don't need to explicitly call it. We introduced read_key() in __init__.py to provide the convenience of setting up the class attributes when the module is imported. I feel like that any time we create a new ApiConfig instance, read_key() should be performed to help set the state. Callers can modify the instance after the fact, though I'd prefer it were immutable: it would be painful to enforce.

At the end of the day, ApiConfig could be a dataclass that can initialize itself on create, and mutated later. Since we no longer need to support python < 3.7 we can use typing instead of checking isinstance(). I'd rather users know they set the wrong thing at runtime, than we passively set a default for them and they end up having a hard time figuring out why their state is different than what they expect. This would only be true if they wanted to modify the base domain or some other property, but accidentally passed in some wrong object.

When AuthorizedSession instances are created, you allow callers to provide the api_config or we'll generate one for you: this is great. However, we still need to perform read_key() to set the state appropriately. I don't see when that's happening, so I expect that the default object won't have tried to figure out the API Key for convenience. Since the class attributes end up being used / referenced instead, I think this subtly gets missed.

I made a pass at doing some refactoring of the api_config.py from a module to a class with a lot of behaviour / helper methods instead. However, it appears I've broken get_config_from_kwargs() method you introduced; I'll need to fix that.

couture-ql commented 2 years ago

Ah, I should have paid closer attention to how you're using get_config_from_kwargs() to pass around the ApiConfig class and relying on the class attributes to pass that around state to avoid updating method signatures.

I've made this business a bit more messy. Let me think about it a bit more. I'll roll this change out of this series and keep it to myself for now.

jjmar commented 2 years ago

Code seems 👍 to me. Even with this code being backwards compatible with the old way of making requests (1 call per session), I feel like we should release it as a major. This way people are more aware about authorized sessions change, and how it can make their use of the package more performant

couture-ql commented 2 years ago

I agree. While not a breaking change, it would require people to modify their scripts to take advantage of this feature.

We should add some coding examples and perhaps a future PR can address it. While I'm happy to see the README updated, the authorizedsession() accepts an api_config keyword argument, and we should provide the context for why that's useful. Despite the majority of this SDK relying too heavily on global state of ApiConfig's class attributes, we can highlight when it might be useful to not use the defaults.

Include an example directory with the use-cases in mind would be helpful.