alinski29 / Stonks.jl

Julia library for standardizing financial data retrieval and storage from multiple APIs.
MIT License
53 stars 3 forks source link

Avoid implicit clients at call site #9

Closed savq closed 1 year ago

savq commented 2 years ago

Currently, if the client is omitted from the function call arguments (e.g. to get_price), it'll be constructed implicitly using environment variables.

I understand that this was written for convenience, and so that the API key can stay secret. However, this makes unclear which client will be created, and could potentially cause reproducibility issues.

Instead of

prices = get_price("AAPL")  # the whole client is auto-magically inferred

I'd suggest

client = YahooClient()  # API key is still inferred, but the client is now explicit
prices = get_price("AAPL", client)

Where the YahooClient constructor with no arguments would check ENV.

This also means that if the env variable is absent, the error will point to the constructor, not the following function calls.

(I'd also suggest having the client be the first argument of the function call, OOP style, but that's besides the point)

PS. Just when I needed to get a time series for an assignment I saw this package on Reddit. Very convenient! Thanks for working on this.

alinski29 commented 2 years ago

Hello @savq ,

Thanks for the feedback and opening this issue. :)

It's on the agenda to come up with a cleaner way to handle multiple clients.
Generally, I'd like to avoid client as a parameter if possible. If you wish to provide the client as a parameter, you can.

I agree it makes more sense to have the client as the first argument to any function returning data. The reason why I put it as the the second was that the code wouldn't compile if the first argument was Union{APIClient,Nothing}=nothing because of LoadError: syntax: optional positional arguments must occur at end around. I need 2 separate methods, one with client as an argument and the other one without. But I think that's an okay compromise and will implement it this way.

Btw, if client can't be resolved you get returned a DataClientError: https://github.com/alinski29/Stonks.jl/blob/main/src/APIClients.jl#L139.

If you only have a token for one provider, there should be no issue, I'd prefer to skip the client argument in this case if possible. The tricky aspect is when you can build multiple clients, which one will be used. Currently that's decided based on an attribute of the APIResource ,called rank_order ,but that's a bit to hidden from the user.

Another idea would be to create a client pool where you provide a list of clients and the priority is based on the list index. That would also help in the scenario where you get hit the max requests on an API to retry the request with the next client in the pool. The question here is if this client pool should be explicit or not.

prices = get_price(pool, "AAPL")

- implicit, I don't have any idea how to do it without a global variable
```julia
ClientPool = Vector{Stonks.APIClient}
global client_pool = ClientPool[]
append!(client_pool, [YahooClient("abc"), AlphavantageJSONClient("def")]
prices = get_price( "AAPL") #client = first element client_pool

Again, thanks for opening this issue, would love to hear your opinion again.

savq commented 2 years ago

Ahhh, I didn't know it would try different clients if one didn't have a resource.

Another idea would be to create a client pool where you provide a list of clients and the priority is based on the list index.

Yes, this seems much more reasonable. Something similar to what the Plots ecosystem does, where you set the client/backend with a function, and all the subsequent calls would use that client. This would also mean that

  1. The user gets to see which clients will be used
  2. Calls without an explicit client argument won't construct a new client every time

The question here is if this client pool should be explicit or not.

I think exposing a closure to the user that handles the client pool internally would be a simple interface:

using Stonks

set_stonks_clients(YahooClient(), AlphavantageJSONClient())

get_price("AAPL") # will try both clients

set_stonks_clients(YahooClient())

get_price("AAPL") # will only try the Yahoo API

That way the client pool remains scoped within the module.