santiment / sanpy

Santiment API Python Client
MIT License

batch call vs get #135

Closed (phisanti closed this issue 3 years ago)

phisanti commented 3 years ago

I have been using your API for a couple of weeks and I have seen that the call limit is 20K items. I see that this also applies when using a batch call, as the total number of elements gets aggregated. That is, it is possible to call three separate metrics of 20K items each, one at a time, but it is not possible to call a batch of the same three 20K metrics at once:

from san import Batch
import san

# This will work: each metric is fetched in its own call
my_metrics = ['daily_active_addresses', 'active_addresses_24h', 'adjusted_price_daa_divergence']
separate_metrics = []
for metric in my_metrics:
    metric_df = san.get(
        metric + "/santiment",
        from_date="2018-06-01",
        to_date="2019-06-05",
        interval="1h"
    )
    separate_metrics.append(metric_df)

batch = Batch()
# This will trigger the complexity limit: the same metrics in one batch
for metric in my_metrics:
    batch.get(
        metric + "/santiment",
        from_date="2018-06-01",
        to_date="2019-06-05",
        interval="1h"
    )

combined_metrics = batch.execute()

Therefore, I was wondering whether a batch counts as a single call towards the limit, or as one call per metric in it (in this case three).

IvanIvanoff commented 3 years ago

Hi, thanks for using sanpy!

First, I want to mention something that is not immediately obvious: the complexity limit changes with the subscription plan. Due to GraphQL implementation details (there can be only a single complexity limit), this is implemented by reducing the complexity weight for higher plans, as seen here. So a Pro plan effectively has 5 times higher limits.

Batching works by adding the API calls to the same GraphQL request, so the end request looks roughly like the sketch below. Every batch.get results in a separate query that is executed and counted separately, so the final API call count is exactly the same in both cases. However, since this is a single GraphQL request consisting of multiple queries, the complexity limit is applied to the whole request (the sum of all its parts).
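A simplified sketch of what the combined request could look like; the aliases, argument formats, and field selection here are illustrative rather than the exact query sanpy generates:

{
  query_0: getMetric(metric: "daily_active_addresses") {
    timeseriesData(slug: "santiment", from: "2018-06-01T00:00:00Z", to: "2019-06-05T00:00:00Z", interval: "1h") {
      datetime
      value
    }
  }
  query_1: getMetric(metric: "active_addresses_24h") {
    timeseriesData(slug: "santiment", from: "2018-06-01T00:00:00Z", to: "2019-06-05T00:00:00Z", interval: "1h") {
      datetime
      value
    }
  }
}

Each aliased query is billed as one API call, but their complexities add up because they travel in a single request.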

With this said, batching is in some sense syntactic sugar in sanpy; you spend exactly the same number of API calls whether you use it or not.

There are real ways to batch data that will spare you API calls, but they are not yet exposed as separate functions in sanpy, as we have not seen requests for them. For example, in a truly single API call you can fetch the data of a single metric for many assets (many metrics for a single asset is much harder, as different metrics have different restrictions). Sanpy allows you to execute raw GraphQL requests like this one:

import san

result = san.graphql.execute_gql("""
{
    getMetric(metric: "price_usd") {
      timeseriesDataPerSlug(
        selector: {watchlistSlug: "stablecoins-usd"},
        from: "utc_now-7d",
        to: "utc_now",
        interval: "1d"){
          datetime
          data {slug value}
        }
    }
}
""")

This code will fetch 7 day timeseries data for every asset in the stablecoins-usd watchlist.
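As a small follow-up sketch (not part of the original answer), the nested result can be flattened into a pandas DataFrame. The response layout below is an assumption that mirrors the query fields, so adjust the keys if your sanpy version returns a different shape:

import pandas as pd

# result is the dict returned by san.graphql.execute_gql above;
# its layout is assumed to follow the query: getMetric -> timeseriesDataPerSlug -> data
rows = [
    {"datetime": point["datetime"], "slug": entry["slug"], "value": entry["value"]}
    for point in result["getMetric"]["timeseriesDataPerSlug"]
    for entry in point["data"]
]
df = pd.DataFrame(rows)
print(df.head())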

phisanti commented 3 years ago

Thank you very much, really helpful answer.

If possible, I would like to ask one last question. I have seen that there is a Python wrapper to request the time left in case an API call fails (san.rate_limit_time_left()). However, I am not sure if there is any function to find out how many calls I have left.
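For context, a minimal sketch of how that wrapper is typically used; it assumes the helpers take the raised exception as an argument, following the pattern in the sanpy README, so double-check against your sanpy version:

import time
import san

try:
    san.get(
        "daily_active_addresses/santiment",
        from_date="2018-06-01",
        to_date="2019-06-05",
        interval="1h"
    )
except Exception as e:
    # is_rate_limit_exception / rate_limit_time_left are the rate-limit wrappers mentioned above
    if san.is_rate_limit_exception(e):
        seconds_to_wait = san.rate_limit_time_left(e)
        print(f"Rate limit hit, lifted in {seconds_to_wait} seconds")
        time.sleep(seconds_to_wait)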

IvanIvanoff commented 3 years ago

Sadly, it seems like we're missing this functionality but it should be added soon (ping @spiderjako)

Until then, I guess the closest thing you can do is to monitor your API call count over time with this API. You can execute that from sanpy with execute_gql as shown above, or you can see it directly in the browser (only if you're logged in to Sanbase in that same browser).
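Purely as an illustration of running such a query through execute_gql; the currentUser / apiCallsHistory field names below are an assumption, not copied from the linked API, so verify them in the GraphQL explorer first:

import san

# Hypothetical sketch: field and argument names are assumed and may differ from the real schema
api_calls = san.graphql.execute_gql("""
{
  currentUser {
    apiCallsHistory(from: "utc_now-30d", to: "utc_now", interval: "1d") {
      datetime
      apiCallsCount
    }
  }
}
""")
print(api_calls)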

phisanti commented 3 years ago

Thank you very much! Issue closed.

marc-moreaux commented 3 years ago

Hello,

I was searching for something similar to @phisanti, for the same reasons. It appears that your solution (which I find great) does not work for the metric "twitter_followers". Would you have any insight into why?

{
  getMetric(metric: "twitter_followers") {
    timeseriesDataPerSlug(
      selector: {slugs: [
        "ethereum",
        "bitcoin"
      ]},
      from: "utc_now-7d",
      to: "utc_now"
      interval: "1d"){
        datetime
        data {slug value}
      }
  }
}

Thank you very much :)

spiderjako commented 3 years ago

Hey, just so you know, I've added two new SanPy functions - api_calls_made() and api_calls_remaining(). Feel free to check them out!
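For completeness, a quick usage sketch of those two helpers; the exact return shapes depend on the sanpy version, so treat the prints as illustrative:

import san

san.ApiConfig.api_key = "your_api_key_here"  # needed so the calls are attributed to your account

print(san.api_calls_made())       # historical count of API calls over time
print(san.api_calls_remaining())  # calls left in the current period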