cardano-community / koios-artifacts

Artifacts for https://koios.rest and https://api.koios.rest websites
Creative Commons Attribution 4.0 International

Koios pools query returns null ( bug / feature? ) #109

Closed Jack-0 closed 1 year ago

Jack-0 commented 2 years ago

Reference https://github.com/cardano-community/guild-operators/issues/1465

There is still an issue here. I've been writing an API that checks pools, but unfortunately some of the data is null for certain tickers, including the same ticker mentioned above.

Reproduce

  1. fetch all data from https://api.koios.rest/#get-/pool_list
  2. Notice that some fields are null
  3. Here are some pools that have a null ticker:
     high: pool1tay8z4sq4a4gmyhnygyt0t5j84z8epwjra06wq28jnnmschkkuu
     tcatl: pool1gduk8s5stt3uej5nl9vs7marx8jjyeadwz3505gzvuqey2lch35
     zebra: pool16sm29hg3qrk5lyp74qcljt38l86v5zvg95395h9gqrn6qe97d9z
     maple: pool1xpfe5q3v3axrjdc8h38taaa93frq3m9pfewxk46x4r6jgy2yj5n

Is this expected behavior for Koios?

Context

I know it may be possible to find this data from pool history using the pool bech32 ID. However, the challenge I face is that I want to get a pool's info when I only know its ticker. If some tickers are null, this is not achievable with Koios as far as I am aware.

  1. I query all ~3000 pools.
  2. I filter the list by ticker
  3. As a few entries are null, I never find the target pool
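
The three steps above can be sketched roughly as follows. This is only an illustration: it assumes each /pool_list entry is a dict with `pool_id_bech32` and `ticker` keys, and `find_pools_by_ticker` is a hypothetical helper, not part of Koios. The second sample entry is a made-up placeholder. Besides the matches, it returns the pool IDs whose ticker came back null, which makes the failure visible instead of silent.

```python
def find_pools_by_ticker(pools, ticker):
    """Return (matches, unresolved) for a case-insensitive ticker search.

    matches:    entries whose ticker equals `ticker`, ignoring case
    unresolved: pool IDs whose ticker is null, so they could not be compared
    """
    wanted = ticker.casefold()
    matches, unresolved = [], []
    for pool in pools:
        pool_ticker = pool.get("ticker")  # assumed field name
        if pool_ticker is None:
            unresolved.append(pool.get("pool_id_bech32"))  # assumed field name
        elif pool_ticker.casefold() == wanted:
            matches.append(pool)
    return matches, unresolved


# Sample data: the maple pool from the report (null ticker) plus a placeholder.
sample = [
    {"pool_id_bech32": "pool1xpfe5q3v3axrjdc8h38taaa93frq3m9pfewxk46x4r6jgy2yj5n",
     "ticker": None},
    {"pool_id_bech32": "pool1placeholder", "ticker": "ABC"},
]
matches, unresolved = find_pools_by_ticker(sample, "maple")
```

With this sample, the search for "maple" finds nothing, and the maple pool ID ends up in `unresolved` because its ticker is null.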

This could be classed as a feature if this is the expected behavior of Koios. However, other sites/APIs still display the pools' tickers; the only issue I found was with [high] on Cardanoscan.io.

If you don't classify this as a bug, could this be turned into a feature request? I'm reopening the ticket here, in this repository, for context.

Originally posted by @Jack-0 in https://github.com/cardano-community/guild-operators/issues/1465#issuecomment-1272616685

dostrelith678 commented 2 years ago

> However, the challenge I face is I want to get a pools info when only knowing it's ticker.

One thing to note here is that the pool ticker is not a unique identifier for a pool so I think this method is not ideal, although querying one ticker shared by multiple pools could be made to return all the results.
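
A minimal sketch of what "return all the results" could look like on the client side, assuming /pool_list entries carry `pool_id_bech32` and `ticker` keys (`index_by_ticker` is a hypothetical helper, and the pool IDs in the usage lines are placeholders):

```python
from collections import defaultdict


def index_by_ticker(pools):
    """Map each lower-cased ticker to the list of all pool IDs that use it."""
    index = defaultdict(list)
    for pool in pools:
        ticker = pool.get("ticker")  # assumed field name
        if ticker is not None:       # null tickers cannot be indexed
            index[ticker.casefold()].append(pool["pool_id_bech32"])
    return dict(index)


# Two placeholder pools sharing one ticker, plus one with a null ticker.
pools = [
    {"pool_id_bech32": "pool1aaa", "ticker": "ABC"},
    {"pool_id_bech32": "pool1bbb", "ticker": "abc"},
    {"pool_id_bech32": "pool1ccc", "ticker": None},
]
ticker_index = index_by_ticker(pools)
```

Since a ticker maps to a list, a shared ticker yields every pool that uses it rather than an arbitrary one.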

But you are right in that some pools (like the ones you listed) return null ticker from /pool_list, even though they return a valid ticker from /pool_info or pool_updates. The /pool_list endpoint uses only the pool_info_cache table so this problem can be tackled either by correcting (optimising) that caching, or by adding logic for the endpoint not to rely solely on that table.
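
To illustrate the second option (not relying solely on one source), here is a rough client-side sketch, not the actual endpoint logic. It assumes entries with `pool_id_bech32` and `ticker` keys; `lookup_tickers` is a hypothetical callable that takes a list of pool IDs and returns a dict of pool ID to ticker, for example one backed by /pool_info.

```python
def fill_missing_tickers(pool_list, lookup_tickers):
    """Return a copy of pool_list with null tickers filled from a second source.

    Pools the second source does not know either keep a null ticker;
    no entry is dropped.
    """
    missing_ids = [p["pool_id_bech32"] for p in pool_list
                   if p.get("ticker") is None]
    # Only query the second source when there is something to resolve.
    found = lookup_tickers(missing_ids) if missing_ids else {}
    result = []
    for pool in pool_list:
        entry = dict(pool)  # copy so the input is not mutated
        if entry.get("ticker") is None:
            entry["ticker"] = found.get(entry["pool_id_bech32"])
        result.append(entry)
    return result


# Usage with a stand-in lookup (placeholder IDs and data).
def fake_lookup(pool_ids):
    known = {"pool1aaa": "HIGH"}
    return {pid: known[pid] for pid in pool_ids if pid in known}


filled = fill_missing_tickers(
    [{"pool_id_bech32": "pool1aaa", "ticker": None},
     {"pool_id_bech32": "pool1bbb", "ticker": None},
     {"pool_id_bech32": "pool1ccc", "ticker": "ZZZ"}],
    fake_lookup,
)
```

The fallback is requested once for all missing IDs rather than per pool, which keeps the number of extra lookups small.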

Jack-0 commented 2 years ago

> One thing to note here is that the pool ticker is not a unique identifier for a pool so I think this method is not ideal, although querying one ticker shared by multiple pools could be made to return all the results.

I understand this. This is more for UX (Cardano needs more usability focused design imo).

> But you are right in that some pools (like the ones you listed) return null ticker from /pool_list, even though they return a valid ticker from /pool_info or pool_updates. The /pool_list endpoint uses only the pool_info_cache table so this problem can be tackled either by correcting (optimising) that caching, or by adding logic for the endpoint not to rely solely on that table.

Hopefully this could be fixed with optimizing the cache as mentioned here.

If the null fix is not achievable on the original route, a feature such as the following would be desirable.

route

api/tickers

json input (list of tickers)

{"tickers":["9000","MAPLE","hIgH"]}

returns

[{
  "ticker":"<pool_ticker>",
  "poolId":"<poolBech32Id>"
}]

rdlrt commented 2 years ago

The outcome seen here is the result of two different issues; continuing from the previous thread, there are three issues to consider in total:

  1. In dbsync 13, certain pools started showing "Invalid URL" during the check from pool_offline_data. This was not reproducible in earlier versions and has been fixed upstream on master, but the fix is not yet part of a release. It affects a small number of pools, and the list of affected pools differs between instances. Expected Resolution: Once a release is tagged, instances can be upgraded to resolve the pool offline data.
  2. As @dostrelith678 mentioned, pool_list relies on pool_information_cache, which checks only the latest pool update (as expected). Due to issue 1, some pools currently show null metadata because they run into input-output-hk/cardano-db-sync#1270. Any endpoint that fetches the latest metadata similarly shows null (and the result differs between instances). However, if we are to look at a fallback for the pool_list/pool_metadata endpoints, it has to be done carefully, as we cannot lose private pools from the endpoint either (and thus cannot use a pod.json IS NOT NULL filter). Doing a fallback for all pools might not be the best way ahead from a performance point of view, but could be an interim solution if done only for public pools. Expected Resolution: Optimise the SQL queries where possible.
  3. [Not a factor in this list of examples] If a pool registers/updates its metadata, pool_offline_cache needs a while to be populated, and the delay can multiply if the pool starts with invalid metadata. Action: None. There is little we can do for such pools, but thankfully such occurrences seem to have gone down quite a bit in recent months.

As regards the endpoint for a JSON list of tickers, that's an easy addition, but it would be equivalent to https://api.koios.rest/api/v0/pool_list?ticker=in.("MAPLE","HIGH","TCATL","ZEBRA"). We can look at it once issue 2 is fixed. Note that some pools could still fail due to issue 1, and in future due to issue 3.
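
For anyone wanting to build that filter URL from a list of tickers, a minimal sketch could look like this. `ticker_filter_url` is a hypothetical helper. Upper-casing the input is an assumption that mirrors the example URL, so drop it if tickers need to match case-sensitively. Tickers containing double quotes or commas are not handled.

```python
from urllib.parse import urlencode

# Base URL taken from the example in this comment.
BASE_URL = "https://api.koios.rest/api/v0/pool_list"


def ticker_filter_url(tickers, base_url=BASE_URL):
    """Build a pool_list URL with a ticker=in.("A","B") filter."""
    # Assumption: tickers are stored upper-case, as in the example URL.
    quoted = ",".join('"{}"'.format(t.upper()) for t in tickers)
    # Keep parentheses, commas and quotes readable instead of percent-encoding.
    query = urlencode({"ticker": "in.({})".format(quoted)}, safe='(),"')
    return base_url + "?" + query


url = ticker_filter_url(["9000", "MAPLE", "hIgH"])
```

Passing the result to any HTTP client should then return only the matching pools, subject to the null-ticker caveats described in this comment.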

We'll be able to confirm some of the details with a forked dbsync version that contains the upstream fix and is being run by @reqlez (still ~40 epochs to go).

rdlrt commented 1 year ago

The above-mentioned tests from Boris' fork (cherry-picked dbsync commits) look promising: with the fixes made upstream to dbsync, all the mentioned pools now show up. We are now awaiting a dbsync release (or even a dedicated tag/branch on the IO repo).

rdlrt commented 1 year ago

Update - still waiting to hear back on marking next tag/release on dbsync

reqlez commented 1 year ago

> Update - still waiting to hear back on marking next tag/release on dbsync

I have not seen any additional commits into that 13-0-6 branch; I'm assuming they are getting some obscure release manager on it lol