holaplex / indexer

Index Solana data using a Geyser plugin (downstream service cluster)
https://holaplex.com
GNU Affero General Public License v3.0
162 stars 38 forks source link

Research The Feasibility of InfluxDB for Tracking Volumen and Floor for Collections #720

Closed kespinola closed 2 years ago

kespinola commented 2 years ago

Goal

Determine the feasibility of using Influxdb for determining voluming and floor prices for collections.

kespinola commented 2 years ago

There was no clear both in merging the results from influx with postgres at this time. For example, taking the top volume stats from influx and sorting collection results from postgres.

At this time we will not be pursuing influx.

@imabdulbasit not urgent but can you comment with ingestion methods and other key features you noted from influxdb? Paste any reference links. Thanks.

imabdulbasit commented 2 years ago

We have to execute expensive postgres queries to insert volume of a collection into influx db every time we get a new purchase or a cron job to insert all the collections volumes which is the reason why we were exploring influxdb option. If we want to insert all the purchases data into influxdb and then calculate volumes from influxdb instead of postgres we can but that wouldn't be very helpful the only benefit would be to execute query to calculate volume on influxdb.

Which is why I thought to use the SQL plugin and task scheduler both of which are builtin plugins in influxdb. The plan was to execute Flux script to execute Collection Volume SQL query on postgres and insert them into influxdb every 5-10 minutes. This way we can display volume/time or floor/time charts and also get the volume data that we need but the problem I found was that Flux script doesn't support insertion from the SQL.

Hence, I think that the good option would be just to create volume cache table and update data through a cron job.

imabdulbasit commented 2 years ago

If we need to chart volume/time or floor/time then the influxdb would be only option. We would need to insert the purchases/listings data into influxdb like we are doing for postgres. Influxdb has almost all the operators that postgres has so we can execute queries on purchases bucket to get volume/time chart. I can work on importing the data from postgres purchases table to influxdb and see how fast the queries load but the data needs to transformed to fit influxdb requirements