Database suggestions

juls858 commented 1 year ago

I am an engineer at Flipside Crypto. We are investigating using this indexer, or the NEAR Lake indexer, as a source of NEAR data. The well-modeled schema is a perfect basis for our analytics needs. We currently use a tool called HEVO, similar to FiveTran, to sync from Postgres to Snowflake. However, the lack of auto-incrementing ids or of any inserted/updated timestamp columns on each row makes efficient syncing problematic, because without such a column there is no cheap way to select only the rows added since the last sync. This is particularly true for the larger tables.

Suggestions:

add auto-incrementing ids to all tables, or provide updated/inserted timestamps on all tables
add indexes to those columns
provide a public backup of the database so that it can be restored and the whole chain won't have to be reindexed (maybe in S3, similar to how NEAR Lake data is made publicly available)
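To illustrate why the first two suggestions matter for syncing, here is a minimal sketch of a watermark-based incremental sync. It uses an in-memory SQLite database as a stand-in for Postgres, and the `receipts` table, its `id` column, and the `sync_batch` helper are all hypothetical names for illustration, not part of the indexer's actual schema or of HEVO.

```python
import sqlite3

# Hypothetical source table with an auto-incrementing id
# (an assumption for illustration, not the real indexer schema).
src = sqlite3.connect(":memory:")
src.execute(
    "CREATE TABLE receipts ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "receipt_id TEXT UNIQUE)"
)
src.executemany(
    "INSERT INTO receipts (receipt_id) VALUES (?)",
    [("r1",), ("r2",), ("r3",)],
)


def sync_batch(conn, last_seen_id, batch_size=2):
    """Fetch rows added after last_seen_id; return (rows, new watermark)."""
    rows = conn.execute(
        "SELECT id, receipt_id FROM receipts "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, batch_size),
    ).fetchall()
    # The watermark only advances when new rows were actually read.
    new_watermark = rows[-1][0] if rows else last_seen_id
    return rows, new_watermark


# Pull batches until the source has nothing newer than the watermark.
watermark = 0
synced = []
while True:
    rows, watermark = sync_batch(src, watermark)
    if not rows:
        break
    synced.extend(rows)
```

With an index on the watermark column, each batch is a cheap range scan. Without such a column, a sync tool generally has to rescan or diff the whole table, which is what hurts on the larger tables.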

khorolets commented 1 year ago

Hey there!

provide a public backup of the database so that it can be restored and the whole chain won't have to be reindexed (maybe in S3, similar to how NEAR Lake data is made publicly available)

I am sorry, but because of the size of the database we have no plans to provide publicly available Postgres backups, at least not in the near future.

add auto-incrementing ids to all tables or provide updated/inserted timestamps to all tables
add indexes to such columns

I would ask @telezhnaya and @pkudinov to answer here.

telezhnaya commented 1 year ago

Hey @juls858! The nature of blockchain data allows us to use natural PKs for most of the tables. As a bonus, Postgres keeps the state consistent, ignores duplicates, etc. This is a crucial feature for us.

If you want such columns, I'm afraid we can only suggest that you fork the implementation and add the columns you find useful. Keep in mind that you will need to avoid duplicates somehow. Also keep in mind that Snowflake does not support the big integers that are required for storing amount values.
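As a rough sketch of the two points above (natural keys that make replays harmless, and amounts that are too wide for common numeric types), the example below uses in-memory SQLite in place of Postgres. The `blocks` table, its columns, and `insert_block` are hypothetical, and it assumes amounts are 128-bit unsigned integers. In Postgres, the equivalent of `INSERT OR IGNORE` would be `INSERT ... ON CONFLICT DO NOTHING`. This is not necessarily how the indexer itself writes rows.

```python
import sqlite3

# Largest value of a 128-bit unsigned amount (assumed width for amounts).
U128_MAX = 2**128 - 1

conn = sqlite3.connect(":memory:")
# Hypothetical table: the block hash is the natural primary key, and the
# amount is kept as decimal text so that no precision is lost.
conn.execute(
    "CREATE TABLE blocks ("
    "block_hash TEXT PRIMARY KEY, "
    "total_supply TEXT NOT NULL)"
)


def insert_block(block_hash, total_supply):
    """Insert a block; a row with the same natural key is ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO blocks (block_hash, total_supply) "
        "VALUES (?, ?)",
        (block_hash, str(total_supply)),
    )


insert_block("hashA", U128_MAX)
insert_block("hashA", U128_MAX)  # replayed write: silently skipped
insert_block("hashB", 1)

row_count = conn.execute("SELECT COUNT(*) FROM blocks").fetchone()[0]
stored = int(
    conn.execute(
        "SELECT total_supply FROM blocks WHERE block_hash = ?", ("hashA",)
    ).fetchone()[0]
)
# Number of decimal digits needed for the largest 128-bit amount.
digits = len(str(U128_MAX))
```

Here `digits` comes out as 39, one more than the 38 digits of precision of Snowflake's `NUMBER` type, so a fork that syncs amounts would likely need to carry them as strings or split them into parts. A fork that adds a surrogate id would also need to keep a unique constraint on the natural key to retain the duplicate protection.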

near / near-indexer-for-explorer

Database suggestions #324