neume-network / crawler

music NFT crawler
https://neume.network

Discussion about this alternative crawler #1

Closed il3ven closed 1 year ago

il3ven commented 1 year ago

I have created a proof-of-concept crawler to explore a better architecture for neume. Currently, it only crawls sound-protocol.

Note: This crawler is not perfect and cuts corners, but I hope we can incorporate some of these ideas into neume.

What are the changes?

Use an imperative programming paradigm

This means calling functions like callTokenuri directly. Instead of posting extraction-worker messages, we now use async/await to send them. We are still using extraction-worker and getting the same benefits, such as rate limiting and timeouts.

We won't have to deal with worker threads or memory leaks in lifecycle, and onboarding developers will be easier. The existing architecture has proven a little difficult to explain.

We still have concurrency by using p-map.

https://github.com/il3ven/music-nft-crawler-poc/blob/3e35181da889387733d26b1c918f3eb726e67d6e/index.mjs#L85-L95
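To illustrate the idea for readers who do not follow the link, here is a minimal, dependency-free sketch of the imperative style with bounded concurrency. It is not the code from the POC. `mapWithConcurrency` is a hypothetical stand-in that imitates what `pMap(items, mapper, { concurrency })` from p-map provides, `withTimeout` is a hypothetical helper, and `fetchTokenUri` is a made-up placeholder for a call such as callTokenuri. In the POC, rate limiting and timeouts come from extraction-worker, not from helpers like these.

```javascript
// Hypothetical stand-in for p-map: runs `mapper` over `items` with at most
// `concurrency` calls in flight and returns results in input order.
async function mapWithConcurrency(items, mapper, { concurrency = 4 } = {}) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to pick up

  // Each worker pulls the next unclaimed index until none are left.
  async function worker() {
    while (next < items.length) {
      const index = next++;
      results[index] = await mapper(items[index], index);
    }
  }

  const workerCount = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Hypothetical helper: rejects if `promise` does not settle within `ms`.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Placeholder for a real call such as callTokenuri (assumption: the real
// function is async and resolves to the token's URI).
async function fetchTokenUri(tokenId) {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `ipfs://example/${tokenId}`;
}

// The crawl step reads top to bottom: map over the IDs, await the results.
async function crawl(tokenIds) {
  return mapWithConcurrency(
    tokenIds,
    (id) => withTimeout(fetchTokenUri(id), 1000),
    { concurrency: 2 }
  );
}
```

Calling `await crawl([1, 2, 3])` resolves to the URIs in the same order as the input IDs, with no more than two requests in flight at a time. The control flow stays in one function, which is the main point of the imperative approach.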

Use sqlite as the database

Flat files fail us when we need to find data or update it. A database of some kind is necessary for neume. For this POC, I have used sqlite as a key-value database where the key is an ID and the value is JSON data. If we need to share our data using IPFS, we still can, because the data is essentially just JSON.

Miscellaneous