brianleect / etherscan-labels

Full label data dump of top EVM chains in JSON/CSV.
MIT License

merge and update for etherscan #38

Closed c0mm4nd closed 11 months ago

c0mm4nd commented 1 year ago

Some changes:

Some haven't been uploaded:

brianleect commented 1 year ago

Thanks for the improvements @c0mm4nd

There's an ongoing fix I'm working on (https://github.com/brianleect/etherscan-labels/pull/37) for the label truncation of etherscan token names reported in https://github.com/brianleect/etherscan-labels/issues/34 .

So I think we will go with the merged data from the PR I did. I'm currently rescraping due to a small bug and I'll merge it soon.

As for your changes, I think the design/formatting fixes look good, but I'm not too sure about unifying the next-page behavior around clicking '>'. Incrementing by index seems more flexible to me, given that the site's styling might change?

And cookie saving would definitely be great if we can figure it out. Sadly I've not really found a solution for it; maybe I'll give it another shot when I have more time.

c0mm4nd commented 1 year ago

Reasons for choosing to click the > button:

  1. start=100 is not working on the etherscan token page
  2. clicking > changes the URL via JavaScript from start=0 to start=100, which means it does the same thing as incrementing the index
  3. fewer full-page refreshes, which significantly reduces detection by Cloudflare
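To illustrate point 2, here is a minimal sketch of what "incrementing the index" amounts to at the URL level. Only the `start` parameter and the 0 to 100 step come from the discussion above; the helper name `next_page_url`, the `size` parameter, and the example URL are assumptions made for illustration, not code from this PR.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

# Assumed page size, matching the start=0 -> start=100 step mentioned above.
PAGE_SIZE = 100


def next_page_url(url: str, page_size: int = PAGE_SIZE) -> str:
    """Return `url` with its `start` query parameter advanced by one page."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    # A missing `start` is treated as the first page (start=0).
    start = int(query.get("start", ["0"])[0])
    query["start"] = [str(start + page_size)]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


# Hypothetical label URL, used only for illustration.
url = "https://etherscan.io/accounts/label/example?size=100&start=0"
print(next_page_url(url))
```

Clicking > in the browser ends up at the same `start` value, but without a full page load.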

KNOWN BUG: It works for most label pages, but for a few, like beacon-depositor, the response is always delayed (due to the large size), so the content may be duplicated. SOLUTIONS:
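As a generic guard against this kind of duplication (a sketch only, not necessarily what this PR does), scraped pages can be checked against the previous page and merged by a unique key. The row shape, the `address` key, and both function names are assumptions for illustration.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def is_stale_page(previous: List[Row], current: List[Row], key: str = "address") -> bool:
    """True if `current` repeats `previous` exactly, i.e. the table had not refreshed yet."""
    return bool(current) and [r[key] for r in previous] == [r[key] for r in current]


def merge_pages(pages: Iterable[List[Row]], key: str = "address") -> List[Row]:
    """Concatenate pages in order, keeping only the first row seen for each key."""
    seen = set()
    merged: List[Row] = []
    for page in pages:
        for row in page:
            if row[key] not in seen:
                seen.add(row[key])
                merged.append(row)
    return merged


# Made-up rows: page1 is captured twice because the response lagged behind the click.
page1 = [{"address": "0xaaa", "label": "A"}, {"address": "0xbbb", "label": "B"}]
page2 = [{"address": "0xccc", "label": "C"}]
pages = [page1, page1, page2]

print(is_stale_page(page1, page1))
print(len(merge_pages(pages)))
```

A scraper could wait and re-read the table while `is_stale_page` reports a repeat, and fall back on `merge_pages` as a final safety net.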

c0mm4nd commented 11 months ago

https://metadata.etherscan.io/api-endpoint/address-metadata

Closing this PR, since it looks like this API can provide better results.