brianleect / etherscan-labels

Full label data dump of top EVM chains in JSON/CSV.
MIT License

Need specific Python version with compatible selenium and pandas #13

Closed ewagmig closed 1 year ago

ewagmig commented 1 year ago

When I run this project with Python 3.11, I hit a lot of compatibility issues with selenium and pandas. Could you specify the exact Python version this project requires? Thanks.
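When reporting a compatibility problem like this, it helps to include the exact interpreter and package versions. The following is a minimal sketch using only the standard library; the function name `installed_versions` is hypothetical and not part of this repository.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return the Python version plus the installed version of each package.

    A package that is not installed maps to None instead of raising.
    """
    versions = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print one "name: version" line per entry, e.g. for pasting into an issue.
    for name, version in installed_versions(["selenium", "pandas"]).items():
        print(f"{name}: {version}")
```

Pasting this output into an issue makes it easier to tell which release introduced a breaking change.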

brianleect commented 1 year ago

Hi matt, thanks for bringing up the issue.

It seems that in recent months Selenium had a major rework that broke one of the functions we use. Fixing it now.

https://stackoverflow.com/questions/72773206/selenium-python-attributeerror-webdriver-object-has-no-attribute-find-el
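For context, the linked question is about the `find_element_by_*` helpers, which newer Selenium 4 releases no longer provide; the replacement is the generic `find_element(by, value)` call. The sketch below illustrates that mapping and is not the fix actually committed to this repository. The names `find_element_compat` and `FakeDriver` are hypothetical, and the locator strings are assumed to match the values of Selenium's `By` constants (for example `By.ID == "id"`), so the snippet runs without Selenium or a browser.

```python
# Locator strategy strings, assumed equal to selenium's By constants
# (By.ID == "id", By.CSS_SELECTOR == "css selector", and so on).
LEGACY_TO_BY = {
    "find_element_by_id": "id",
    "find_element_by_name": "name",
    "find_element_by_xpath": "xpath",
    "find_element_by_css_selector": "css selector",
    "find_element_by_class_name": "class name",
    "find_element_by_tag_name": "tag name",
    "find_element_by_link_text": "link text",
    "find_element_by_partial_link_text": "partial link text",
}


def find_element_compat(driver, legacy_name, value):
    """Route a removed find_element_by_* call to driver.find_element(by, value)."""
    try:
        by = LEGACY_TO_BY[legacy_name]
    except KeyError:
        raise ValueError(f"unknown legacy locator: {legacy_name!r}") from None
    return driver.find_element(by, value)


class FakeDriver:
    """Hypothetical stand-in for a WebDriver; it just echoes its arguments."""

    def find_element(self, by, value):
        return (by, value)
```

In real scraper code it is simpler to import `By` from `selenium.webdriver.common.by` and call `driver.find_element(By.ID, "main")` directly; the table above only shows which strategy each old helper corresponds to.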

ewagmig commented 1 year ago

Thanks @brianleect, yes, I found that a Selenium update broke backwards compatibility. By the way, is there a version of the main Python script that has been updated to match the current etherscan label cloud webpage?

brianleect commented 1 year ago

No update is required for the label cloud webpage. I just tested the scraper and it still works fine for gathering labels from the label cloud. However, the label address scraping does need a fix, because the way addresses are formatted just changed.

brianleect commented 1 year ago

@ewagmig The latest version on main should have fixed it. I also made some modifications to remove the need for manual webdriver installation.

ewagmig commented 1 year ago

Thanks @brianleect, but not all the labels on the webpage are collected. For example, at https://etherscan.io/accounts/label/0x-protocol?subcatid=undefined only the addresses in the table with id "main" are scraped; the addresses in the other 2 tables are not scraped at all. That is only 26 of 59 addresses in this case. The base URL needs to be updated to handle this scenario.

brianleect commented 1 year ago

@ewagmig The link you sent does not seem to display any labels for me when I click into Others or Legacy.

I'm only able to view labels found in Main

ewagmig commented 1 year ago

Yes, it seems that the data provider, i.e. etherscan.io, has recently updated its display strategy. I think the link now takes a different subcatid value for each table, e.g. https://etherscan.io/accounts/label/0x-protocol?subcatid=3-0.
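A minimal sketch of building one URL per table, assuming each table is selected by its own `subcatid` query parameter as suggested above. Only the value `3-0` (and `undefined`) appears in this thread; any other subcatid values would have to be read from the page itself, and the function names here are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://etherscan.io/accounts/label"


def label_table_url(label, subcatid):
    """Build the URL for one table of a label page, selected by subcatid."""
    return f"{BASE_URL}/{label}?{urlencode({'subcatid': subcatid})}"


def label_table_urls(label, subcatids):
    """Build one URL per subcatid so that every table can be scraped in turn."""
    return [label_table_url(label, subcatid) for subcatid in subcatids]


if __name__ == "__main__":
    # "3-0" is the only concrete subcatid value mentioned in this thread.
    print(label_table_url("0x-protocol", "3-0"))
```

A scraper could then loop over these URLs and merge the addresses from each table, rather than reading only the table shown by default.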

brianleect commented 1 year ago

@ewagmig Latest push should have fixed it. Uploaded latest labels as well.

brianleect commented 1 year ago

Closing as it should be resolved