DistriNet / tranco-python-package

Python package to access the Tranco list
MIT License
21 stars 9 forks

def rank(self, domain) implementation is pretty slow #7

Closed jconwell closed 12 months ago

jconwell commented 2 years ago

The rank() function uses list.index() to get the rank of any passed-in domain. This causes a linear scan through the list until it finds the domain, so each lookup is O(n) in the length of the list, which can take a long time when looking up ranks for a large set of domains.

Instead, you might build a dict[domain:rank] once to hold the domain ranks. This way each rank lookup is a simple hash table lookup, which is O(1) on average.
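
A minimal sketch of the idea, for illustration only. The class name `RankedList`, the method `rank_linear`, and the `-1` return value for unknown domains are assumptions made for this example and are not taken from the package's source:

```python
class RankedList:
    """Hypothetical stand-in for a ranked domain list (not the package's class)."""

    def __init__(self, domains):
        self.domains = list(domains)
        # Build the domain -> rank mapping once, up front. Ranks start at 1.
        # setdefault keeps the first (best) rank if a domain appears twice,
        # which matches what list.index() would return.
        self._ranks = {}
        for position, domain in enumerate(self.domains, start=1):
            self._ranks.setdefault(domain, position)

    def rank_linear(self, domain):
        # Approach described in the issue: list.index() scans the list
        # from the start on every call, so each lookup is O(n).
        try:
            return self.domains.index(domain) + 1
        except ValueError:
            return -1  # assumed sentinel for "not in the list"

    def rank(self, domain):
        # Proposed approach: a single hash table lookup, O(1) on average.
        return self._ranks.get(domain, -1)


ranked = RankedList(["google.com", "youtube.com", "facebook.com"])
print(ranked.rank("youtube.com"))
print(ranked.rank("not-in-the-list.example"))
```

The trade-off is extra memory for the dict and a one-time O(n) build cost, which pays off as soon as more than a handful of domains are looked up.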

If you want I can create a PR for you to do this.

VictorLeP commented 2 years ago

> If you want I can create a PR for you to do this.

That would be great, thanks!

jconwell commented 2 years ago

I sent the PR. Hope this helps.

VictorLeP commented 12 months ago

Improved in 675a86d