DistriNet / tranco-python-package

Python package to access the Tranco list
MIT License

list() in tranco.py returns empty list since 2019-07-17 #2

Closed. stevenyu530 closed 5 years ago

stevenyu530 commented 5 years ago
from tranco import Tranco
t = Tranco(cache=True, cache_dir='.tranco')
latest_list = t.list()

latest_list in the above code is an empty list ([]). This has been happening since 2019-07-17. I tried the following dates, and both returned an empty list:
date_list = t.list(date='2019-07-17')
date_list = t.list(date='2019-07-18')

As of 2019-07-18, the file with ID 3NLL (tranco_3NLL.csv) is empty, whereas top-1m.csv from the same date looks normal.

Has the file format changed recently?

VictorLeP commented 5 years ago

Thanks for the report! The list generation process crashed on July 17, meaning the two lists you refer to were not generated at the time. I've restarted the process and the lists are now available (with new IDs however). I've also made additional changes to reduce the likelihood of this occurring again in the future.
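For readers who hit a similar gap before a list is regenerated, one possible workaround is to walk backwards through earlier dates until a non-empty list turns up. The following is only a minimal sketch, not part of the package: `latest_non_empty`, `fetch`, and `max_days_back` are hypothetical names, and it assumes that a failed generation shows up as an empty list, as described in the report above.

```python
from datetime import date, timedelta
from typing import Callable, List, Optional, Tuple


def latest_non_empty(
    fetch: Callable[[str], List[str]],
    start: date,
    max_days_back: int = 7,
) -> Optional[Tuple[str, List[str]]]:
    """Walk backwards from `start`; return (date, list) for the first non-empty list.

    `fetch` is a hypothetical callable that takes a 'YYYY-MM-DD' string and
    returns the list for that date (empty if generation failed).
    """
    for offset in range(max_days_back + 1):
        day = (start - timedelta(days=offset)).isoformat()
        domains = fetch(day)
        if domains:  # an empty list is treated as a failed generation run
            return day, domains
    return None  # nothing usable within the search window
```

With the snippet from the report, `fetch` could be something like `lambda d: t.list(date=d)`, assuming that call returns a plain list as described there.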

stevenyu530 commented 5 years ago

Thanks @VictorLeP!

Can I ask what the difference is between the daily file with a file ID and the top-1m.csv.zip downloaded from https://tranco-list.eu/top-1m.csv.zip?

I'm asking because I noticed that although the file for 2019-07-18 is empty, top-1m.csv.zip from that day still works.

Is top-1m.csv.zip the most recent successful output?

Thanks, Steven

VictorLeP commented 5 years ago

top-1m.csv.zip is indeed the most recent successfully generated list, and should therefore be the same as the daily file with its ID unless issues like this arise. You can check when top-1m.csv.zip was generated through the Last-Modified HTTP header.
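The Last-Modified check can be sketched with the Python standard library alone. This is a rough illustration, not something the package provides: the helper names `parse_last_modified` and `fetch_last_modified` are made up here, and it assumes the server answers HEAD requests for this URL and includes a Last-Modified header.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional
from urllib.request import Request, urlopen

TOP_1M_URL = "https://tranco-list.eu/top-1m.csv.zip"  # URL quoted earlier in the thread


def parse_last_modified(header_value: str) -> datetime:
    """Convert an HTTP Last-Modified value into a timezone-aware UTC datetime."""
    parsed = parsedate_to_datetime(header_value)
    if parsed.tzinfo is None:  # treat a zone-less value as UTC
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def fetch_last_modified(url: str = TOP_1M_URL, timeout: float = 10.0) -> Optional[datetime]:
    """Send a HEAD request and return the Last-Modified time, or None if absent."""
    request = Request(url, method="HEAD")  # assumption: the server supports HEAD
    with urlopen(request, timeout=timeout) as response:
        header_value = response.headers.get("Last-Modified")
    return parse_last_modified(header_value) if header_value else None
```

Calling `fetch_last_modified()` and comparing the result with the current date would show whether top-1m.csv.zip comes from that day's run or from an earlier successful one.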

stevenyu530 commented 5 years ago

Thanks @VictorLeP, your list is very helpful!

About the list service: do you intend to keep generating the list in the long term? Also, about the Python package: how long are you planning to maintain the package?

Thanks

VictorLeP commented 5 years ago

There are currently no plans to stop generating or maintaining it, though it's hard to tell what the future will bring, of course :) We certainly envisage the list being generated in the long term. The source code behind the list generation is publicly available (https://github.com/DistriNet/tranco-list), so even in the absence of the online service, it's still possible to generate it yourself.