hartator / wayback-machine-downloader

Download an entire website from the Wayback Machine.

index.html ignored #282

Open gbizzotto opened 8 months ago

gbizzotto commented 8 months ago

Some index.html files were ignored by the tool, probably because archive.org does not list them among the URLs it reports under that prefix.
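One way to check this hypothesis is to compare a page URL against the URLs that archive.org lists for its prefix. The sketch below is a minimal illustration, not the tool's own code: `build_prefix_query`, `normalize`, and `is_listed` are hypothetical helper names, and the query assumes the public Wayback CDX endpoint with its `url`, `matchType`, `output`, and `fl` parameters. It does no network I/O; the listing is passed in as a plain list.

```python
from urllib.parse import urlencode

# Public Wayback CDX endpoint (assumed here as the source of the URL listing).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_prefix_query(prefix: str) -> str:
    """Build a CDX query URL listing every capture whose URL starts with prefix."""
    params = {
        "url": prefix,
        "matchType": "prefix",
        "output": "json",
        "fl": "original,timestamp",
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def normalize(url: str) -> str:
    """Drop the scheme, a trailing slash, and a trailing /index.html for comparison."""
    stripped = url.split("://", 1)[-1].rstrip("/")
    suffix = "/index.html"
    if stripped.endswith(suffix):
        stripped = stripped[: -len(suffix)]
    return stripped


def is_listed(page_url: str, listed_urls: list[str]) -> bool:
    """Return True if page_url appears in the listing, ignoring slash/index.html variants."""
    target = normalize(page_url)
    return any(normalize(u) == target for u in listed_urls)


if __name__ == "__main__":
    page = "http://cpp.drgibbs.com.br/cursos/1-c-para-iniciantes/02-funcoes"
    # Hypothetical listing for illustration only; real data would come from the CDX query.
    listed = [page + "/exemplo.cpp"]
    print(build_prefix_query(page))
    print("page itself listed:", is_listed(page, listed))
```

If `is_listed` returns False for a page that can still be opened through a snapshot URL, that would be consistent with the page being archived but missing from the prefix listing.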

Example:

There is an index.html here https://web.archive.org/web/20160909165403/http://cpp.drgibbs.com.br/cursos/1-c-para-iniciantes/02-funcoes

But, URLs reported by https://web.archive.org/web/*/http://cpp.drgibbs.com.br/cursos/1-c-para-iniciantes/02-funcoes*