Open ghost opened 7 years ago
Thanks! And that's a great point. I'm leaning toward shifting waybackpack
's away from filesystem storage and toward database storage. (Relevant comment here.) With the latter, there would be no need to convert URLs into filepaths.
Re. the code, the relevant bit is this line, which uses Python's built-in os.path.split
and urlparse
methods to extract the filename: https://github.com/jsvine/waybackpack/blob/master/waybackpack/pack.py#L42
Currently if you try to save a resource with the template of:
www.site.com/news?385
It will save it as merely "news" instead of something like news@385 (like what wget does).
I looked through the code and couldn't find the part that is handling the url query, but if one is saving a large amounts of files in that format, it becomes less userfriendly to simply have a thousand files labeled "news".
Awesome program by the way.