tomnomnom / waybackurls

Fetch all the URLs that the Wayback Machine knows about for a domain

Added a bunch of things #2

Closed: anshumanbh closed this pull request 6 years ago

anshumanbh commented 6 years ago
tomnomnom commented 6 years ago

Hey @anshumanbh! Thanks for the PR :)

I'm afraid I can't merge it at the moment as it breaks my own workflow! Reading domains on stdin and outputting them on stdout was a very deliberate design choice on my part; it allows the tool to be easily used in pipelines; e.g.

cat domains | waybackurls | tee -a urls | grep robots.txt | concurl -c 20 -- -vk

I don't think the targetFile and result flags add anything over, for example, cat targetFile | waybackurls >> result, but they do increase code complexity, and remove the possibility of use cases like that above.
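For reference, the stdin-to-stdout plumbing being defended here is only a handful of lines of Go. This is a sketch of the pattern, not the actual code in the repo:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	// One domain per line on stdin; results go straight to stdout,
	// so the tool composes with cat, tee, grep, and friends.
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		domain := sc.Text()
		// In the real tool this is where the Wayback Machine lookup
		// would happen; here we just echo to show the plumbing.
		fmt.Println(domain)
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading stdin:", err)
	}
}
```

With this shape, cat targetFile | waybackurls >> result covers the file-in, file-out case for free, with no extra flags to maintain.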

I also think changing the single-domain use-case from waybackurls <domain> to waybackurls -target <domain> only makes things harder for the user.

The last thing that makes me a little nervous (not that I don't trust you! :D) is having the README recommend running the tool in a container based on an image that I, the maintainer, have no control over.

I think adding the retries is a worthwhile endeavour; I'd be happy to merge that portion :)
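As a rough idea of the retry portion (a sketch only; the function name, attempt count, and backoff are illustrative and not the PR's actual code):

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"time"
)

// fetchWithRetry retries a GET a few times with a growing pause before
// giving up, returning the last error if every attempt fails.
func fetchWithRetry(url string, attempts int) ([]byte, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		resp, err := http.Get(url)
		if err != nil {
			lastErr = err
			time.Sleep(time.Duration(i+1) * time.Second)
			continue
		}
		body, err := ioutil.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			lastErr = err
			continue
		}
		return body, nil
	}
	return nil, lastErr
}

func main() {
	// Placeholder URL; the real tool would hit the Wayback Machine API here.
	body, err := fetchWithRetry("https://example.com/", 3)
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetch failed:", err)
		return
	}
	fmt.Printf("fetched %d bytes\n", len(body))
}
```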

Sorry for the overly-critical response; I hope you're not too dismayed. I really do value contributions, but I can't bring myself to accept them when they depart from the original design goals of the tool.

anshumanbh commented 6 years ago

Yup, understandable. No worries!