Closed huangsam closed 5 years ago
New output:

    Extract urls...
    Currently checking: file=19-nginx.markdown
    Check urls...
    Currently checking: id=3057 host=www.airport-parking-shop.co.uk
    Bad url status: {
        "http://www.codecademy.com/tracks/python": 404,
        "http://sircmpwn.github.io/2017/01/13/The-problem-with-Python-3.html": 404,
        "http://pythonforengineers.com/there-isnt-really-a-python-2-vs-python-3-problem/": 404,
        "http://ryanfrantz.com/posts/solving-monitoring/": 404,
        "http://hginit.com/": -1,
        "https://devup.co/a-look-at-devops-tools-landscape-7220099c6b81": -1,
        "https://blogs.msdn.microsoft.com/commandline/2018/06/20/windows-command-line-backgrounder/": 504,
        "https://github.com/rosarior/awesome-django": 404,
        "https://testdriven.io/part-one-flask-blueprints": 404,
        "https://javascript-minifier.com/": 504,
        "https://cssminifier.com/": 504,
        "https://cloud.google.com/appengine/docs/python/sms/twilio": 504,
        "http://www.django-rest-framework.org/topics/3.0-announcement/": 404,
        "http://www.diveintopython3.net/unit-testing.html": -1,
        "http://blog.fogcreek.com/working-effectively-with-unit-tests-interview-with-jay-fields/": 404,
        "https://bokeh.github.io/blog/2017/7/24/styling-bokeh/": 404,
        "https://bokeh.github.io/blog/2017/7/5/idiomatic_bokeh/": 404,
        "https://tomaugspurger.github.io/modern-1.html": 404,
        "http://pritishc.com/blog/2015/09/06/uploading-with-django-and-amazon-s3/": 404,
        "http://www.opsschool.org/en/latest/": 404,
        "https://devup.co/serverless-computing-if-there-is-no-server-where-does-my-application-run-a369c3699730": -1,
        "http://engineering.simondata.com/can-we-use-jenkins-for-that": 404,
        "https://linuxacademy.com/howtoguides/posts/show/topic/13750-managing-docker-containers-with-ansible": 404,
        "http://pritishc.com/blog/2015/09/03/docker-is-awesome/": 404,
        "http://pritishc.com/blog/2015/09/04/docker-is-awesome-part-ii/": 404,
        "http://stridercd.com/": -1
    }
    Bad url locations: {
        "http://www.codecademy.com/tracks/python": [ "01-introduction/08-best-python-resources.markdown" ],
        "http://sircmpwn.github.io/2017/01/13/The-problem-with-Python-3.html": [ "01-introduction/04-python-2-or-3.markdown" ],
        "http://pythonforengineers.com/there-isnt-really-a-python-2-vs-python-3-problem/": [ "01-introduction/04-python-2-or-3.markdown" ],
        "http://ryanfrantz.com/posts/solving-monitoring/": [ "06-devops/01-monitoring.markdown" ],
        "http://hginit.com/": [ "02-development-environments/20-mercurial.markdown" ],
        "https://devup.co/a-look-at-devops-tools-landscape-7220099c6b81": [ "06-devops/00-devops.markdown" ],
        "https://blogs.msdn.microsoft.com/commandline/2018/06/20/windows-command-line-backgrounder/": [ "02-development-environments/10-powershell.markdown" ],
        "https://github.com/rosarior/awesome-django": [ "04-web-development/02-django.markdown" ],
        "https://testdriven.io/part-one-flask-blueprints": [ "04-web-development/03-flask.markdown" ],
        "https://javascript-minifier.com/": [ "04-web-development/19-minification.markdown" ],
        "https://cssminifier.com/": [ "04-web-development/19-minification.markdown" ],
        "https://cloud.google.com/appengine/docs/python/sms/twilio": [ "04-web-development/52-twilio.markdown" ],
        "http://www.django-rest-framework.org/topics/3.0-announcement/": [ "04-web-development/48-api-creation.markdown" ],
        "http://www.diveintopython3.net/unit-testing.html": [ "04-web-development/36-unit-testing.markdown" ],
        "http://blog.fogcreek.com/working-effectively-with-unit-tests-interview-with-jay-fields/": [ "04-web-development/36-unit-testing.markdown" ],
        "https://bokeh.github.io/blog/2017/7/24/styling-bokeh/": [ "03-data/19-bokeh.markdown" ],
        "https://bokeh.github.io/blog/2017/7/5/idiomatic_bokeh/": [ "03-data/19-bokeh.markdown" ],
        "https://tomaugspurger.github.io/modern-1.html": [ "03-data/16-pandas.markdown" ],
        "http://pritishc.com/blog/2015/09/06/uploading-with-django-and-amazon-s3/": [ "05-deployment/03-static-content.markdown" ],
        "http://www.opsschool.org/en/latest/": [ "05-deployment/13-operating-systems.markdown" ],
        "https://devup.co/serverless-computing-if-there-is-no-server-where-does-my-application-run-a369c3699730": [ "05-deployment/38-serverless.markdown" ],
        "http://engineering.simondata.com/can-we-use-jenkins-for-that": [ "05-deployment/28-jenkins.markdown" ],
        "https://linuxacademy.com/howtoguides/posts/show/topic/13750-managing-docker-containers-with-ansible": [ "05-deployment/33-ansible.markdown" ],
        "http://pritishc.com/blog/2015/09/03/docker-is-awesome/": [ "05-deployment/36-docker.markdown" ],
        "http://pritishc.com/blog/2015/09/04/docker-is-awesome-part-ii/": [ "05-deployment/36-docker.markdown" ],
        "http://stridercd.com/": [ "05-deployment/27-continuous-integration.markdown" ]
    }

The new "Bad url locations" output maps each bad URL to the content files that contain it. Because a bad URL can appear in more than one file, each URL's locations are reported as a Python list rather than a single string.
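
A minimal sketch of how such a URL-to-files mapping could be built is shown below. This is not the code from this PR: the function name `build_bad_url_locations`, the `file_urls` input shape, the `https://www.python.org/` entry, and the second reference to `http://hginit.com/` are all assumptions made for illustration.

```python
import json
from collections import defaultdict


def build_bad_url_locations(file_urls, bad_url_status):
    """Map each bad URL to the sorted list of files that reference it.

    file_urls: dict of filename -> list of URLs found in that file (assumed shape).
    bad_url_status: dict of bad URL -> status code, as in "Bad url status".
    """
    locations = defaultdict(list)
    for filename, urls in file_urls.items():
        for url in urls:
            # Only track URLs that failed the check; skip repeats within one file.
            if url in bad_url_status and filename not in locations[url]:
                locations[url].append(filename)
    # A list per URL, because one bad URL may appear in several files.
    return {url: sorted(files) for url, files in locations.items()}


# Illustrative input only; the second hginit.com reference is hypothetical.
file_urls = {
    "02-development-environments/20-mercurial.markdown": ["http://hginit.com/"],
    "05-deployment/36-docker.markdown": [
        "http://pritishc.com/blog/2015/09/03/docker-is-awesome/",
        "http://hginit.com/",  # hypothetical duplicate reference
        "https://www.python.org/",  # hypothetical good URL, should be ignored
    ],
}
bad_url_status = {
    "http://hginit.com/": -1,
    "http://pritishc.com/blog/2015/09/03/docker-is-awesome/": 404,
}

bad_url_locations = build_bad_url_locations(file_urls, bad_url_status)
print("Bad url locations:", json.dumps(bad_url_locations, indent=4))
```

Using a list even when a URL appears in only one file keeps the output shape uniform, so anything consuming it does not need to special-case single locations.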
awesome thank you @huangsam! lemme get rid of these bad urls as well