Closed: johncoleman83 closed this 5 years ago
I can try to do something.
Thanks @edikxl, let me know if you need help. The docs for each script are pretty good, though.
You can separate this into different PRs. The first step is easy; the second step will require more work, and you can either do that separately or leave it for someone else.
OK
I'll take a crack at the second step. :+1:
Hi @mrvnmchm, if you still want a crack at this, I just merged @edikxl's updates and am going to clean them up a bit. There is still room for some major reorganizing: the system works fine, it just could be organized better and split into functions, as we have been discussing.
@edikxl, feel free to add your name to the README.md, along with any usage details you think will help. I did add a large comment with some usage info to the file you made, though.
Thanks @johncoleman83, I was waiting for that merge.
Thanks @mrvnmchm, I started modularizing the main app, but stopped after building the modules for the error check and storage write.
Completed building the modules and forming the arguments and help. Working on execution and testing now. Here's the help so far:
(domainScraper-cQAT3a2w) mrvnmchm@M3-Q-X70-A:/some_folder/domain_scraper$ python domain_scraper.py -h
usage: domain_scraper [-h] [--check [CHECK]] [--extract [EXTRACT]]
                      [--scrape [SCRAPE]] [--scrape-n [SCRAPE_N]]
                      [--all [ALL]]
                      [input_file]

Scrapes domains from one input URL or from a file list of domains for broken
links, valid emails, and valid social media links.

positional arguments:
  input_file            Indicate the input file to scrape.

optional arguments:
  -h, --help            show this help message and exit
  --check [CHECK]       Find broken links from urls in file.
  --extract [EXTRACT]   Extract name from emails in file.
  --scrape [SCRAPE]     Scrape emails and social media urls from file.
  --scrape-n [SCRAPE_N]
                        Scrape emails and social media urls from file with new
                        links.
  --all [ALL]           Perform all actions on urls from file with links.
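For anyone following along, a help screen like the one above can be produced with Python's standard `argparse` module. The sketch below is only an illustration and is not the code from the PR: the `build_parser` name and the `const=True` choice (the value stored when a flag is given without an argument) are assumptions, while the option names and help strings are copied from the help output.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the help output shown in this thread."""
    parser = argparse.ArgumentParser(
        prog="domain_scraper",
        description=(
            "Scrapes domains from one input URL or from a file list of "
            "domains for broken links, valid emails, and valid social "
            "media links."
        ),
    )
    # nargs="?" makes each option's value optional, which is what produces
    # the "[--check [CHECK]]" form in the usage line.
    # const=True is an assumption: it is stored when the flag has no value.
    options = [
        ("--check", "Find broken links from urls in file."),
        ("--extract", "Extract name from emails in file."),
        ("--scrape", "Scrape emails and social media urls from file."),
        ("--scrape-n",
         "Scrape emails and social media urls from file with new links."),
        ("--all", "Perform all actions on urls from file with links."),
    ]
    for flag, help_text in options:
        parser.add_argument(flag, nargs="?", const=True, help=help_text)
    # Optional positional argument, shown as "[input_file]" in the usage line.
    parser.add_argument(
        "input_file", nargs="?", help="Indicate the input file to scrape."
    )
    return parser
```

With this setup, `build_parser().parse_args(["--check", "urls.txt"])` stores the file name on `args.check`, while `parse_args(["urls.txt", "--scrape-n"])` stores it on `args.input_file` and sets `args.scrape_n` to `True`. How the real entry point treats those two cases is not stated in this thread.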
Looks super nice. Don't forget to pull the latest master; I fixed a slight bug in the original entry point file in the latest commit.
@johncoleman83, PR #5 is ready for review. Thanks for letting me help, and let me know if there is anything I need to change.
This was completed by @mrvnmchm.
All these scripts have shared functions. Can we create one entry point for all the scripts?