heywoodlh / urlscan-py

Python wrapper for urlscan.io's API
Apache License 2.0

Add Output to .txt file #3

ZeroDot1 closed this issue 6 years ago

ZeroDot1 commented 6 years ago

If multiple URLs are scanned at the same time, there is the problem that you cannot always see all of the output in the terminal once a certain number of lines is exceeded. You could solve this by writing the terminal output to a file at the same time. Example command:

$ python urlscan.py scan -f '/home/user/s/C/urls.txt' -txt

Example file:

{
  "message": "Submission successful",
  "uuid": "af4b3c59-6ad0-4414-b1b4-b29825bbfef5",
  "result": "https://urlscan.io/result/af4b3c59-6ad0-4414-b1b4-b29825bbfef5/",
  "api": "https://urlscan.io/api/v1/result/af4b3c59-6ad0-4414-b1b4-b29825bbfef5/",
  "visibility": "private",
  "options": {
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
  }
}
{
  "message": "Submission successful",
  "uuid": "62b7e262-18e7-4d92-8e40-061b71a794f1",
  "result": "https://urlscan.io/result/62b7e262-18e7-4d92-8e40-061b71a794f1/",
  "api": "https://urlscan.io/api/v1/result/62b7e262-18e7-4d92-8e40-061b71a794f1/",
  "visibility": "private",
  "options": {
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
  }
}
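
Since each entry in such a file would be a complete JSON object, a stream-aware tool like jq could pull individual fields back out of it; a minimal sketch, assuming the objects above were saved to a hypothetical file named scans.txt:

# Print the result URL of every scan object in the file;
# jq natively handles a stream of concatenated JSON objects.
jq -r '.result' scans.txt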
heywoodlh commented 6 years ago

I don't think this is a feature I want to implement, since it can be solved with redirection or piping in the shell.

I'll show some examples. Not sure how comfortable you are with the shell/terminal so hopefully this doesn't come off as condescending.

So as an example, I can pipe the output to the less command:

python urlscan.py scan -f urls.txt | less

Once all the scans are complete, you can scroll through the results (for about 5 domains it took about 15-30 seconds before everything was printed and I could scroll through it).

Another example would be to redirect the output to a file:

python urlscan.py scan -f urls.txt >> saved_scans.txt

When the command is complete, a file named 'saved_scans.txt' will contain all of the scan results for review (>> appends, so repeated runs accumulate in the same file; use > to overwrite instead).
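
If you want both at once (live output in the terminal plus a saved copy, which is what the original request describes), the standard tee utility combines the two approaches above; a sketch using the same hypothetical urls.txt and saved_scans.txt names:

# Print scan results to the terminal and append them to a file simultaneously
python urlscan.py scan -f urls.txt | tee -a saved_scans.txt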

Another feature that I have built into the program is searching for the scan results of domains. Each initiated scan is saved to the database along with its resulting UUID and the date the scan was initiated. So if you have initiated a scan for 'google.com', you can retrieve all of the UUIDs for that domain with this command:

python urlscan.py search --host google.com

Here is an example:

❯ python urlscan.py search --host google.com
('google.com', 'a3dd1398-4e7a-4f23-8afa-06f23e84591f', '2018-02-26 19:28:16')
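
Those UUIDs plug straight into the API URL format shown in the scan output above, so you can pull a full result back down later; a sketch with curl, using the UUID from my example output:

# Fetch the full JSON result for a previously initiated scan by its UUID
curl https://urlscan.io/api/v1/result/a3dd1398-4e7a-4f23-8afa-06f23e84591f/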

Hopefully this solves your problem. Again, I haven't built this into the command itself, as it can be solved through the terminal.

heywoodlh commented 6 years ago

The most practical solution is to use piping or redirection as the above comment shows. However, you could also script it in Bash to print the UUIDs stored in the database for the URLs listed in the file, using urlscan.py's search function:

# Look up the stored scan results for each domain listed in domains.txt
while read -r d; do
    python urlscan.py search --host "$d"
done < domains.txt
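
For what it's worth, the same loop can also be written as a one-liner with xargs (still assuming domains.txt holds one domain per line):

# -I{} substitutes each line of domains.txt into the command
xargs -I{} python urlscan.py search --host {} < domains.txt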
heywoodlh commented 6 years ago

@ZeroDot1 did this fix your problem?

ZeroDot1 commented 6 years ago

This works very well:

python urlscan.py scan -f urls.txt >> saved_scans.txt

ZeroDot1 commented 6 years ago

Thank you very much for your work, the program is really great.

heywoodlh commented 6 years ago

Thank you very much! I will close this issue then. Please keep them coming, I appreciate the feedback!