Text Extraction via URL

sroberts / jager

Hunting IOCs all day every day...

Text Extraction via URL #35

Closed sroberts closed 9 years ago

sroberts commented 9 years ago

It would be ideal to be able to give Jager a URL and have it either download the file (in the case of a PDF) or simply grab the HTML and parse it (in the case of an HTML page; not sure whether it should still download in that case).

$ jager http://example.com/omgaptz0rz.html

kbandla commented 9 years ago

However, you would have to do it this way:

jager -u http://example.com/omgaptz0rz.html -o /tmp/

If you are happy with that, please close this issue.

sroberts commented 9 years ago

@kbandla First of all, #58 is epic. I'm cool with this; the only thing I want to verify is that it will handle being pointed at a remote PDF?

kbandla commented 9 years ago

:grin: it does. It checks the content-type and acts accordingly. Currently, there are only three handlers: pdf/json/others
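
For anyone reading along, the content-type check described above could look roughly like the sketch below. This is only an illustration, not Jager's actual code: the names `normalize_content_type`, `pick_handler`, `fetch_and_dispatch`, and the three `handle_*` functions are hypothetical, and the handlers are stubs that only label the payload rather than extract text.

```python
import urllib.request


def normalize_content_type(header_value):
    """Return the bare media type (e.g. 'application/pdf'), dropping
    parameters such as '; charset=utf-8'. A missing header yields ''."""
    return (header_value or "").split(";", 1)[0].strip().lower()


# Hypothetical stub handlers: each one just tags the raw bytes with a kind.
def handle_pdf(body):
    return ("pdf", body)


def handle_json(body):
    return ("json", body)


def handle_other(body):
    # Fallback for HTML, plain text, and anything else.
    return ("other", body)


# Assumed mapping from media type to handler; anything unlisted falls through
# to handle_other, mirroring the "pdf/json/others" split mentioned above.
HANDLERS = {
    "application/pdf": handle_pdf,
    "application/json": handle_json,
}


def pick_handler(content_type_header):
    """Choose a handler based on the Content-Type header value."""
    return HANDLERS.get(normalize_content_type(content_type_header), handle_other)


def fetch_and_dispatch(url):
    """Download the URL and hand the body to the matching handler."""
    with urllib.request.urlopen(url) as response:
        handler = pick_handler(response.headers.get("Content-Type"))
        return handler(response.read())
```

With this layout, a remote PDF served as `application/pdf` would go to the PDF handler, while an HTML page served as `text/html; charset=UTF-8` would fall through to the catch-all handler.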

sroberts commented 9 years ago

@kbandla That's true, but I have some other ideas for that ultimately. Just waiting for all your changes to land before I get started on it.

kbandla commented 9 years ago

All my changes are in. Give it a try.

sroberts commented 9 years ago

:metal: