Open pierre427 opened 10 years ago
If the purpose is to handle complex data sources, then my answer is that I intend for threatinator to handle those situations itself. I've built the framework for it; it just needs implementing.
ZIP file contents, multizip files
I've already planned for this sort of thing. Take a look at https://github.com/cikl/threatinator/blob/develop/lib/threatinator/io_wrappers/gzip.rb . This just decompresses a gzip stream. Implementing ZIP with file support won't be difficult. Note that the io_wrapper code hasn't been integrated into the feed definitions yet, so it can't be used at the moment.
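To make the idea concrete, here is a minimal sketch of what wrapping a gzip stream can look like using only Ruby's standard `zlib` library. The `gzip_wrap` helper is a hypothetical name for illustration and is not the code in the linked `gzip.rb`; the in-memory `StringIO` stands in for a fetched feed body.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (an assumption, not threatinator's API): takes any IO
# carrying gzip data and returns an IO-like reader of the decompressed stream.
def gzip_wrap(io)
  Zlib::GzipReader.new(io)
end

# Stand-in for a fetched feed body: two records, gzip-compressed in memory.
raw = StringIO.new(Zlib.gzip("1.2.3.4\n5.6.7.8\n"))

# Downstream code reads lines without knowing the source was compressed.
gzip_wrap(raw).each_line do |line|
  puts line.chomp
end
```

A ZIP wrapper would presumably follow the same shape, but it would need an extra argument naming which file inside the archive to read, and a third-party library, since Ruby's standard library does not read ZIP archives.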
XML content which is complex
I plan on implementing this as a parser that uses XPath to find the relevant nodes and then presents each node as a record to a parser block.
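A rough sketch of that approach, assuming REXML (bundled with Ruby) as the XML library. The `each_xml_record` helper, the sample document, and its element names are all made up for illustration; none of this is existing threatinator code.

```ruby
require 'rexml/document'

# Hypothetical helper (an assumption, not threatinator's API): parses the
# XML text and yields every node matching the XPath expression to the block.
def each_xml_record(xml, xpath)
  doc = REXML::Document.new(xml)
  REXML::XPath.each(doc, xpath) { |node| yield node }
end

# Invented sample feed; a real feed would define its own structure.
xml = <<~XML
  <feed>
    <entry><ip>1.2.3.4</ip><type>scanner</type></entry>
    <entry><ip>5.6.7.8</ip><type>c2</type></entry>
  </feed>
XML

# The block plays the role of the parser block: it receives one node per
# record and pulls out whatever fields it needs.
each_xml_record(xml, '//entry') do |node|
  puts "#{node.elements['ip'].text} #{node.elements['type'].text}"
end
```

The feed author only has to supply the XPath expression and the block; the traversal stays inside the parser.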
Database sources, and other interactive systems
Any examples that you can point to?
GIST multifetch
The only example of this that I know you have leaves me wondering whether those can really be considered legitimate feeds if their locations change so often.
Scraping websites
This can be handled much like XML: once the HTML is parsed into a node tree, XPath can select the relevant nodes.
I would like the ability to leverage third party scripts from a feed, in order to handle "complex" datasources, such as:
It would be really nice to be able to specify something like: