Open ianshefferman opened 10 years ago
Makes sense. For question 1, that's not currently possible but I'll add it to the feature request list. Currently each search is essentially one query.
For your second question, there is very rudimentary support now... If you hit /results.json or /results/search.json it will return the results in json format instead of the entire web page. Similarly /results/RESULT_ID.json will give the same information for a single result. Additionally you should be able to do something like: /results/search.json?saved_filter_id=ID to get the results from an existing saved filter.
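As a rough illustration of the endpoints described above, here is a small Ruby sketch that only builds the URLs. The host name and the helper `results_json_url` are hypothetical; the paths and the `saved_filter_id` parameter are the ones mentioned in this comment.

```ruby
require "uri"

# Hypothetical base URL; replace with wherever your Scumblr instance runs.
BASE_URL = "https://scumblr.example.com"

# Build the URL for one of the JSON endpoints mentioned above.
# With result_id: a single result; otherwise the search results,
# optionally narrowed to an existing saved filter.
def results_json_url(base, saved_filter_id: nil, result_id: nil)
  path = result_id ? "/results/#{result_id}.json" : "/results/search.json"
  uri = URI.join(base, path)
  uri.query = URI.encode_www_form(saved_filter_id: saved_filter_id) if saved_filter_id
  uri
end

puts results_json_url(BASE_URL, saved_filter_id: 12)
puts results_json_url(BASE_URL, result_id: 345)
```

Actually fetching these would presumably also need whatever login or session your install requires, which this sketch leaves out.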
If you have any specific recommendations regarding what would be useful from an API perspective let me know. If you'd rather have some type of export at the end of a search let me know and I can think of some ways you might be able to easily do that...
Thanks for the response.
For bulk queries, I know adding those would kind of break the current model. There are probably a few different ways they could be structured. Do you have more than one active developer working on this project at the moment, and do you know if you could add a feature like this in the near future?
If not, I could try my hand at adding that feature and submitting a pull request. How I would probably do it: add `multi` as a boolean field for `Search`, and break `Search#perform_search` into 2 separate methods, performing a new search for each line in the query if it's a multi search. The newline character could be converted to some other delimiter in the database, maybe `|`. Might also be useful to add an extra `multi_index` integer field for `Result` so you can see exactly which query is tied to that result for multi searches, and access it by something like `search.query.split(delimiter)[multi_index]`.
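To make the idea concrete, here is a minimal standalone sketch of the split-and-index approach. It uses plain Ruby structs rather than the real models, and `store_query`, `sub_queries`, and the `provider` callable are hypothetical names for illustration only, not existing Scumblr code.

```ruby
# Stand-ins for the real models; only the fields discussed above are modeled.
Search = Struct.new(:query, :multi, keyword_init: true)
Result = Struct.new(:title, :multi_index, keyword_init: true)

DELIMITER = "|" # assumed storage delimiter, as suggested above

# Convert the newline-separated form input into the stored form.
def store_query(raw)
  raw.lines.map(&:strip).reject(&:empty?).join(DELIMITER)
end

# Split a search into the individual queries it should run.
def sub_queries(search)
  search.multi ? search.query.split(DELIMITER) : [search.query]
end

# Run one search per sub-query, tagging each result with its query's index.
# `provider` is any callable returning an array of titles for a query.
def perform_search(search, provider)
  sub_queries(search).each_with_index.flat_map do |query, index|
    provider.call(query).map { |title| Result.new(title: title, multi_index: index) }
  end
end

search = Search.new(query: store_query("netflix\nscumblr\n"), multi: true)
fake_provider = ->(query) { ["hit for #{query}"] }
results = perform_search(search, fake_provider)
results.each do |r|
  puts "#{r.title} <- #{search.query.split(DELIMITER)[r.multi_index]}"
end
```

One caveat with this layout: a query that itself contains the delimiter would be split incorrectly, so the delimiter would need escaping or a character that cannot appear in queries.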
The "cleaner" way of doing it would be to just create one new search object for every line in the query, but that would pollute the searches page with a ton of rows, unfortunately.
For getting data out, the JSON endpoints for each page mostly satisfy our requirements.
What exactly is your motivation for having multiple queries under one search? I'm thinking that a simpler way to accomplish most of what you've mentioned is:
The other way I could imagine this being done "easily", although not necessarily the most elegant solution, would be overloading the search provider(s) and having the search provider itself split the query and run multiple searches.
What you suggested is definitely viable and will probably work well for what we want as long as the search page allows easy filtering. Could also automatically add a default "bulk" tag to any bulk search that is made.
It's not necessary for us to have multiple queries under one search; I was just concerned that viewing and editing searches would be a mess after a bulk search that may look for 50 or more keywords. An enhanced search page would do the job. I also recommend showing the search tags next to each search on the search list page.
Sounds doable, thanks!
Figured I'd add my two cents to this issue rather than opening another, as it falls in line with the "bulk" search option. My use case (security company with clients) is likely atypical compared to a user tracking a particular organization. Ideally, I'd like to have a few different lists of items as search criteria, and only trigger an alert if some degree of union is met. For example:
Company/Acronym list: Bank of Madeup, BoMu; Netflix, NFLX; Amalgamated Widgets Inc., AM, AMI; etc
"Attack" keyword list: DDoS, hack, own, owned, pwn, pwned, loic, deface, defaced, etc
Attack Group: YourAnonGlobal, YourAnonNews, UG, TheRedHack, etc
The unions that I'd care about would be: Company + Keyword, Company + Group, and Company + Keyword + Group, with the last one being of highest confidence.
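A rough sketch of that matching logic, in plain Ruby and independent of Scumblr itself. The list contents are shortened from the example above; the whole-word, case-insensitive matching and the `:medium` / `:high` labels are assumptions made for illustration.

```ruby
# Term lists taken from the example above (shortened).
LISTS = {
  company: ["Bank of Madeup", "BoMu", "Netflix", "NFLX"],
  keyword: ["DDoS", "hack", "pwned", "deface"],
  group:   ["YourAnonNews", "TheRedHack"]
}.freeze

# Return the names of the lists that have at least one term in the text.
# Matching is case-insensitive on whole words; that choice is an assumption.
def matched_lists(text)
  LISTS.select do |_name, terms|
    terms.any? { |term| text.match?(/\b#{Regexp.escape(term)}\b/i) }
  end.keys
end

# Map the matched lists to a confidence level, or nil when no alert is due.
def confidence(text)
  hits = matched_lists(text)
  return nil unless hits.include?(:company)

  case (hits - [:company]).size
  when 2 then :high    # company + keyword + group
  when 1 then :medium  # company + keyword, or company + group
  end
end

p confidence("TheRedHack claims DDoS against Netflix")
p confidence("Netflix releases a new show")
```

A text that mentions only a company, or no company at all, yields `nil` here, so no alert would fire.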
We're achieving this right now by following the various twitter accounts and scraping the tweets with in-house code, but it's not very pretty.
Hi @AtJofo,
I think this is a somewhat unique use-case. It would probably be relatively easy to build a custom search provider to do this though: https://github.com/Netflix/Scumblr/wiki/Extending-Scumblr#search-providers
Andy
I'd second this use case. Would be very useful to be able to see unions of hits from various keyword lists.
Hi, my team is interested in using Scumblr, but we have a few questions:
Depending on whether my team agrees to go forward with this, I could possibly try to add that functionality myself and submit a pull request.
Basically, we'd like to use Scumblr more for the standardized data scraping and searching aspect, and then offload the results to a more powerful analytics platform (like Splunk) for further research.
Thanks.
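For the offloading idea, one possible shape is to reformat the JSON output into newline-delimited JSON, one result per line. This is only a sketch: it assumes the results arrive as a JSON array of objects and that the downstream platform accepts one JSON object per line; the `id` and `title` fields in the sample are made up.

```ruby
require "json"

# Convert a JSON array of results into newline-delimited JSON,
# one result object per line.
def to_ndjson(json_array_text)
  JSON.parse(json_array_text).map { |result| JSON.generate(result) }.join("\n")
end

# Made-up sample payload standing in for a real results response.
sample = '[{"id": 1, "title": "first hit"}, {"id": 2, "title": "second hit"}]'
puts to_ndjson(sample)
```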