Closed andyl closed 10 years ago
Not sure what you mean exactly. I'm going to assume you'd like a representation of the object instead of just the id.
One simple idea is to store the representation you'd like in a hash:
representations = things.inject({}) do |result, thing|
  result[thing.id] = thing
  result
end
# Set up index and search here etc.
results = picky_search_interface.search ...
results.ids.map { |id| representations[id] } # => a list of things
Does that help?
P.S.: Or, if you like Ruby funkiness, replace the last line with:
representations.values_at(*results.ids)
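For anyone reading along, here is a minimal, self-contained sketch of that lookup which runs without Picky. The `Thing` struct and the `found_ids` array are made up for illustration; `found_ids` only stands in for whatever `results.ids` returns from a real search.

```ruby
# Stand-in for the objects that were indexed; only `id` matters for the lookup.
Thing = Struct.new(:id, :title)

things = [
  Thing.new(1, "first snippet"),
  Thing.new(2, "second snippet"),
  Thing.new(3, "third snippet")
]

# Map each id to its object, as in the reply above.
representations = things.each_with_object({}) do |thing, result|
  result[thing.id] = thing
end

# Hypothetical ids, standing in for what `results.ids` would return.
found_ids = [3, 1, 99]

# values_at keeps the order of the ids and yields nil for unknown ones,
# so compact drops ids that have no stored representation.
found_things = representations.values_at(*found_ids).compact
found_titles = found_things.map(&:title)
```

Note that the hash lives only in the process that built it, so a separate search script would need to rebuild or load it before looking anything up.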
OK I'll store the object data in the ID - I think that will work - thanks for your reply.
@andyl Good luck! Was it hard getting to the point where you are now? Any suggestions for improvements – where did you get stuck?
Hi @floere - I built a working program using Picky, and yes I did get stuck! Here are some of my experiences and ideas for improvement.
I've got a directory with a few hundred text files. Within each text file, there are HTML-like start/end tags that delimit text snippets organized in a week > day > timestamp hierarchy.
I want to be able to search for text snippets from within vim - using a search interface similar to that provided by ack.vim. My starting point was to write simple index & search programs that work in the command line.
I wrote a parser (using parslet) to extract the snippets, allowing me to generate a record for each snippet which contained the following categories: file_name, week_start, day, timestamp, title, text, start_line. For the ID, I constructed a text string "#{file_name}/#{start_line}/#{title}". These records were fed into Picky to generate the index. Then I wrote a command-line script that performed a search and returned a list of IDs.
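The ID scheme described above could look roughly like the sketch below. It is not the original script: the `Snippet` struct, the `parse_id` helper and the sample values are hypothetical, and the Picky indexing step is left out. It only shows how an ID in that format can be built and then split back into the file name and line number needed for an ack-style results line.

```ruby
# Hypothetical record shape, using the category names listed above.
Snippet = Struct.new(:file_name, :week_start, :day, :timestamp,
                     :title, :text, :start_line) do
  # Same ID format as described: file name, start line, title.
  def id
    "#{file_name}/#{start_line}/#{title}"
  end
end

# Split an ID back into its parts. Assumes file_name contains no "/";
# the limit of 3 keeps any "/" inside the title intact.
def parse_id(id)
  file_name, start_line, title = id.split("/", 3)
  { file_name: file_name, start_line: Integer(start_line), title: title }
end

snippet = Snippet.new("notes_2013.txt", "2013-05-06", "Mon", "09:30",
                      "Picky/index ideas", "Try saving the index.", 42)

parsed = parse_id(snippet.id)
# An ack-style "file:line: title" line for the results list.
result_line = "#{parsed[:file_name]}:#{parsed[:start_line]}: #{parsed[:title]}"
```

If file names can contain "/" (for example relative paths), a different separator or an encoded ID would be needed, since the split above would then be ambiguous.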
All I wanted was a little command-line script, but the docco has stuff about sinatra servers, web clients and javascript front ends mixed throughout.
Many of the critical docs were empty. For example, I wanted to save the index to a file, and the docco just says "TODO" (or some such). The only way I was able to get the app working was to read the source and introspect with Pry.
After a lot of head-banging, I looked for alternative Ruby search engines, and could find none. There's stuff based on Lucene, but that is overkill. IMHO Ruby needs a small / lightweight / simple search engine. I wanted a Ruby-embeddable search engine that was as simple to use as 'ack' or 'the silver searcher'.
It looks like you put a ton of work into building a great search engine, but for me, Picky could use a lot of simplification.
1) Remove all the sinatra server/client stuff - extract into a separate app.
2) Finish the docco.
3) Provide simple out-of-the-box command-line example apps for common scenarios: searching CSV, JSON and XML files.
4) Do tests with new users. Benchmark against ack and the silver searcher. The new user should be able to go from standing start to working search in 3-5 minutes.
Well, that is probably more than you wanted. :-) I hope my notes were useful. Thanks for Picky !!
@andyl Wow! Thanks so much for your description – this is very, very helpful indeed :)
For now, just a quick question re 4) – I guess the simple example on the web page http://pickyrb.com/ did not help very much?
Hi @floere - re 4) - I was confused by the 'Got 5 minutes' and 'Got 2 minutes' paths. I chose incorrectly, and wasted a ton of time on the sinatra example. Suggestion: remove the 5 minute example.
The 2 minute code example was very helpful. Here's a gist which I think would be even clearer:
Thanks a lot – I'll have to be even clearer about the choice there, perhaps a tab with Just Ruby/Sinatra and two examples. But for now I used your example: http://pickyrb.com/. Thanks a lot!
Very nice. I'll try to contribute another example script next week. (CSV search)
I've got a directory of text files, and I'd like to use Picky for a little command-line search app. I've progressed to the point where I can generate an index, do a search and get a list of IDs.
Once I have an ID, is there any way I can retrieve the categories for that object out of the index? I'd like to use the category info to populate the results list.