culturehack / data-tool

A collection of cultural data sets and sources & a website to browse them.
MIT License

Ability to search datasets by title / description #15

Closed: frankieroberto closed this issue 11 years ago

frankieroberto commented 11 years ago

This would be pretty useful.

Currently unsure on approach.

We could import all the entries into Postgres at launch and use its full-text search feature. That has the advantage of built-in features like stemming and spelling correction. Disadvantages: another dependency, a more complicated install for the site, etc.

Alternatively, we could implement some basic in-memory text searching. It wouldn't be too tricky to simply return matching results, but it wouldn't be as sophisticated.
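For illustration only, the in-memory option might look roughly like the sketch below. It is written in Python for brevity rather than in the site's own code, and the `DATASETS` list, its field names, and `simple_search` are hypothetical names made up for this example.

```python
# Hypothetical in-memory records; the field names are assumptions.
DATASETS = [
    {"title": "Museum Collections",
     "description": "Catalogue records for objects held by museums."},
    {"title": "Theatre Listings",
     "description": "Performance dates and venues."},
]


def simple_search(datasets, query):
    """Return datasets whose title or description contains the query text."""
    needle = query.lower()
    return [
        d for d in datasets
        if needle in d["title"].lower() or needle in d["description"].lower()
    ]
```

Something like `simple_search(DATASETS, "venues")` would return only the theatre entry. There is no stemming or ranking here, which is the trade-off against the Postgres route.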

mildlydiverting commented 11 years ago

Which method did you implement in the end? What data fields does it search?

frankieroberto commented 11 years ago

@mildlydiverting it searches the title and the main content, and uses a basic regex to find that word or phrase within those bits of content. There are a few limitations: e.g. if you search for a plural word it won't return results that only mention it in the singular. But it is case insensitive and runs pretty fast, so I think it will do for now.

If there are any other fields we should search, it's relatively easy to add those though. Publisher perhaps?
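A rough sketch of that kind of regex search, in Python for illustration (not the site's actual code). The `search` function, the `fields` and `ignore_case` parameters, and the sample entries are all hypothetical; case handling is exposed as a flag because it is an assumption here.

```python
import re


def search(entries, query, fields=("title", "content"), ignore_case=True):
    """Return entries where the literal query appears in any listed field."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape treats the query as a plain word or phrase, not a pattern.
    pattern = re.compile(re.escape(query), flags)
    return [
        e for e in entries
        if any(pattern.search(e.get(f, "")) for f in fields)
    ]


# Hypothetical sample entries; the field names are assumptions.
entries = [
    {"title": "Museum opening hours",
     "content": "Daily opening times.",
     "publisher": "Arts Council"},
    {"title": "Gallery archive",
     "content": "Lists every painting on display.",
     "publisher": "City Gallery"},
]

# The plural limitation: "paintings" finds nothing, "painting" matches.
plural_hits = search(entries, "paintings")
singular_hits = search(entries, "painting")

# Searching another field only means extending the fields tuple.
publisher_hits = search(entries, "arts council",
                        fields=("title", "content", "publisher"))
```

With this shape, adding publisher is a one-line change to the default `fields` tuple.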

frankieroberto commented 11 years ago

I'll close this for now, as the basics are done – let's start new issues for any refinements.