filterbubbler / filterbubbler-web-ext

The FilterBubbler WebExtension turns your browser into a collaborative text classification lab.

Workshop: Library search suggestion system #40

Open dmarti opened 7 years ago

dmarti commented 7 years ago

When I (@schue) was visiting Stanford, we discussed building a training corpus from page content related to the university's various library search systems. This would give a user a search pull-down that automatically recognizes the topical area of the page being viewed and suggests the proper search system in response, perhaps even showing a live search dialog for that system.
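To make the idea concrete, here is a minimal sketch of mapping page text to a search system with a tiny naive Bayes classifier. It is an illustration under stated assumptions, not FilterBubbler's actual classifier or API: the class name `TinyNaiveBayes`, the labels `geospatial-search` and `statistics-search`, and the training text are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training pages
        self.vocab = set()

    def train(self, label, text):
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical search systems, each trained on one sample page's text.
clf = TinyNaiveBayes()
clf.train("geospatial-search",
          "map shapefile raster elevation coordinates satellite imagery")
clf.train("statistics-search",
          "regression variance sample probability distribution survey")

# The label returned here would decide which search box to show.
suggestion = clf.classify("a raster map of elevation data")
```

In a real deployment each label would need many sample pages, and the returned label would select which library search dialog the pull-down offers.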

mapninja commented 7 years ago

It would be cool if this worked as a "you might also be interested in" search term suggestion, so that if someone searches for "weather prediction history" it might pop up a list of other suggested searches, such as "Lorenz attractor", "Butterfly effect", etc. One of the biggest challenges I see for people looking for spatial datasets, in particular, is forming effective search terms. They know what they are looking for (when they see it), but they don't necessarily know how to query for it. Another avenue that might be interesting is to think of this as a way to recreate the serendipitous discovery of source material digitally, in the same way we used to discover material by noticing the books on the library shelf to the left and right of the book we were actually looking for.

dmarti commented 7 years ago

@mapninja As far as I know, Google does synonyms behind the scenes (applying synonyms to your search for you). It might be interesting to reveal where similar classifications appear -- for example, one user's FilterBubbler classified this document as "statistics" and another one classified it as "data science" -- so users who come to the library looking for "data science" might also want to look under "statistics".
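A rough sketch of how overlapping classifications could be surfaced: collect the labels different users applied to each document, then, for a given label, rank the other labels that share documents with it. The `(user, url, label)` record format, the function name `related_labels`, and the sample data are assumptions for illustration, not something FilterBubbler defines.

```python
from collections import Counter, defaultdict


def related_labels(classifications, query_label):
    """Rank labels that co-occur with query_label on the same document."""
    labels_by_doc = defaultdict(set)
    for _user, url, label in classifications:
        labels_by_doc[url].add(label)

    counts = Counter()
    for labels in labels_by_doc.values():
        if query_label in labels:
            counts.update(labels - {query_label})

    # Most frequent first; ties broken alphabetically for stable output.
    return [label for label, _ in
            sorted(counts.items(), key=lambda item: (-item[1], item[0]))]


# Hypothetical classification records from several users.
records = [
    ("alice", "https://example.org/doc1", "statistics"),
    ("bob", "https://example.org/doc1", "data science"),
    ("carol", "https://example.org/doc2", "data science"),
    ("dave", "https://example.org/doc2", "statistics"),
    ("erin", "https://example.org/doc2", "machine learning"),
    ("frank", "https://example.org/doc3", "cartography"),
]

also_look_under = related_labels(records, "data science")
```

With this sample data, someone looking for "data science" would be pointed to "statistics" first, since it shares two documents with that label.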

schue commented 7 years ago

You would have to create a very large corpus with many classifications. Each suggested search would essentially be a classification, with sample document URLs for each general search.
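As a sketch of what such a corpus might look like, each suggested search can be modeled as a classification that carries its own sample URLs. The `SuggestedSearch` type, the `undertrained` helper, the `min_samples` threshold, and the URLs are all hypothetical; this is not FilterBubbler's corpus format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SuggestedSearch:
    """One classification: a suggested query plus the pages that exemplify it."""
    query: str
    sample_urls: List[str] = field(default_factory=list)


# Hypothetical corpus entries; a useful corpus would need far more of both.
corpus = [
    SuggestedSearch("Lorenz attractor", [
        "https://example.org/chaos-intro",
        "https://example.org/lorenz-1963",
    ]),
    SuggestedSearch("Butterfly effect", [
        "https://example.org/sensitive-dependence",
    ]),
]


def undertrained(entries, min_samples=2):
    """Return the queries that have fewer than min_samples sample pages."""
    return [s.query for s in entries if len(s.sample_urls) < min_samples]


needs_more_samples = undertrained(corpus)
```

A check like `undertrained` hints at the scale problem: every suggested search needs enough sample pages of its own before the classifier can recognize it.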