Open nickumia-reisys opened 1 year ago
In our catalog schema.xml, there are only two types of tokenizers referenced:
(For Catalog) When Solr creates a search engine, it has three types of INDEXING methods:
(For Catalog) When a query is sent to each of these Solr engines, it applies the following to the search terms:
The other important part of the search definition is the q.op = "AND" line.
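To make the effect of the default operator concrete, here is a minimal toy sketch in Python. It is not Solr's implementation; the `DOCS` index and the `search` helper are hypothetical and exist only to show how combining terms with AND versus OR changes which documents match.

```python
# Toy model only: a hypothetical in-memory index, not how Solr stores documents.
DOCS = {
    1: {"ngdaid", "72", "boundaries"},
    2: {"ngdaid", "roads"},
    3: {"72", "census"},
}


def search(terms, default_op="AND"):
    """Return sorted ids of documents that match the terms.

    With default_op="AND" every term must be present in a document;
    with any other value (e.g. "OR") a single matching term is enough.
    """
    combine = all if default_op == "AND" else any
    return sorted(
        doc_id
        for doc_id, tokens in DOCS.items()
        if combine(term in tokens for term in terms)
    )


print(search(["ngdaid", "72"], default_op="AND"))
print(search(["ngdaid", "72"], default_op="OR"))
```

In this toy index, the AND form only matches the document containing both terms, while the OR form matches every document containing either one. Whether the catalog query path actually behaves this way for split terms still needs to be verified against the real configuration.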
This just highlights the parts that affect us and should guide further exploration into how searching works for catalog.
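One possible explanation for a term like NGDAID72 being treated as two terms is a filter that splits tokens at letter/digit transitions (Solr's WordDelimiterGraphFilterFactory has a splitOnNumerics option that does this). Whether our schema.xml actually configures such a filter is an assumption that still needs checking. The hypothetical helper below only imitates that splitting step in plain Python.

```python
import re


def split_on_numerics(term):
    """Split a term at letter/digit transitions and lowercase the parts.

    This imitates one analysis step only; it is a sketch, not Solr code.
    """
    return [part.lower() for part in re.findall(r"[A-Za-z]+|[0-9]+", term)]


print(split_on_numerics("NGDAID72"))
```

If the real analysis chain behaves like this, an unquoted NGDAID72 would be searched as two separate terms, which would be a starting point for explaining the difference from the quoted form.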
We do need to work on this one, but we are moving it to the icebox for now.
User Story
In order to provide help to users, the Data.gov Search Team wants to understand how the search engine dissects search terms to return relevant results (e.g. why does searching for NGDAID72 return the additive results of NGDAID and 72, while searching for "NGDAID72" returns a single result?)
Acceptance Criteria
[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]
Background
Came out of discussions with Census. A number of weird things happen while searching; this is meant to start documenting at least some of the major ones.
Security Considerations (required)
While doing this research, vulnerabilities in Solr relating to data integrity might come to light, although this is HIGHLY UNLIKELY.
Sketch
(Someone with more vision, please update this haha..)