Open jadudm opened 4 days ago
I have references in the wiki. The goal here is to keep both the stopword-cleaned text (possibly useful for searching?) and the original text, linked via id. This will change `extract` and the underlying SQLite table design, but should otherwise have minimal impact.
At a glance
In order to see what I'm searching for, as a user, I want the actual text to be presented in search results.
Acceptance Criteria
We use DRY behavior-driven development wherever possible.
Shepherd
Background
The prototype for `jemison` throws all content into a `site_index` table using SQLite's FTS5. This... does the job, but before throwing the content in, we do stopword removal. As a result, the content being indexed is not actually what is on the websites we're crawling.

So, in order to present actual/meaningful results, we're going to have to store the original content alongside the indexed content. The FTS5 searches will occur over the index, but we'll have to link back to the original and present that as part of search results.
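One possible shape for this linkage, as a minimal sketch rather than the actual `jemison` schema: the original text lives in an ordinary table, and the FTS5 table holds only the stopword-cleaned text under the same rowid. The `site_content` table, the `cleaned` column, the tiny `STOPWORDS` set, and the helper functions are all hypothetical names chosen for illustration; only `site_index` comes from the prototype. It assumes the Python `sqlite3` module is linked against an SQLite build with FTS5 enabled.

```python
import re
import sqlite3

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on"}


def clean(text):
    """Lowercase, split into alphanumeric tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def open_db(path=":memory:"):
    """Create the original-content table and the FTS5 index (assumed schema)."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS site_content (
            id       INTEGER PRIMARY KEY,
            path     TEXT NOT NULL,
            original TEXT NOT NULL
        );
        CREATE VIRTUAL TABLE IF NOT EXISTS site_index USING fts5(cleaned);
        """
    )
    return db


def add_page(db, path, original):
    """Store the original text, then index the cleaned text under the same id."""
    cur = db.execute(
        "INSERT INTO site_content (path, original) VALUES (?, ?)",
        (path, original),
    )
    db.execute(
        "INSERT INTO site_index (rowid, cleaned) VALUES (?, ?)",
        (cur.lastrowid, " ".join(clean(original))),
    )
    db.commit()
    return cur.lastrowid


def search(db, query):
    """Search the cleaned index, but return the original text for display."""
    terms = clean(query)
    if not terms:
        return []
    # Quote each term so it is treated as a plain FTS5 string, not syntax.
    match = " ".join(f'"{t}"' for t in terms)
    rows = db.execute(
        """
        SELECT c.path, c.original
        FROM site_index
        JOIN site_content AS c ON c.id = site_index.rowid
        WHERE site_index MATCH ?
        ORDER BY rank
        """,
        (match,),
    )
    return rows.fetchall()
```

With this layout, `search(db, "the shuttle")` matches against the cleaned text but hands back the untouched `original` column, which is what a results page would render. Running the query through the same `clean` step as the content keeps stopwords in the query from forcing a miss.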
References
Some references that might be inspirational are in the wiki.
Security Considerations
Required per CM-4.
None, although at some point we have to do some filtering on fowl language. :duck:
Process checklist
- [ ] Has a clear story statement
- [ ] Can reasonably be done in a few days (otherwise, split this up!)
- [ ] Shepherds have been identified
- [ ] UX youexes all the things
- [ ] Design designs all the things
- [ ] Engineering engineers all the things
- [ ] Meets acceptance criteria
- [ ] Meets [QASP conditions](https://derisking-guide.18f.gov/qasp/)
- [ ] Presented in a review
- [ ] Includes screenshots or references to artifacts
- [ ] Tagged with the sprint where it was finished
- [ ] Archived

### If there's UI...

- [ ] Screen reader - Listen to the experience with a screen reader extension, ensure the information is presented in order.
- [ ] Keyboard navigation - Run through acceptance criteria with keyboard tabs, ensure it works.
- [ ] Text scaling - Adjust viewport to 1280 pixels wide and zoom to 200%, ensure everything renders as expected. Document 400% zoom issues with USWDS if appropriate.