Requirements
Given a word in a text being displayed, there should be a way to see a concordance of that word over a collection of text in the language. The concordance should have the following features:
It should display at least some minimum number of entries, if those entries exist.
Each line of output should show the word with at least a minimum window of surrounding words.
Every appearance of the query word on an output line should be highlighted in some way.
The search should be case-insensitive, with matches displayed in their original case (one use of the concordance is to see whether a given word is usually capitalized).
The index should cover any specified set of text files, not just the ones currently being annotated.
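The required features above could be sketched roughly as follows. This is a minimal illustration, not a settled design: the function name `concordance`, the `[word]` bracket highlighting, and the simple `\w+` tokenizer (which drops punctuation) are all assumptions made for the example, and the caller is assumed to have already read the chosen files into strings.

```python
import re

# Assumed tokenizer: runs of word characters; punctuation is discarded.
TOKEN_RE = re.compile(r"\w+")

def concordance(texts, query, window=3):
    """Return one highlighted context line per occurrence of `query`.

    texts:  iterable of strings, e.g. the contents of each file in the collection.
    window: number of tokens shown on each side of the match.
    Matching is case-insensitive; the output keeps each token's original case.
    """
    target = query.casefold()
    lines = []
    for text in texts:
        tokens = TOKEN_RE.findall(text)
        for i, tok in enumerate(tokens):
            if tok.casefold() != target:
                continue
            lo, hi = max(0, i - window), i + window + 1
            # Highlight every appearance of the query inside the window,
            # not only the one at position i.
            shown = [f"[{t}]" if t.casefold() == target else t
                     for t in tokens[lo:hi]]
            lines.append(" ".join(shown))
    return lines
```

Showing at least the minimum number of entries then amounts to displaying the first N returned lines, or all of them if fewer exist.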
It would be delightful, but not crucial, to have the following features:
Search should be available over phrases, not just individual words.
If more entries for the word are available than the minimum number, the user should be able to scroll or page through them (as with less).
All occurrences of the search term in the output should appear in the same column (as in a published concordance).
An option to search over stems might be useful.
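Two of the optional features, phrase search and same-column alignment, might look something like the sketch below. The function name `aligned_concordance` and the `\w+` tokenizer are assumptions for illustration. Alignment here pads by character count, which only approximates display width for scripts with combining marks or wide characters.

```python
import re

# Assumed tokenizer: runs of word characters; punctuation is discarded.
TOKEN_RE = re.compile(r"\w+")

def aligned_concordance(text, phrase, window=3):
    """Return context lines in which every match starts in the same column."""
    tokens = TOKEN_RE.findall(text)
    folded = [t.casefold() for t in tokens]
    target = [t.casefold() for t in TOKEN_RE.findall(phrase)]
    n = len(target)
    if n == 0:
        return []
    rows = []
    for i in range(len(tokens) - n + 1):
        # Case-insensitive comparison of an n-token slice against the phrase.
        if folded[i:i + n] == target:
            left = " ".join(tokens[max(0, i - window):i])
            hit = " ".join(tokens[i:i + n])
            right = " ".join(tokens[i + n:i + n + window])
            rows.append((left, hit, right))
    # Right-justify the left context so the match column lines up.
    width = max((len(left) for left, _, _ in rows), default=0)
    return [f"{left.rjust(width)}  {hit}  {right}".rstrip()
            for left, hit, right in rows]
```

A single-word query is just a phrase of length one, so the same function covers both cases.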
Native Script vs Romanization
I lean toward operating over Romanized text, but that creates some inconsistency in the tool. Languages that are natively in Latin script are not put through uroman, so they have no Romanized row, and the interface would therefore differ from language to language. Perhaps we should index both?
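One way the "index both" idea could work is a single index keyed by both the native and the Romanized form of each token, so that lookup behaves the same for every language. The sketch below is only an illustration under stated assumptions: `build_index`, `lookup`, and the per-token `romanize` callable are hypothetical names, the callable stands in for whatever Romanization step the tool uses and is not uroman's actual interface, and Romanization is assumed to preserve token boundaries.

```python
from collections import defaultdict

def build_index(docs, romanize=None):
    """Map each case-folded token form to the (doc_id, position) pairs where it occurs.

    docs:     dict of doc_id -> list of native-script tokens.
    romanize: hypothetical per-token callable; pass None for languages
              already in Latin script, which have no separate Romanized form.
    """
    index = defaultdict(set)
    for doc_id, tokens in docs.items():
        for pos, tok in enumerate(tokens):
            index[tok.casefold()].add((doc_id, pos))
            if romanize is not None:
                # The Romanized form points at the same token position.
                index[romanize(tok).casefold()].add((doc_id, pos))
    return index

def lookup(index, query):
    """Return the sorted occurrences of `query` in either form, ignoring case."""
    return sorted(index.get(query.casefold(), ()))
```

With this layout a query in either script returns the same positions, and Latin-script languages simply have one form per token instead of two.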