The option Open selection in dictionary is very useful to me because I can quickly look up a word in a dictionary by pressing on it. I have noticed that the selection also captures the character — (EM DASH, U+2014), which is unhelpful in e-books where — (EM DASH, U+2014) is used between words: the selection then looks up <word1>—<word2> in the dictionary, which yields zero results. Let me provide more context:
In the e-books that I am reading, the authors mostly use the character - (HYPHEN-MINUS, U+002D) in hyphenated words, for example "self-confident", "barrier-free" or "built-in". When I press on a hyphenated word, Librera FD captures the entire hyphenated word. See video below. This is the desired behavior because the whole hyphenated word is looked up in the dictionary.
In the books that I am reading, the authors mostly use — (EM DASH, U+2014) between words to add extra information, for example: "… in other sources—next chapter—you will find …" or "… in a wider view—inside and outside—to be able to …". If I press on a word that appears before — (EM DASH, U+2014), the selection includes that word, the em dash, and the word after it. This is undesired behavior because I only wanted to look up the word before — (EM DASH, U+2014) in the dictionary.
I believe that one way to solve this problem is to add a setting that lets the user define a regex describing which characters the selection should capture. Because I want to capture entire hyphenated words, I would set the regex to [a-zA-Z\-]. Users who also want to capture — (EM DASH, U+2014) could set the regex to [a-zA-Z\-—]. The image below shows a mock-up of this idea:
For the record, here's the PDF document that I used in the videos: main.pdf. I generated that PDF by compiling the following LaTeX document with pdflatex.
\documentclass{article}
\begin{document}
foo self-confident bar
foo barrier-free bar
foo built-in bar
... in other sources—next chapter—you will find ...
... in a wider view—inside and outside—to be able to ...
\end{document}
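To make the proposed setting more concrete, here is a rough sketch of how I imagine the selection could be expanded around the pressed character. This is only an illustration under my own assumptions, not Librera FD's actual code: the function name `expand_selection` and its parameters are made up, and I am assuming the setting holds a regex that matches a single allowed character.

```python
import re


def expand_selection(text: str, index: int, allowed: str = r"[a-zA-Z\-]") -> str:
    """Return the run of allowed characters around text[index].

    `allowed` is the hypothetical user setting: a regex matching ONE
    character that may be part of the selection.
    """
    char_ok = re.compile(allowed)
    # Nothing to select if the press is out of range or on a disallowed character.
    if not (0 <= index < len(text)) or not char_ok.fullmatch(text[index]):
        return ""
    # Grow the selection to the left while characters are allowed.
    start = index
    while start > 0 and char_ok.fullmatch(text[start - 1]):
        start -= 1
    # Grow the selection to the right while characters are allowed.
    end = index + 1
    while end < len(text) and char_ok.fullmatch(text[end]):
        end += 1
    return text[start:end]


line = "in other sources—next chapter—you will find"
tap = line.index("sources")

# Default setting: the em dash is not allowed, so the selection stops at it.
print(expand_selection(line, tap))                   # sources
# Setting that also allows the em dash: both words are captured.
print(expand_selection(line, tap, r"[a-zA-Z\-—]"))   # sources—next
# Hyphenated words are still captured whole with the default setting.
print(expand_selection("foo self-confident bar", 6)) # self-confident
```

With the first regex, pressing on "sources" would look up only "sources", while hyphenated words such as "self-confident" would still be selected in full. Note that [a-zA-Z\-] only covers unaccented Latin letters, so a real default would probably need a broader character class.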