stenskjaer / samewords

Automatically annotate potentially ambiguous words in critical text editions made with LaTeX and reledmac.
MIT License

handling of brackets #44

Open floriandk opened 5 years ago

floriandk commented 5 years ago

Rather pondering than suggesting...

Processing

\documentclass{article}
\usepackage[series={A},nofamiliar,noeledsec,noledgroup]{reledmac}

\begin{document}

\beginnumbering
\pstart
Some
\edtext{text}{%
    \Afootnote{words.}} 
and some more te[xt but partly un]readable.
This will cause \edtext{problems}{%
    \Afootnote{difficulties}} later.
\pend
\endnumbering 

\end{document}

will give

\documentclass{article}
\usepackage[series={A},nofamiliar,noeledsec,noledgroup]{reledmac}

\begin{document}

\beginnumbering
\pstart
Some
\edtext{\sameword[1]{text}}{%
    \Afootnote{words.}} 
and some more \sameword{te[xt} but partly un]readable.
This will cause \edtext{problems}{%
    \Afootnote{difficulties}} later.
\pend
\endnumbering 

\end{document}

This is the expected behaviour, given that "[" and "]" are treated as punctuation.

There are two problems with this -- depending on the expected behaviour for numbering:

Do you have any advice/suggestions on how to handle this?

stenskjaer commented 4 years ago

Good points about the brackets. I will check if there is a better way to set the punctuation and I'll have to get back to you with some deep thoughts on what to do with the "some text [with brackets] inside text" later ;)

stenskjaer commented 4 years ago

Okay, I have given this a bit of thought.

I think it would be possible to have two properties in the settings: include_punctuation and exclude_punctuation, or similar. The first adds characters to the punctuation set, while the second removes them. That way you could define what should be added to and removed from the punctuation list.

If the editor wants to distinguish between "Julius" and "Juli[us]", then "[" and "]" can be removed from the punctuation set. If they should be treated as identical, then \char"005B and the code for the closing bracket, \char"005D, could be added to the punctuation set.
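
A minimal sketch of what the character-code variant could look like in the source, using the sentence from the example above. This has not been tested against samewords, so how the tool tokenizes or wraps this line is an assumption left open here:

```latex
% Sketch only: the brackets are written as character codes instead of
% literal "[" and "]". \char"005B typesets "[" and \char"005D typesets "]"
% in the usual text font encodings.
% The empty group {} after each code ends the hexadecimal number without
% putting a space into the source.
and some more te\char"005B{}xt but partly un\char"005D{}readable.
```

Whether these codes count as punctuation would then depend on what is in the punctuation set, as described above.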

As for \sameword{te[xt} not compiling: is there another way around that than using the character codes as suggested? Normally I would also consider defining macros such as \secl{} and \add{} for these editorial characters, because then you can change which character you use in one place. But of course that gives a wrong result because of the overlap with the \sameword macro.
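
To illustrate the overlap, here is a rough sketch. The macro names \secl and \add are the ones mentioned above, but the definitions and the delimiters they print are only assumptions for the sake of the example:

```latex
% Hypothetical definitions: only the macro names come from this thread,
% the printed delimiters are chosen for illustration.
\newcommand{\secl}[1]{[#1]}                  % assumed: secluded text in square brackets
\newcommand{\add}[1]{$\langle$#1$\rangle$}   % assumed: added text in angle brackets

% The bracketed span starts inside "text" and ends inside "unreadable",
% so a \sameword{...} around "text" would have to cross the opening
% brace of \secl:
and some more te\secl{xt but partly un}readable.
```

With such a macro the delimiter can be changed in one place, but the word "text" no longer sits inside a single brace group, which is where the conflict with \sameword comes from.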

Maybe writing \sameword{te{[}xt} would compile with reledmac?