The `book_n` column of tei2csv output comes from the `@n` attribute of `div1` elements. In works that are not divided into books, `@n` in the TEI holds some kind of placeholder rather than a real number, and that essentially useless value is copied to the output.
It would be better to output a blank in the `book_n` column in these cases, clearly indicating a work that has line numbers but no book numbers.
I plan to try a heuristic: attempt to parse `@n` as an integer and, if that fails, set `book_n = None`. Raise an error if the `@n` values of the `div1` elements are not unique within a work (counting `None` as a distinct value).
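The heuristic could look roughly like the sketch below. It is only an illustration, not the tei2csv code: the function names `parse_book_n` and `book_numbers` are hypothetical, and it assumes the `@n` values of one work's `div1` elements have already been collected as strings (or `None` where the attribute is absent). It also assumes that uniqueness is checked on the parsed values, so two different placeholders in the same work both become `None` and count as a collision.

```python
from typing import Iterable, List, Optional


def parse_book_n(n: Optional[str]) -> Optional[int]:
    """Return @n as an int, or None when it is missing or not an integer.

    Hypothetical helper: any value that int() rejects is treated as a
    placeholder.
    """
    if n is None:
        return None
    try:
        return int(n.strip())
    except ValueError:
        return None


def book_numbers(div1_ns: Iterable[Optional[str]]) -> List[Optional[int]]:
    """Parse the div1 @n values of one work and require them to be unique.

    None counts as a distinct value, so a work may contain at most one
    div1 whose @n is not an integer.
    """
    seen = set()
    result = []
    for raw in div1_ns:
        book_n = parse_book_n(raw)
        if book_n in seen:
            raise ValueError(f"duplicate book_n {book_n!r} (from @n={raw!r})")
        seen.add(book_n)
        result.append(book_n)
    return result
```

If the output is written with Python's `csv.writer`, a `None` value is emitted as an empty field, which would give the blank `book_n` described above.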