syvwlch / Data-Ignota

A data-driven exploration of Ada Palmer's Terra Ignota series
https://syvwlch.github.io/Data-Ignota/
MIT License

[Feature] Number the paragraphs #40

Closed syvwlch closed 2 years ago

syvwlch commented 2 years ago

Is your feature request related to a problem? Please describe.

The current data for lines of dialog does not indicate which paragraph each line belongs to. This makes it impossible to tell whether two consecutive lines from the same speaker were separated by mere narration or by action/elapsed time.

Describe the solution you'd like

Number the paragraphs in the digital edition, like the books & chapters, and include that number in the data for each line of dialog.

Describe alternatives you've considered

Number the paragraphs during extraction, as is already done for the lines themselves, but this makes the numbers likely to change without warning.

Additional context

Likely best done using an XSLT transformation which only changes <p> and <sp> inside a chapter div.
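
For illustration only, here is a rough Python sketch of the same idea, using `xml.etree.ElementTree` in place of XSLT. The sample markup, the `type="chapter"` attribute, the `n` attribute, and the `number_paragraphs` helper are all assumptions made for the example; the real digital edition may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real digital edition's markup may differ.
SAMPLE = """<body>
  <div type="book" n="1">
    <div type="chapter" n="1">
      <head>Chapter title</head>
      <p>Narration.</p>
      <sp who="A"><p>First line.</p></sp>
      <p>More narration.</p>
    </div>
  </div>
</body>"""


def number_paragraphs(root):
    """Set an n attribute on each <p> or <sp> directly inside a chapter div."""
    for chapter in root.iter("div"):
        if chapter.get("type") != "chapter":
            continue
        number = 0
        for child in chapter:  # direct children only, in document order
            if child.tag in ("p", "sp"):
                number += 1
                child.set("n", str(number))
    return root


root = number_paragraphs(ET.fromstring(SAMPLE))
numbered = [(el.tag, el.get("n")) for el in root.iter() if el.tag in ("p", "sp")]
```

Only direct children of the chapter div are numbered, so a `<p>` nested inside an `<sp>` is left untouched and numbering restarts in every chapter.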

syvwlch commented 2 years ago

Had a think, and procedurally slapping a paragraph number into the digital edition adds no robustness while cluttering it with metadata it does not need, since the number only tracks document order!

The numbering generated when the file is extracted is robust enough, as long as I don't change how it is generated and, more importantly, stay consistent across the different files that refer to paragraphs by that number.

syvwlch commented 2 years ago

New plan is to iterate through books, chapters, paragraphs, and then lines of dialog within paragraphs, and use the iteration index for the last two levels, which are not numbered in the Digital Edition.
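
A minimal sketch of that plan in Python, under some loud assumptions: the sample markup is invented, each `<p>` or `<sp>` directly inside a chapter div counts as one paragraph, each `<p>` inside an `<sp>` counts as one line of dialog, and indices start at 1. The `extract_lines` helper and the row keys are hypothetical names, not the project's actual extraction code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element and attribute names are assumptions.
SAMPLE = """<body>
  <div type="book" n="1">
    <div type="chapter" n="2">
      <p>Narration only.</p>
      <sp who="A"><p>First line.</p><p>Second line.</p></sp>
      <sp who="B"><p>A reply.</p></sp>
    </div>
  </div>
</body>"""


def extract_lines(root):
    """Return one row per line of dialog, keyed by book, chapter, paragraph, line."""
    rows = []
    for book in root.iter("div"):
        if book.get("type") != "book":
            continue
        for chapter in book.findall("div"):
            if chapter.get("type") != "chapter":
                continue
            # Paragraph index comes from iteration order, not from the markup.
            blocks = [el for el in chapter if el.tag in ("p", "sp")]
            for paragraph_index, block in enumerate(blocks, start=1):
                if block.tag != "sp":
                    continue  # narration still consumes an index, but has no dialog
                # Line index restarts at 1 inside each paragraph.
                for line_index, line in enumerate(block.findall("p"), start=1):
                    rows.append({
                        "book": book.get("n"),
                        "chapter": chapter.get("n"),
                        "paragraph": paragraph_index,
                        "line": line_index,
                        "speaker": block.get("who"),
                        "text": line.text,
                    })
    return rows


rows = extract_lines(ET.fromstring(SAMPLE))
```

Because narration paragraphs still consume an index, a gap in the paragraph numbers between two lines from the same speaker shows that something other than dialog came between them. The indices stay stable only as long as the iteration rule itself does not change.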