When combining the text read from the individual book elements of an EPUB file, those elements are currently separated only by a '\n' character.
I suggest separating them with a '\f' (form feed) character instead. This would be analogous to the current text extraction from PDF files, where the logical elements (there, the individual pages) are also separated by a form feed.
Because '\n' typically also occurs inside an element's own text, the element boundaries cannot be recovered from the current output. A '\f' separator would preserve at least some of the original file's structure in the resulting txt file and thus make it possible to parse that logical structure.
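
To illustrate the difference, here is a minimal sketch in plain Python. It is not the library's actual code: the helper names `join_elements` and `split_elements` and the sample texts are hypothetical and exist only for this example.

```python
FORM_FEED = "\f"


def join_elements(element_texts, separator=FORM_FEED):
    """Join per-element texts into one string (hypothetical helper)."""
    return separator.join(element_texts)


def split_elements(combined_text, separator=FORM_FEED):
    """Split combined text back into per-element texts (hypothetical helper)."""
    return combined_text.split(separator)


# Made-up sample data: two book elements, each containing its own newline.
elements = [
    "Chapter 1\nFirst paragraph.",
    "Chapter 2\nSecond paragraph.",
]

# With '\f' as the separator, the original elements can be recovered.
with_form_feed = join_elements(elements)
recovered = split_elements(with_form_feed)

# With '\n' as the separator, element boundaries and inner line breaks
# become indistinguishable, so splitting yields more pieces than elements.
with_newline = join_elements(elements, separator="\n")
pieces = split_elements(with_newline, separator="\n")
```

In this sketch `recovered` equals the original list, while `pieces` has four entries instead of two. The round trip only holds as long as no element's text contains a '\f' itself.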