r4victor / syncabook

📖🎧 A tool for creating ebooks with synchronized text and audio (EPUB3 with Media Overlays)
MIT License

ValueError: The text file has no fragments #17

Closed by Audun97 1 year ago

Audun97 commented 1 year ago

Hello, I would be very grateful if someone could help. I do not understand the error occurring after having run the sync command.


I have the XHTML and matching audio in separate folders as described. I just opened an EPUB as a zip file and extracted the XHTML files. That is okay, right?

Best Regards,

Audun

r4victor commented 1 year ago

@Audun97, syncabook looks for XHTML tags with attributes id="f[0-9]+" and uses them as the units of synchronization. Here's an example: https://github.com/r4victor/afaligner/blob/master/tests/resources/shakespeare/text_complete/p001.xhtml

You can make XHTML files with such id attributes from plaintext with the to_xhtml command. If you try your own XHTML files, they won't have the proper attributes, so syncabook won't work.
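
To check whether an XHTML file already contains such fragments before running sync, something like the following may help. It is only a sketch: `find_fragment_ids` is a hypothetical helper, not part of syncabook, and it assumes the file is well-formed XML and that the whole id must match `f[0-9]+`.

```python
import re
import xml.etree.ElementTree as ET

# Pattern taken from the explanation above; requiring a full match
# of the id value is an assumption of this sketch.
FRAGMENT_ID = re.compile(r"f[0-9]+")


def find_fragment_ids(xhtml: str) -> list[str]:
    """Return ids that look like sync fragments, in document order."""
    root = ET.fromstring(xhtml)  # raises ParseError on malformed XML
    ids = []
    for element in root.iter():
        element_id = element.get("id")
        if element_id and FRAGMENT_ID.fullmatch(element_id):
            ids.append(element_id)
    return ids
```

An empty result for a file would be consistent with the "The text file has no fragments" error.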

The solution would be to start with plaintext and use to_xhtml, or to modify your extracted XHTMLs to contain tags with proper attributes.

I clarified this in the readme.

Audun97 commented 1 year ago

@r4victor Ahh thank you. I must have misunderstood. When I convert an xhtml to text I lose formatting. Is there no way to do it without losing formatting?

r4victor commented 1 year ago

As I've said, in this case the option is to modify the existing xhtml to add the id attributes. This will require some programming, of course.

It may be a good idea to have a command that preprocesses existing xhtmls to automate this. For example, it may just add id attributes to all p tags that are already there, or add span tags inside existing p tags, something of that sort. I may add this some day.
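
Until such a command exists, the first variant (adding id attributes to existing p tags) could be sketched roughly as below. This is not a syncabook feature: `add_fragment_ids` is a hypothetical helper, it assumes well-formed XHTML in the standard XHTML namespace, it skips p tags that already have an id so existing links keep working, and the output loses the XML declaration and DOCTYPE.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def add_fragment_ids(xhtml: str, start: int = 1) -> str:
    """Give every <p> without an id a sequential id: f1, f2, ..."""
    # Serialize XHTML elements without an "ns0:" prefix.
    ET.register_namespace("", XHTML_NS)
    root = ET.fromstring(xhtml)
    counter = start
    for paragraph in root.iter(f"{{{XHTML_NS}}}p"):
        if "id" in paragraph.attrib:
            continue  # assumption: leave existing ids untouched
        paragraph.set("id", f"f{counter}")
        counter += 1
    return ET.tostring(root, encoding="unicode")
```

Formatting inside the paragraphs (em, strong, and so on) is left as is, since only the id attribute on each p changes. Paragraphs that were skipped because they already carry an id will not become sync fragments.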

Audun97 commented 1 year ago

Cool thanks :) Would be great for people who are not too good at programming