mertakdut / EpubParser

Parses .epub files and provides page-by-page separation.
Apache License 2.0
118 stars 27 forks

Problem with some epub. #8

Closed: DanTsk closed 7 years ago

DanTsk commented 8 years ago

I have a problem with some epubs: I can't read the content and always get only 2-3 words from the book. How can I solve it? (Example of such an epub: The_Swift_Programming_Language.zip)

mertakdut commented 8 years ago

Hi,

Are you using the latest (currently 1.0.81) version?

I've tried reading the ebook with EpubParser-Sample-Android-Application and didn't notice any issue.

Please explain the issue in detail.

DanTsk commented 8 years ago

Yes, I am using the latest version, 1.0.81. When I call readSection(), it returns only the first section, which contains just a few words.
Here is my code:

```java
try {
    readerFile.setFullContent(path);
    readerFile.setCssStatus(CssStatus.OMIT);
    readerFile.setIsIncludingTextContent(true);
} catch (ReadingException e) {
    e.printStackTrace();
}

int pages = 0;
BookSection bookSection = null;
while (true) {
    try {
        bookSection = readerFile.readSection(pages);
        builder.append(bookSection.getSectionContent());
        pages++;
    } catch (ReadingException e) {
        e.printStackTrace();
    } catch (OutOfPagesException e) {
        e.printStackTrace();
        bookSection = null;
        break;
    }
}
```

(Sorry for such bad code, some problems with git)
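One thing worth noting about the loop above: when readSection throws a ReadingException, `pages` is not incremented, so the same index is retried and, if the error repeats, the loop may never end. Below is a minimal sketch of a loop that stops on the first read error instead. `SectionSource`, `ReadFailure`, and `OutOfPages` are hypothetical stand-ins written only so the example compiles on its own; they are not EpubParser's classes, and the real calls would be the ones shown in the code above.

```java
import java.util.ArrayList;
import java.util.List;

final class SectionLoopSketch {
    /** Hypothetical stand-in for the library's OutOfPagesException. */
    static class OutOfPages extends Exception {}

    /** Hypothetical stand-in for the library's ReadingException. */
    static class ReadFailure extends Exception {
        ReadFailure(String message) { super(message); }
    }

    /** Hypothetical stand-in for the reader; it only models readSection(index). */
    interface SectionSource {
        String readSection(int index) throws ReadFailure, OutOfPages;
    }

    /** Reads sections in order until the source runs out or a read fails. */
    static String readAll(SectionSource source, List<String> errors) {
        StringBuilder builder = new StringBuilder();
        for (int index = 0; ; index++) {
            try {
                builder.append(source.readSection(index));
            } catch (OutOfPages e) {
                break; // normal end of the book
            } catch (ReadFailure e) {
                // Record the failure and stop, rather than retrying the same index.
                errors.add("section " + index + ": " + e.getMessage());
                break;
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Fake two-section source, used only to exercise the loop.
        String[] sections = {"first ", "second"};
        SectionSource fake = index -> {
            if (index >= sections.length) throw new OutOfPages();
            return sections[index];
        };
        List<String> errors = new ArrayList<>();
        System.out.println(readAll(fake, errors)); // prints "first second"
    }
}
```

Collecting the error messages in a list keeps the failure visible to the caller, which may help when reporting which section index could not be read.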

I also got the following error (attached image: dd)

DanTsk commented 8 years ago

And if I print the section with System.out.println, I get this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
  <link href="epub.css" media="all" rel="stylesheet" type="text/css" />
  <script type="text/javascript" src="svg.js"></script>
</head>
<body id="conceptual_flow_with_tasks">
  <div class="content-wrapper">
    <div id="chapter_container" class='part'>
      <article class="chapter">
        <a id="TP40014097-CH1">&#x200c;</a><a id="TP40014097-CH1-XID_28">&#x200c;</a>
        <h2 class="chapter-name">Welcome to Swift</h2>
      </article>
    </div>
  </div>
</body>
</html>
```
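To see how much readable text a returned section actually holds, the markup can be stripped away. The sketch below does that with regular expressions from the Java standard library. `SectionTextCheck` and `visibleText` are hypothetical names, not part of EpubParser, and regex stripping is only a rough check, not a real XHTML parser. Applied to the section printed above, it leaves only the heading text, which may mean the first section is just a chapter title page rather than truncated content.

```java
import java.util.regex.Pattern;

final class SectionTextCheck {
    // Whole <script> or <style> elements, including their bodies.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1>");
    // Any remaining tag, XML declaration, or DOCTYPE.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    /** Returns a rough approximation of the text a reader would see. */
    static String visibleText(String xhtml) {
        String text = SCRIPT_OR_STYLE.matcher(xhtml).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Drop the zero-width non-joiner entity used in the empty anchors.
        text = text.replace("&#x200c;", "");
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // Shortened copy of the section output shown above.
        String section = "<?xml version=\"1.0\" encoding=\"UTF-8\"?> <!DOCTYPE html> "
                + "<html><head><script type=\"text/javascript\" src=\"svg.js\"></script></head>"
                + "<body><article class=\"chapter\"><a id=\"TP40014097-CH1\">&#x200c;</a>"
                + "<h2 class=\"chapter-name\">Welcome to Swift</h2></article></body></html>";
        System.out.println(visibleText(section)); // prints "Welcome to Swift"
    }
}
```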

DanTsk commented 8 years ago

Your sample app also has some problems (attached image: ddd)

mertakdut commented 8 years ago

Could you please check the same file with a maxContentPerSection value set as well?

readerFile.setMaxContentPerSection(integerHere);

If the issue goes away, it's probably because I didn't do enough testing without that value set.

The main point of the library is to force-trim epub files, but I'll try to solve the problems as soon as possible.

mertakdut commented 8 years ago

I just tested the file you sent me without setting a maxContentPerSection value, and it didn't give me any issues.

(attached image: first three pages of swift.zip)

mertakdut commented 8 years ago

Still having issues?

mertakdut commented 8 years ago

Did you make any progress on the issue?