ganlirong / iphoneebooks

Automatically exported from code.google.com/p/iphoneebooks
GNU General Public License v2.0

Support PalmOS file types (PalmDOC, Plucker, etc.) #21

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
It would be very convenient if PalmOS type ebooks were supported natively
by Books.app.  

While it's possible to convert some of the book types back to HTML or text,
there's a (relatively) large body of websites that offer the Palm formats
for download.  Native readers are available for Palm (of course), WinCE,
BlackBerry, and various desktop OSes.  It really is the closest thing to a
"universal" ebook format, aside perhaps from HTML and plain text.

Several distinct formats are found for the Palm, all unfortunately sharing
the same PDB file extension used for all PalmOS data files.  The oldest and
most basic is the simple PalmDOC.  Other common forms include the open
source Plucker and the proprietary iSilo format.

Decode functionality for the PalmDOC and Plucker formats is already available
in GPL-2 licensed code, so integrating it should be straightforward.
Barriers to fully supporting these formats include the fact that both
support embedded images (not yet supported by Books.app's display code),
and that the Plucker format is an archive file which can contain multiple
distinct pages linked together with HTML hrefs.  A stopgap method of
supporting the formats would be to concatenate all pages and strip images
until Books.app can provide the necessary support.
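
For anyone unfamiliar with what PalmDOC decoding involves, here is a minimal Python sketch of the commonly documented PalmDOC decompression scheme (a small LZ77 variant).  It is not the GPL code referred to above; the function name is made up for illustration, and it handles a single compressed text record only.

```python
def palmdoc_decompress(data: bytes) -> bytes:
    """Decompress one PalmDOC-compressed record (illustrative sketch)."""
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        i += 1
        if b == 0x00 or 0x09 <= b <= 0x7F:
            # Plain literal byte.
            out.append(b)
        elif 0x01 <= b <= 0x08:
            # The next b bytes are copied through unchanged.
            out += data[i:i + b]
            i += b
        elif 0x80 <= b <= 0xBF:
            # Two-byte back-reference: 11-bit distance, 3-bit length.
            if i >= n:
                raise ValueError("truncated back-reference")
            pair = ((b << 8) | data[i]) & 0x3FFF
            i += 1
            distance = pair >> 3
            length = (pair & 0x07) + 3
            if distance == 0 or distance > len(out):
                raise ValueError("back-reference outside decoded data")
            # Copy byte by byte so overlapping references work.
            for _ in range(length):
                out.append(out[-distance])
        else:
            # 0xC0-0xFF encodes a space followed by (b XOR 0x80).
            out.append(0x20)
            out.append(b ^ 0x80)
    return bytes(out)
```

Character set conversion of the decoded bytes is a separate step and is not covered here.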

Note that this issue is half feature request and half intent-to-code.
I've only just got my toolchain set up, but I intend to start working on
these file formats immediately.

Original issue reported on code.google.com by pendorbo...@gmail.com on 12 Sep 2007 at 8:30

GoogleCodeExporter commented 9 years ago
Hi there:

Support for DOC files has always been in my game plan, but believe me, I
welcome the help! :)  I was planning on adapting txt2pdbdoc by Paul J. Lucas.

Can one "unzip" a Plucker archive to disk?  If so, it should be trivial to
simply treat the unarchived file like any other folder.

Proprietary formats like iSilo and eReader (even the non-DRMed versions) are
unlikely to be supported.

Original comment by roosters...@gmail.com on 13 Sep 2007 at 2:25

GoogleCodeExporter commented 9 years ago
There's some code in the Plucker project that will indeed dump a Plucker
archive to disk as individual HTML files plus images.  I've started adapting
that code to just concatenate the pages into one document for now, and I'm
grafting support for it into
EBookView-loadBookWithPath:numCharacters:didLoadAll: with an extra else-if
branch for the pdb file extension.
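
To make the "concatenate the pages" stopgap concrete, here is a rough Python sketch.  It assumes each page has already been decoded to an HTML string; the function name and the <hr> separator are my own choices for illustration, not anything taken from the Plucker code, and the regular expressions are only good enough for simple markup.

```python
import re

# Matches a whole <img ...> tag so it can be dropped.
_IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# Captures whatever sits between <body> and </body>, if present.
_BODY = re.compile(r"<body\b[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def concatenate_pages(pages):
    """Join decoded HTML pages into one document with images removed."""
    parts = []
    for page in pages:
        match = _BODY.search(page)
        # Fall back to the whole string for pages without a <body>.
        body = match.group(1) if match else page
        parts.append(_IMG_TAG.sub("", body))
    return "<html><body>\n" + "\n<hr>\n".join(parts) + "\n</body></html>"
```

Links between pages are left untouched, so they will point at files that no longer exist separately.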

I want to try to avoid writing temp files if possible, but abstracting the
entire archive as a folder and letting Books display the separate files that
way isn't a bad idea.  To fully support Plucker's idiom will still require
hyperlinks between the separate files, but that's hardly critical for a
first attempt at it.

The Plucker code should also deal with PalmDOC "for free," and it has code
to convert the Palm-specific image formats to JPEG as well.  That code
requires libjpeg (which I've managed to cross-compile), but I wonder if
there's any chance of getting the required functionality using Apple's
private AppleJPEG framework or some other method.  Again, I'll throw that in
the "working first, optimized later" category.

Agreed on the likelihood of proprietary formats.  I do wonder what your
feelings on supporting them are if they can be reverse engineered?  I'm not
much of a purist myself when it comes to that sort of thing.  Assuming I can
discover the formats and don't have to violate copyright on someone else's
code to do it, I wouldn't mind putting support in.

Now if I can just get a working toolchain...  I've got my changes compiling
successfully, but they just insta-crash when run on the phone.  The UIKit
HelloWorld does the same, so something's off with my compiler no doubt.

Original comment by pendorbo...@gmail.com on 13 Sep 2007 at 3:30

GoogleCodeExporter commented 9 years ago
Have you got the armfp.dylib library installed on the phone?  I've had
issues with that before.

Feel free to send me a diff from SVN to zach AT brewstergeisz DOT cjb DOT
net, and I can see if I can get it to work over here.

I wouldn't have a problem with reverse-engineering unencrypted formats.  But
if it's "secure," i.e. unlockable with a credit card number, I don't want to
have anything to do with it.  Of course, you're welcome to fork the project
if you figure that out. :)

Original comment by roosters...@gmail.com on 13 Sep 2007 at 4:10

GoogleCodeExporter commented 9 years ago
Another, somewhat related idea just occurred to me.  Assuming we abstract
the Plucker archive and treat it as a directory full of files internally,
why not do the same with other archive formats?

Given space limitations on the iPhone, uploading a ZIP full of HTML files
would most definitely burn up less flash than uploading the uncompressed
files individually.

Original comment by pendorbo...@gmail.com on 13 Sep 2007 at 4:18

GoogleCodeExporter commented 9 years ago
Success!  I finally got my toolchain straightened out.  Looks like my issue
was a bug (issue #10 over on the toolchain Google Code project).  I actually
managed to load and read a Plucker PDB on the first try.  There are some
weird characters here & there (charset conversion perhaps?), and it's
probably leaking like a sieve, but not bad for a first try!

I should have a good block of time to hack this weekend, so hopefully I'll
get it cleaned up and submit a patch fairly soon.

Original comment by pendorbo...@gmail.com on 13 Sep 2007 at 11:09

GoogleCodeExporter commented 9 years ago
Excellent news!

The more I think about it, the more your idea of reading a zip archive
directly makes sense.  I wonder if someone's already written an Objective-C
"glue" for zlib?  If not, that should be fairly trivial.  The difficult part
would be adjusting the FileBrowser to read individual files within the zip,
and letting NSString initialize itself with the archived data.  Still, it
should be possible...
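
As a language-neutral illustration of the idea (not the Objective-C glue itself), here is a short Python sketch that pulls HTML members out of a zip archive straight into strings without unpacking anything to disk.  The function name and the assumption that the files are UTF-8 are mine.

```python
import io
import zipfile


def read_html_from_zip(source):
    """Return {member name: decoded text} for HTML files inside a zip.

    `source` may be a path or a file-like object; nothing is extracted
    to the filesystem.
    """
    texts = {}
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".html", ".htm")):
                raw = archive.read(name)
                # Assumption: UTF-8 text; undecodable bytes are replaced.
                texts[name] = raw.decode("utf-8", errors="replace")
    return texts
```

A file browser could list the keys as if they were files in a folder and hand the matching string to the view when one is selected.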

Original comment by roosters...@gmail.com on 14 Sep 2007 at 11:27

GoogleCodeExporter commented 9 years ago
Hello,
First of all, thanks for the great program!  I just wanted to say I'm
incredibly excited about future support for PDB files.  Looking forward to
the update!

  Warner

Original comment by sko...@gmail.com on 14 Sep 2007 at 11:40

GoogleCodeExporter commented 9 years ago
I've got a preliminary version of this code ready to go.  The attached
tarball contains a number of new files, full versions of the existing files
I've modified, and the output of `svn diff` for those modified files.  This
compiles with the toolchain and headers that they're currently calling 0.30.
I had to modify a few Books files to compile under these headers, so you may
not want to merge those.

New:
 * libjpeg.a - compiled version of libjpeg, statically linked for now.  This should
probably be dynamic at some point.

 * palm/* - new files which implement decoding for PalmDOC and Plucker file formats.
 All contributed code was licensed under GPL2.  The Plucker code makes my eyes bleed,
but it appears to work.  Cleanup should be forthcoming...

Modified:
 * EBookView.m - Added check for PDB extension which triggers new code in the
palm directory.  Refactored txt->html conversion to allow PalmDOC format
files to use the same code, coming in as a string instead of a file.  Added
additional txt->html conversions to create a new paragraph for
newline-space-space and newline-tab.  Changed "struct CGRect clicked" to
"CGPoint clicked" to compile under 0.30 headers.

 * Makefile - additional source files compiled, added -lz to LDFLAGS, added
-O3 to CFLAGS, added strip to package target.

 * FileTable.m - Same CGRect->CGPoint fix as EBookView.m

 * BooksApp.m - Added 'pdb' to file filter
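
The paragraph rule mentioned under EBookView.m can be sketched roughly as follows, in Python rather than the Objective-C actually used.  Treating blank lines as paragraph breaks and collapsing the remaining single newlines into spaces are assumptions of mine; only the newline-space-space and newline-tab rules come from the change list above.

```python
import html
import re

# A paragraph break is one or more blank lines (assumed), or a newline
# followed by two spaces or a tab (the rules described in the change list).
_PARAGRAPH_BREAK = re.compile(r"\n(?:[ \t]*\n)+|\n(?=  |\t)")


def text_to_html(text):
    """Convert plain text to a sequence of <p> elements."""
    text = text.replace("\r\n", "\n")
    paragraphs = []
    for chunk in _PARAGRAPH_BREAK.split(text):
        # Collapse remaining line breaks and runs of whitespace.
        collapsed = " ".join(chunk.split())
        if collapsed:
            paragraphs.append("<p>" + html.escape(collapsed) + "</p>")
    return "\n".join(paragraphs)
```

Because the function takes a string, the same path can serve both plain text files and decoded PalmDOC text.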

That's about it for now.  All told, that implements decoding and display of
plain-text PalmDOC (up-converted to HTML), and HTML Plucker, minus images.
I haven't made any attempt to implement any of the directory-based things
mentioned above.  I also have a sinking suspicion that getting image support
is going to require writing out the JPEGs to the filesystem unless the
WebKit view (I assume it's WebKit) can be convinced to load image resources
from a stream or something.

Odds are there are big, nasty, iPhone-eating bugs in this code, and it
probably leaks like a sieve as well.

All original code was GPL2, which of course makes my changes implicitly GPL2
as well.  Just for clarity, I explicitly place any changes I've made under
GPL2.  I'd also be willing to assign copyright, depending on how you want to
handle such things.

Google choked on my tarball when I tried to attach it, so for want of a
better option:
http://www.thebedells.org/books-palmsupport.tbz

Original comment by pendorbo...@gmail.com on 15 Sep 2007 at 6:00

GoogleCodeExporter commented 9 years ago
I'm having some trouble compiling this: ld fails with an undefined symbol,
___eprintf.  Could you send your email address to zach AT brewstergeisz DOT
cjb DOT net?  Thanks.

Original comment by roosters...@gmail.com on 15 Sep 2007 at 3:44

GoogleCodeExporter commented 9 years ago
Just a few notes for building libjpeg as I try to get this building on my
other Mac:

configure:
CC=arm-apple-darwin-gcc ./configure --prefix=${HEAVENLY}/usr/local \
  --build=i386-apple-darwin --host=arm-apple-darwin \
  --enable-static --enable-shared

configure couldn't run ltconfig without some help.  Run ltconfig manually
after configure:
./ltconfig --no-verify ./ltmain.sh arm-apple-darwin

make
sudo make install

install might complain about some missing directories in $HEAVENLY.  Just
create them by hand, then run sudo make install again.  Rinse / repeat until
all missing dirs are created and all files installed.

Original comment by pendorbo...@gmail.com on 15 Sep 2007 at 11:36

GoogleCodeExporter commented 9 years ago
Take two:  The Palm code is cleaned up a bunch (stripped debugging output &
removed a ton of unused functions).  Hopefully this compiles better?  A
binary is included just in case.

http://www.thebedells.org/iphoneebooks-palm2.tbz

Original comment by pendorbo...@gmail.com on 16 Sep 2007 at 3:25

GoogleCodeExporter commented 9 years ago
Here's a poser for you.  Some of the Plucker files I've looked at appear to
be displaying out of order; I looked at Gutenberg's copy of Cory Doctorow's
Down and Out in the Magic Kingdom, as well as Plucker's own FAQ.  Is there a
foul-up with the way the record order is determined?

Original comment by roosters...@gmail.com on 18 Sep 2007 at 6:04

GoogleCodeExporter commented 9 years ago
Entirely possible things are backwards.  The Plucker->HTML code I used is
actually from Plucker's source distribution, but it had a pile of crashes
and other issues I fixed.  Some were the result of parsing bad documents,
but one would think the Plucker FAQ would be well formed.

I'll grab the FAQ document and step through the code on my Mac.  At least
the file format is well documented...
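
One cheap first check on the ordering question is to dump the generic PDB record list and compare it with the order the pages come out in.  The Python sketch below reads only the standard PDB header fields (record count at byte 76, then 8-byte entries holding a 4-byte offset, 1 attribute byte and a 3-byte unique ID); it knows nothing about Plucker's own per-record headers, and the function name is hypothetical.

```python
import struct


def pdb_record_entries(data):
    """Return (index, offset, unique_id) for each entry in a PDB record list."""
    # Big-endian 16-bit record count lives at byte 76 of the PDB header.
    (count,) = struct.unpack_from(">H", data, 76)
    entries = []
    for index in range(count):
        # Each entry: 4-byte data offset, then attributes (1 byte)
        # packed together with the unique ID (3 bytes).
        offset, attrs_uid = struct.unpack_from(">II", data, 78 + 8 * index)
        entries.append((index, offset, attrs_uid & 0xFFFFFF))
    return entries
```

If the displayed order matches neither the index order nor the offset order, the problem is more likely in how the Plucker-level records are walked.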

Original comment by pendorbo...@gmail.com on 18 Sep 2007 at 6:25

GoogleCodeExporter commented 9 years ago
I've upgraded my toolchain and checked your changes into the branches/books2
svn.  Please take a look, as I had to make some modifications to get it to
compile (specifically, _plkr_message didn't exist).  I have also done some
work on my source files.

Original comment by roosters...@gmail.com on 19 Sep 2007 at 5:39

GoogleCodeExporter commented 9 years ago
That branch seems to work fine.  I had to make two changes to compile,
though.

In EBookView.h, the include of UIKit/UIWebView.h causes mass spewage (see
below).  Commenting out that include allows it to compile and run fine.  I
had to do the same in trunk before, but forgot to mention it.

In Makefile, the link step references libjpeg.a instead of palm/libjpeg.a.

I haven't looked into the out-of-order Plucker thing yet, but on cursory
examination, this build is fine.  I'm running it on my phone & will report
any issues, but I don't expect any problems.

This is what UIWebView.h causes:
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:11: error: syntax error before '<' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:16: error: syntax error before 'WebView'
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:31: error: syntax error before 'UITextLoupe'
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:35: error: syntax error before 'UIAutoscrollTimer'
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:43: error: syntax error before 'DOMNode'
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:59: error: syntax error before 'DOMNode'
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:62: error: syntax error before 'WebPDFView'
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:64: error: syntax error before '}' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:78: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:79: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:81: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:82: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:83: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:84: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:85: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:86: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:87: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:88: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:89: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:90: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:91: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:92: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:93: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:94: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:95: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:96: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:97: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:98: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:99: error: syntax error before ':' token
/opt/local/bin/../lib/gcc/arm-apple-darwin/4.0.1/../../../../arm-apple-darwin/include/UIKit/UIWebView.h:100: error: syntax error before ':' token

Original comment by pendorbo...@gmail.com on 19 Sep 2007 at 6:29

GoogleCodeExporter commented 9 years ago
There's now a UIWebView patch in SVN which fixes that.  Basically, comment
out all references to protocols.

Original comment by roosters...@gmail.com on 20 Sep 2007 at 2:12

GoogleCodeExporter commented 9 years ago
Zac, I've merged your changes into trunk, and Plucker/DOC support will be
released in 1.1.

Original comment by roosters...@gmail.com on 26 Sep 2007 at 4:52

GoogleCodeExporter commented 9 years ago
Great!  I emailed you back the other day, BTW.  Hopefully it made it alive
this time?  I'm guessing 'yes' based on my real name in your last response.
=)

Just for anyone who stumbles on this issue, I've found a bunch more *.pdb
book formats than I previously knew about here:
http://www.handebooks.com/formats/palmformats.html

Looks like my Palm book collection contains most of these.  If anyone gets a
message about "unknown PDB magic", please post the string and, if possible,
anything you might know about the program that generated the file.

I wonder... Could we get a Wiki page running for this?  If there's a way for
me to create one, I don't see it.

Magic I've found so far:

 * TEXtREAd - Standard PalmDOC -- supported in 1.1
 * DataPlkr - Plucker (http://www.plkr.org/) -- partial support in 1.1 - no
links or graphics
 * ToGoToGo - iSilo (http://www.isilo.com/) -- closed source format, no docs
available
 * PNRdPPrs - PalmReader/PeanutPress/eReader (http://www.ereader.com/) --
closed source format, usually with DRM, no docs available
 * MobiPocket - Mobipocket
 * TEXtTIDc - TealDoc
 * ToRaTRPW - TomeRaider

I don't currently have any documents encoded in those last three, but if
anyone does and could send them to my gmail account, I'd appreciate it.  I
think at least one of those is really PalmDOC with minor if any
modifications.  If that's the case, I just need to open up the format
detection code to allow those formats to go through the existing PalmDOC
decoder.
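
For anyone who wants to check a file's magic by hand, here is a small Python sketch.  It relies on the standard PDB header layout, where the 4-byte type and 4-byte creator sit at bytes 60-67; the table holds only the 8-character strings from the list above, and the function name and error text are illustrative rather than taken from Books.

```python
# Type + creator strings as listed in this comment (8 bytes each).
KNOWN_MAGIC = {
    b"TEXtREAd": "PalmDOC",
    b"DataPlkr": "Plucker",
    b"ToGoToGo": "iSilo",
    b"PNRdPPrs": "eReader",
    b"TEXtTIDc": "TealDoc",
    b"ToRaTRPW": "TomeRaider",
}


def identify_pdb(data):
    """Return the format name for a PDB file's first bytes, or raise."""
    if len(data) < 68:
        raise ValueError("too short to be a PDB file")
    # Bytes 60-63 hold the type, 64-67 the creator.
    magic = data[60:68]
    try:
        return KNOWN_MAGIC[magic]
    except KeyError:
        raise ValueError("unknown PDB magic: %r" % magic) from None
```

Letting a PalmDOC-compatible variant through would then amount to mapping its magic to the same decoder as TEXtREAd.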

Original comment by pendorbo...@gmail.com on 27 Sep 2007 at 3:28

GoogleCodeExporter commented 9 years ago
Lots of DRM-free, public MobiPocket documents here:

http://www.baen.com/library/defaultTitles.htm

Several formats are available there, if you also want to try supporting
Rocket/Ebookwise, Microsoft Reader, or RTF, or if you want to try to support
Baen's HTML zips directly.

Original comment by tkeph...@gmail.com on 9 Oct 2007 at 6:56

GoogleCodeExporter commented 9 years ago
When will the issue of opening a simple PDB file be solved?  Hopefully we
can read PDB files now.

Original comment by ridenrac...@gmail.com on 30 Oct 2007 at 10:29

GoogleCodeExporter commented 9 years ago
When will the issue of opening a simple PDB file be solved?  Hopefully we
can read PDB files now.

Original comment by ridenrac...@gmail.com on 30 Oct 2007 at 10:30

GoogleCodeExporter commented 9 years ago
PDB isn't a file type, it's a container.  See comment 18 above.

Which file type within the PDB container are you referring to as the simple
type?  If PalmDOC or Plucker, they were implemented in 1.1 and later.  If
not, which file type, and is it one of the formats listed in comment 18, or
an additional type?

Do you have a URL to a sample file of the type in question?

Is it a proprietary, DRM-encrypted format (in which case it probably won't
ever be supported), or an open format?

What were you previously using to read it, and is it an open source project,
or does it have a documented format?

Lots of variables; there's nothing simple about a PDB file.

Original comment by tkeph...@gmail.com on 30 Oct 2007 at 10:57

GoogleCodeExporter commented 9 years ago
I hope this can read PDB files the way the iSilo reader does.  If someone
can do this, I will give a $20 donation.

Original comment by ridenrac...@gmail.com on 3 Nov 2007 at 3:01

GoogleCodeExporter commented 9 years ago
iSilo is probably the one format *least* likely to be supported by Books.
The developers of iSilo use a proprietary file format and have refused to
share it with anyone else.  If you'd like to see iSilo support, I'd
recommend contacting the iSilo developers and asking them to open their
format.

Maybe if they get enough requests, they'll consider changing their minds.

Failing that, if anyone has any reverse engineering talents and could
undertake a clean room reverse of the iSilo format, I'd love to see it! =)

Original comment by pendorbo...@gmail.com on 3 Nov 2007 at 4:55

GoogleCodeExporter commented 9 years ago
I understand post 18.  Does anyone know how to convert an iSilo PDB to a
different format that can be read by Books.app?

Original comment by mario.ra...@world-direct.at on 7 Nov 2007 at 9:30

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
RE: Comment 25:

See comment 22:
"PDB isn't a file type, it's a container.  See comment 18 above."

and comment 18:
"ToGoToGo - iSilo (http://www.isilo.com/) -- closed source format, no docs
available"

See iSilo forum comment:
http://forum.isilo.com/showthread.php?t=733&page=1&pp=10

Original comment by c...@www.com on 11 Nov 2007 at 4:04

GoogleCodeExporter commented 9 years ago
Just a little bit longer and we should have iSilo available on our iPhones.
Email from iSilo:

On Nov 8, 2007, at 9:46 AM, iSilo wrote:

We are aware that Apple has made the announcement about the SDK being
available in February and when it does become available, we will look into
it.  We would very much like to have a version for the iPhone/iTouch.

Original comment by c...@www.com on 11 Nov 2007 at 4:06

GoogleCodeExporter commented 9 years ago
Comment 29: iSilo on the iPhone is the one app that I would really like to
see.  Trying to read documents on my Symbian S60 is not easy, but the iPhone
has a much more readable screen.

Original comment by davidkeaveny on 7 Dec 2007 at 2:20

GoogleCodeExporter commented 9 years ago
Just to update this thread a bit.  I've put some effort into reversing the
iSilo format using the 010 Editor (http://www.sweetscape.com/010editor/).
Nothing worth reporting at this point.  The compression/encoding scheme used
in the files is beyond me at the moment.

Original comment by pendorbo...@gmail.com on 19 Dec 2007 at 2:28

GoogleCodeExporter commented 9 years ago
When I try to read "The First Men in the Moon" from Project Gutenberg as a
Plucker file, it doesn't render all of chapter 1, nor does it go any
further.

http://www.gutenberg.org/cache/plucker/1013/1013

version 1.3.7-1

Original comment by jason.po...@gmail.com on 18 Mar 2008 at 3:48

GoogleCodeExporter commented 9 years ago

Original comment by pendorbo...@gmail.com on 11 Jun 2008 at 3:42

GoogleCodeExporter commented 9 years ago

Original comment by pendorbo...@gmail.com on 11 Jun 2008 at 3:44

GoogleCodeExporter commented 9 years ago

Original comment by pendorbo...@gmail.com on 3 Aug 2009 at 9:39