galaxy001 / theunarchiver

Automatically exported from code.google.com/p/theunarchiver

Add support for SHK (Apple IIgs archives) #119

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Can you add the SHK archive for Apple II / IIgs files?

thanks

Original issue reported on code.google.com by webmas...@portugalparadies.de on 8 Jul 2008 at 4:12

GoogleCodeExporter commented 9 years ago
If someone writes the code for it, sure. Personally, I have far too many other 
things that have higher priority.

Original comment by paracel...@gmail.com on 21 Aug 2008 at 3:08

GoogleCodeExporter commented 9 years ago
Also: Submit test cases if you want this implemented.

Original comment by paracel...@gmail.com on 26 Nov 2008 at 2:11

GoogleCodeExporter commented 9 years ago

Original comment by paracel...@gmail.com on 26 Apr 2009 at 1:15

GoogleCodeExporter commented 9 years ago
I'm interested in working on adding nulib2 support in The Unarchiver. How 
should I proceed?

Original comment by mre...@gmail.com on 20 Jun 2010 at 9:43

GoogleCodeExporter commented 9 years ago
Well, the basic structure for archive support in The Unarchiver is to create an 
XAD*Parser class that detects the file format and parses the files it contains. 
Then, create one or more XAD*Handle classes that implement the compression 
algorithms used. The XAD*Parser will instantiate one of the handle classes as 
needed. Unfortunately, I haven't gotten around to documenting this, so you have 
to look at another parser class and try to figure it out from there. You might 
want to look at some simple class, like NSA or cpio.

Then, add the class to the list in XADArchiveParser.m, and it should magically 
work. Use the XADTest2 and XADTest3 command-line utilities while testing; they 
are quite handy.

A few notes, though: I really want to avoid external dependencies in XADMaster 
as far as possible. If nulib2 is of reasonable size and can just be stuffed 
into a subdirectory, that should be fine, though. Also, the abstracted 
filehandle model for compression algorithms puts some limitations on what kinds 
of libraries you can reasonably use. Basically, data is read from archives in 
XADMaster using an interface similar to fread() and company. That means that 
the unpacking algorithm needs to be able to stop in mid-stream, and resume 
later. Some decompression libraries are not built like that, and will write the 
entire output in one go.
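
To illustrate the stop-and-resume requirement, here is a minimal sketch in 
plain C. It is not XADMaster code: the run-length format (pairs of count and 
value bytes) is invented for the example, and the names RLEDecoder, rle_init 
and rle_read are hypothetical. The point is only that all decoder state lives 
in a struct, so the caller can pull any number of bytes at a time, 
fread()-style, and continue later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy format, invented for this sketch: a sequence of (count, value) pairs. */
typedef struct {
    const uint8_t *src;   /* compressed input */
    size_t src_len;       /* number of input bytes */
    size_t src_pos;       /* next unread input byte */
    uint8_t run_value;    /* value of the run in progress */
    uint8_t run_left;     /* bytes still owed from that run */
} RLEDecoder;

void rle_init(RLEDecoder *d, const uint8_t *src, size_t src_len) {
    assert(d != NULL && (src != NULL || src_len == 0));
    d->src = src;
    d->src_len = src_len;
    d->src_pos = 0;
    d->run_value = 0;
    d->run_left = 0;
}

/* Produce up to n bytes. Returns the number produced, which is smaller
   than n only when the input is exhausted. */
size_t rle_read(RLEDecoder *d, uint8_t *out, size_t n) {
    size_t produced = 0;
    while (produced < n) {
        if (d->run_left == 0) {
            if (d->src_len - d->src_pos < 2) break;  /* no complete pair left */
            d->run_left = d->src[d->src_pos];
            d->run_value = d->src[d->src_pos + 1];
            d->src_pos += 2;
            continue;  /* a zero-length run is simply skipped */
        }
        out[produced++] = d->run_value;
        d->run_left--;
    }
    return produced;
}
```

A read that stops in the middle of a run leaves run_left non-zero, and the next 
call picks up from exactly that point. A library that only offers an "unpack 
everything" call has no equivalent of this saved state.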

Well, in the worst case, files for the Apple II should never be big, so this 
can be kludged by unpacking the entire file to a memory buffer, and returning a 
CSMemoryHandle for reading from this buffer.
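
The fallback can be sketched in plain C as follows. This is an illustration 
only and does not use the real CSMemoryHandle API: MemReader, mem_open, 
mem_read, mem_close and the stand-in unpack_all function are all hypothetical, 
and the toy (count, value) format is invented for the example. The whole entry 
is unpacked once into a heap buffer, and reads are then served from that 
buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* A fully unpacked entry held in memory, plus a read cursor. */
typedef struct {
    uint8_t *data;  /* owned buffer holding the whole unpacked file */
    size_t len;
    size_t pos;
} MemReader;

/* Stand-in for a library call that can only unpack everything in one go.
   Here it expands (count, value) pairs from a toy format. With out == NULL
   it only measures the output size. */
static size_t unpack_all(const uint8_t *src, size_t src_len, uint8_t *out) {
    size_t total = 0;
    for (size_t i = 0; i + 1 < src_len; i += 2) {
        if (out != NULL) memset(out + total, src[i + 1], src[i]);
        total += src[i];
    }
    return total;
}

/* Unpack the whole entry up front. Returns 0 on success, -1 if the
   buffer could not be allocated. */
int mem_open(MemReader *r, const uint8_t *src, size_t src_len) {
    assert(r != NULL);
    r->len = unpack_all(src, src_len, NULL);
    r->pos = 0;
    r->data = malloc(r->len ? r->len : 1);
    if (r->data == NULL) return -1;
    unpack_all(src, src_len, r->data);
    return 0;
}

/* fread()-like read from the in-memory copy. */
size_t mem_read(MemReader *r, uint8_t *out, size_t n) {
    size_t left = r->len - r->pos;
    if (n > left) n = left;
    memcpy(out, r->data + r->pos, n);
    r->pos += n;
    return n;
}

void mem_close(MemReader *r) {
    free(r->data);
    r->data = NULL;
    r->len = r->pos = 0;
}
```

The cost is memory proportional to the unpacked size, which is why this only 
makes sense when the files are known to be small.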

There are plenty of other utility classes for the handles, too. CSHandle is the 
superclass, but few classes inherit directly from this. CSStreamHandle 
implements an interface for non-seekable streams: A read function, and a reset 
function to restart from the start of the stream (and seeking is handled 
automatically by restarting if needed). This, in turn, has further convenience 
classes, such as CSByteStreamHandle for algorithms that want to return single 
bytes one by one, and CSBlockStreamHandle for algorithms that unpack in blocks.
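
The "seek by restarting" idea can be sketched in plain C like this. It is a 
model of the concept, not the CSStreamHandle implementation: the Stream struct, 
stream_read, stream_seek and the Counter example decoder are hypothetical 
names. A decoder supplies only a reset function and a produce function; 
seeking backwards resets and re-reads, and seeking forwards reads and discards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A forward-only stream: the decoder supplies reset and produce. */
typedef struct Stream {
    void (*reset)(struct Stream *s);                              /* rewind */
    size_t (*produce)(struct Stream *s, uint8_t *out, size_t n);  /* decode */
    size_t pos;    /* current logical offset in the output */
    void *state;   /* decoder-specific state */
} Stream;

size_t stream_read(Stream *s, uint8_t *out, size_t n) {
    size_t got = s->produce(s, out, n);
    s->pos += got;
    return got;
}

/* Seek to an absolute offset. Returns 0 on success, or -1 if the stream
   ends before the target is reached. */
int stream_seek(Stream *s, size_t target) {
    uint8_t scratch[256];
    if (target < s->pos) {     /* going backwards: restart from the start */
        s->reset(s);
        s->pos = 0;
    }
    while (s->pos < target) {  /* going forwards: read and discard */
        size_t want = target - s->pos;
        if (want > sizeof scratch) want = sizeof scratch;
        if (stream_read(s, scratch, want) == 0) return -1;
    }
    return 0;
}

/* Example decoder: emits the bytes 0, 1, 2, ... (mod 256) up to a limit. */
typedef struct { size_t next, limit; } Counter;

static void counter_reset(Stream *s) { ((Counter *)s->state)->next = 0; }

static size_t counter_produce(Stream *s, uint8_t *out, size_t n) {
    Counter *c = s->state;
    size_t i = 0;
    while (i < n && c->next < c->limit) out[i++] = (uint8_t)c->next++;
    return i;
}

Stream counter_stream(Counter *c, size_t limit) {
    Stream s = { counter_reset, counter_produce, 0, c };
    assert(c != NULL);
    c->next = 0;
    c->limit = limit;
    return s;
}
```

With this split, a decoder author never writes seeking code, at the price of 
backward seeks costing a full re-decode from the start.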

Well, feel free to ask for further help once you have a closer look at it all.

Original comment by paracel...@gmail.com on 20 Jun 2010 at 10:02

GoogleCodeExporter commented 9 years ago
Thanks a lot!

Original comment by mre...@gmail.com on 22 Jun 2010 at 6:02

GoogleCodeExporter commented 9 years ago
Hello there!

I've finally managed to find some time and start working on this. I've been 
able to understand NuLib and NuFX, but I'm having a hard time figuring out the 
Unarchiver.

So far I've built a XADSHKParser class for .shk archives, and the file 
recognition is indeed working (tested with XADTest2). But I'm unable to 
understand the function of the [x parse] call. I'm also not quite able to 
understand the [x rawHandleForEntryWithDictionary] call. What do the entries of 
this dictionary represent? My biggest issue is understanding the point where 
the parser gets to start the extraction of the files.

About NuFX, the library supports individual file access à la fopen so I don't 
think it should be very hard to include in The Unarchiver. Some bad/good news: 
the library is approx. 560 kB, but I've checked and a lot can be left out as 
there is no need to build/modify archives. Also, a lot of the functionality is 
repeated in XADMaster.

One last issue: NuFX can decode Apple IIGS .sea archives, which unfortunately 
are not compatible with the Macintosh .sea archives that The Unarchiver can 
already decode. Is it possible to have overlapping file extensions in The 
Unarchiver?

Finally, I wish I had more time to understand how The Unarchiver works, but I 
guess it's easier to just ask for help.

Cheers,

Marc.-

Original comment by mre...@gmail.com on 29 Jun 2010 at 10:18

GoogleCodeExporter commented 9 years ago
All right, the program flow basically goes like this (don't rely on me getting 
all the method names correct here):

* Client program requests an archive parser for a file.
-> XADArchiveParser reads the first part of the file, and then goes through the 
list of available parsers, running their detection method, until it finds one 
that matches.

* Client program sets a delegate for the parser, which will receive all the 
information from the parser.

* Client program calls [parser parse] to start parsing the archive.
-> The subclass starts scanning through the archive.
-> The subclass builds dictionaries for each entry with as many of the standard 
keys as it can figure out, and whatever else it needs to keep around for later 
(like compression methods and settings, or anything else it wants to make 
optionally available).
-> The subclass calls [self addEntryWithDictionary:] for each entry it has 
created. This causes some further processing of the dictionary to automatically 
fill in certain details, and then the dictionary is passed to the client 
program's delegate class.

* The client program's delegate class either stores the dictionary for later, 
or it calls [parser handleForDictionary:] to receive a handle to decompress the 
file.
-> The subclass may therefore still be calling [self addEntryWithDictionary:] 
when its [self handleForDictionary:] is called, or it may happen after parsing 
has ended.
-> The subclass instantiates a suitable decompression handle for the entry it 
is given in handleForDictionary:
-> The subclass usually uses a convenience method to create a sub-handle that 
reads just from the part of the file that contains the data stream, like 
handleAtDataOffsetForDictionary which uses the XADDataOffset and XADDataLength 
keys to identify the area to use.

* The client program reads data from the handle.
-> When addEntryWithDictionary: finally returns, the file pointer has probably 
moved, because of the reading. Either be aware of this, or use the 
retainFilePosition: argument to have it automatically restored.

* Once the archive is entirely parsed, the subclass exits its parse method. The 
client might call handleForDictionary: after this point, too.
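
The flow above can be modelled in plain C as a rough sketch. Nothing here is 
XADMaster API: the archive layout is a toy format invented for the example, and 
Entry, parse_archive, read_entry, EntryList and collect_entry are hypothetical 
names. The Entry struct stands in for the entry dictionary, the callback stands 
in for the delegate, and the offset and length fields play the role described 
for XADDataOffset and XADDataLength.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per-entry record, standing in for the entry dictionary. */
typedef struct {
    char name[256];
    size_t data_offset;  /* where the entry's data starts in the archive */
    size_t data_length;  /* how many bytes of data it has */
} Entry;

/* Delegate callback: called once per entry while parsing is in progress. */
typedef void (*EntryCallback)(const Entry *e, void *context);

/* Toy format, invented here: [name_len][name][data_len][data] repeated.
   Returns the number of entries reported, or -1 on a truncated archive. */
int parse_archive(const uint8_t *buf, size_t len, EntryCallback cb,
                  void *context) {
    size_t pos = 0;
    int count = 0;
    assert(cb != NULL);
    while (pos < len) {
        Entry e;
        size_t name_len = buf[pos++];
        if (len - pos < name_len + 1) return -1;
        memcpy(e.name, buf + pos, name_len);
        e.name[name_len] = '\0';
        pos += name_len;
        e.data_length = buf[pos++];
        if (len - pos < e.data_length) return -1;
        e.data_offset = pos;
        pos += e.data_length;  /* the parser keeps its own position */
        cb(&e, context);       /* hand the entry to the client */
        count++;
    }
    return count;
}

/* Copy out an entry's raw data using only its offset and length. */
size_t read_entry(const uint8_t *buf, const Entry *e, uint8_t *out, size_t n) {
    if (n > e->data_length) n = e->data_length;
    memcpy(out, buf + e->data_offset, n);
    return n;
}

/* Example delegate: stores entries so they can be extracted after parsing. */
typedef struct { Entry entries[8]; int used; } EntryList;

void collect_entry(const Entry *e, void *context) {
    EntryList *list = context;
    if (list->used < 8) list->entries[list->used++] = *e;
}
```

Because each Entry carries its own offset and length, the client can call 
read_entry either from inside the callback or long after parse_archive has 
returned, which mirrors the two cases described in the list. This sketch reads 
from a memory buffer, so the file-position issue mentioned above does not 
arise here.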

As for the other issues:

* rawHandleForEntryWithDictionary: is used by XADMacArchiveParser, which is a 
convenience class for archives that may contain ditto or MacBinary-format 
files. SHK is not one of those, so you should not use it. The name is a bit 
confusing: it is not used for all Mac archive formats, but mostly for non-Mac 
archive formats that use tricks to store Mac files. Implement 
handleForDictionary: instead.

* Looking at file extensions at all is strongly discouraged. XADMaster 
supports unpacking from any abstract stream, so there might not even be a 
filename. Whenever possible, the detection method should look at the file 
contents it is passed for magic numbers, or use other heuristics, to find out 
if the file is one it can handle. Only when this is entirely impossible should 
it look at file extensions. Almost all archive parsers detect by content like 
this, and it avoids problems with filename conflicts.
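
For SHK specifically, content-based detection could look roughly like the 
plain C sketch below. The function name looks_like_nufx is hypothetical and 
this is not the XADMaster detection interface. The six bytes are the NuFX 
master header ID ("NuFile" with the high bit set on alternating characters). 
The sketch assumes a bare NuFX archive; wrapped or self-extracting variants 
would need additional checks that are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "NuFile" with the high bit set on alternating characters: the ID at the
   start of the NuFX master header. */
static const uint8_t NUFX_MASTER_ID[6] = {0x4E, 0xF5, 0x46, 0xE9, 0x6C, 0xE5};

/* Returns 1 if the buffer starts with the NuFX master header ID, else 0.
   Only the bytes are inspected; no filename or extension is needed. */
int looks_like_nufx(const uint8_t *head, size_t len) {
    assert(head != NULL || len == 0);
    if (len < sizeof NUFX_MASTER_ID) return 0;
    return memcmp(head, NUFX_MASTER_ID, sizeof NUFX_MASTER_ID) == 0;
}
```

A check like this cannot confuse an Apple IIGS archive with a Macintosh one 
that shares the same extension, since the extension is never consulted.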

Original comment by paracel...@gmail.com on 29 Jun 2010 at 10:54