fadden / CiderPress2

Tool for working with Apple II and vintage Mac disk images and file archives.
https://ciderpress2.com/
Apache License 2.0

Enhancement Request: Add ability to export CFFA1 disk image in woz monitor loadable format #17

Closed tmcintos closed 2 months ago

tmcintos commented 3 months ago

It would be great if there were a way to export the files in the CFFA1 disk image using the hex input format of the Apple-1's Woz monitor:

http://retro.hansotten.nl/uploads/apple1/ULTIMATE%20APPLE1%20CFFA%203.3.po

It seems that this disk image contains binary images of the programs, with the load address specified by the Auxtyp field in the catalog listing in the case of file types $F1 and BIN:

Type Auxtyp Modified          Length  Storage *Name                             
---- ------ --------------- -------- -------- ----------------------------------
DIR         30-Sep-12 19:48      512      512  ASOFT
BIN  $6000  [No Date]           8037     8704  ASOFT:APPLESOFT
$F8  $0801  [No Date]           7160     7680  ASOFT:LEMO
DIR         30-Sep-12 19:48     2048     2048  BASIC
$F1  $0800  [No Date]           2560     3072  BASIC:ACEYDUCEY
$F1  $0300  [No Date]          16128     7168  BASIC:AMAZING
BIN  $E000  06-Oct-06 15:45     4096     4608  BASIC:BASIC
$F1  $0800  [No Date]           2560     3072  BASIC:BATNUM
$F1  $0800  [No Date]           2560     3072  BASIC:BLACKJACK
$F1  $0300  [No Date]           3840     4608  BASIC:BOWLING
$F1  $0300  [No Date]           3840     4608  BASIC:BUZZWORD
$F1  $02BC  [No Date]          16196    16896  BASIC:CHECKERS
$F1  $0800  [No Date]           2560     3072  BASIC:CHEMIST
$F1  $0300  [No Date]          16128    16896  BASIC:COLUMN
$F1  $0300  [No Date]          16128    16896  BASIC:CONCENTRATION
$F1  $0300  [No Date]           3840     4608  BASIC:CRAPS
$F1  $0300  [No Date]          16128    16896  BASIC:DEAL
$F1  $0300  [No Date]          16128    16896  BASIC:ELIZA
$F1  $0200  [No Date]          32767    33280  BASIC:FACTORIALS
$F1  $0800  [No Date]          10752    11264  BASIC:FOOTBALL
$F1  $0800  [No Date]           2560     2048  BASIC:GETKEY
$F1  $0300  [No Date]           3840     4608  BASIC:GOMOKO
$F1  $0300  [No Date]           3840     4608  BASIC:HAMMURABI
$F1  $0400  [No Date]           7680     8192  BASIC:HANGMAN
$F1  $0800  [No Date]           5464     6144  BASIC:HOROSCOPE.H7000
$F1  $0400  [No Date]           7680     8192  BASIC:HURKLE
$F1  $0800  [No Date]           2560     3072  BASIC:INTEGER.MATH
$F1  $0400  [No Date]           7680     4096  BASIC:LABYRINTH
$F1  $0300  [No Date]           3840     4608  BASIC:LUNARLANDER
$F1  $0800  [No Date]           2560     3072  BASIC:MATCHES
$F1  $0800  [No Date]           2560     2048  BASIC:MATRIX
$F1  $0800  [No Date]           2560     3072  BASIC:NICOMA
$F1  $0800  [No Date]           2560     3072  BASIC:NUMBER
$F1  $0800  [No Date]           2560     3072  BASIC:PRIME.FINDER
$F1  $0300  [No Date]           7936     8704  BASIC:QUEEN
$F1  $0800  [No Date]           2560     3072  BASIC:RESISTOR
$F1  $0800  [No Date]           2560     2048  BASIC:REVERSE
$F1  $0800  [No Date]           2560     2048  BASIC:ROCKPAPERSCIS
$F1  $0300  [No Date]           3840     4608  BASIC:SLOTS
$F1  $0800  [No Date]           2560     3072  BASIC:SQUARES
$F1  $0300  13-Oct-06 13:16     3840     4608  BASIC:STARTREK
$F1  $0300  [No Date]          16128    16896  BASIC:STARTREK2003
$F1  $0800  [No Date]           2560     2560  BASIC:STOPWATCH
$F1  $0300  13-Oct-06 13:16     3840     4608  BASIC:SUDOKU
$F1  $0800  [No Date]           2560     3072  BASIC:TICTACTOE
$F1  $0800  [No Date]           2560     3072  BASIC:WORD
$F1  $0300  [No Date]          16128    16384  BASIC:WORDSEARCH
$F1  $0400  13-Oct-06 13:16     7680     8192  BASIC:WUMPUS
$F1  $0800  [No Date]           4464     4608  BASIC:ZOOP.H6000
DIR         30-Sep-12 19:48     1536     1536  FORTH
TXT  $0000  12-Oct-10 22:14     1024     1536  FORTH:BLOCK0001
TXT  $0000  12-Oct-10 22:14     1024     1536  FORTH:BLOCK0004
TXT  $0000  21-Sep-10 21:13     1024     1536  FORTH:BLOCK0005
TXT  $0000  23-Sep-10 22:45     1024     1536  FORTH:BLOCK000F
TXT  $0000  25-Sep-10 16:19     1024     1536  FORTH:BLOCK0050
TXT  $0000  25-Sep-10 16:19     1024     1536  FORTH:BLOCK0051
TXT  $0000  25-Sep-10 16:19     1024     1536  FORTH:BLOCK0052
TXT  $0000  25-Sep-10 16:19     1024     1536  FORTH:BLOCK0053
TXT  $0000  25-Sep-10 16:19     1024     1536  FORTH:BLOCK0054
TXT  $0000  25-Sep-10 16:19     1024     1536  FORTH:BLOCK0055
TXT  $0000  25-Sep-10 16:19     1024     1536  FORTH:BLOCK0056
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00C8
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00C9
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00CA
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00CB
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00CC
TXT  $0000  12-Oct-10 22:14     1024     1536  FORTH:BLOCK00CD
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00CE
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00CF
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00D0
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00D1
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00D2
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00D3
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00D4
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00D5
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00D6
TXT  $0000  12-Oct-10 22:14     1024     1536  FORTH:BLOCK00D7
TXT  $0000  12-Oct-10 22:14     1024     1536  FORTH:BLOCK00D8
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00D9
TXT  $0000  25-Sep-10 13:56     1024     1536  FORTH:BLOCK00DA
NON  $0000  31-Oct-10 14:39     6148     3072  FORTH:DSSTORE
BIN  $0240  31-Oct-10 14:05     6535     7168  FORTH:F1
TXT  $0000  31-Oct-10 14:04    57461    58368  FORTH:F1.S
TXT  $0000  10-Sep-05 00:19    14435    15360  FORTH:FIGDOC.TXT
TXT  $0000  21-Apr-07 07:46    53170    53760  FORTH:GLOSSARY.TXT
TXT  $0000  12-Sep-10 07:57       59      512  FORTH:NAME
DIR         30-Sep-12 19:48     1024     1024  LANGS
BIN  $7000  [No Date]           4045     4608  LANGS:A1ASM.V1.7000
BIN  $E000  [No Date]           4045     4608  LANGS:A1ASM.V1.E000
BIN  $6000  [No Date]           8037     8704  LANGS:APPLESOFT
BIN  $E000  06-Oct-06 15:45     4096     4608  LANGS:BASIC
BIN  $0800  05-Oct-06 17:08      512      512  LANGS:DISASSEMBLER
BIN  $5800  [No Date]          10190    10752  LANGS:EBASIC110L.5800
BIN  $5800  [No Date]          10190    10752  LANGS:EBASIC110U.5800
BIN  $5000  [No Date]          10411    11264  LANGS:EBASIC222L.7825
BIN  $5000  [No Date]          10411    11264  LANGS:EBASIC222U.7825
BIN  $5000  [No Date]          10706    11264  LANGS:EBASIC222CFFAL
BIN  $5000  [No Date]          10706    11264  LANGS:EBASIC222CFFAU
BIN  $0300  06-Oct-06 16:39     6294     7168  LANGS:FIGFORTH110
BIN  $0280  [No Date]           9772    10240  LANGS:JMON1.0
BIN  $7100  [No Date]           3839     4608  LANGS:KRUSADER1.2
BIN  $7100  [No Date]           3512     4096  LANGS:KRUSADER1.3
BIN  $6000  [No Date]           8162     8704  LANGS:MSBASIC.7D0D
BIN  $0300  [No Date]          14255    14848  LANGS:VOLKSFORTH38
BIN  $02E2  [No Date]           2079     3072  LANGS:SA2K.06B8
BIN  $7600  [No Date]           2489     3072  LANGS:TBASIC10L
BIN  $7600  [No Date]           2489     3072  LANGS:TBASIC10U
BIN  $1000  [No Date]          13449    14336  LANGS:ULTRAFORTH83
BIN  $1000  [No Date]          13472    14336  LANGS:VOLKSFORTH
DIR         30-Sep-12 19:48     1024     1024  MCODE
BIN  $0280  [No Date]          11656    12288  MCODE:ADVENTURE
BIN  $0280  [No Date]           3456     4096  MCODE:APPLE30TH
BIN  $0300  [No Date]            557     1536  MCODE:CELLULAR
BIN  $E000  [No Date]           4021     4608  MCODE:CODEBREAKER.PT1
BIN  $0280  [No Date]           2179     3072  MCODE:CODEBREAKER.PT2
BIN  $0280  [No Date]           6199     7168  MCODE:CODEBREAKER.REP
BIN  $0280  [No Date]             23      512  MCODE:HELLOWORLD
BIN  $0300  13-Oct-06 13:16      431      512  MCODE:HUNDRED.0474
BIN  $2000  06-Oct-06 16:42      440      512  MCODE:LIFE
BIN  $0300  [No Date]           4541     5120  MCODE:LITTLETOWER
BIN  $0300  02-Oct-06 17:07     5817     2560  MCODE:LUNARLANDER
BIN  $0300  06-Oct-06 17:02      177      512  MCODE:MASTERMIND
BIN  $0300  [No Date]           2248     3072  MCODE:MICROCHESS1
BIN  $1000  [No Date]           2304     3072  MCODE:MICROCHESS2
BIN  $0300  [No Date]            500      512  MCODE:NIM.04AF
BIN  $0300  [No Date]            550     1536  MCODE:PASART
BIN  $0280  [No Date]           4794     5632  MCODE:TYPINGTUTOR
BIN  $0280  [No Date]          17072    17920  MCODE:YUM
BIN  $1D00  [No Date]           3636     4608  MCODE:WOZFP.2394
DIR         30-Sep-12 19:48      512      512  UTILS
BIN  $0280  [No Date]            290      512  UTILS:MEMORYTEST
BIN  $07FD  [No Date]           2051     3072  UTILS:MONITOR
BIN  $0300  [No Date]             16      512  UTILS:TEST

fadden commented 3 months ago

Can you tell me more about what it is you're trying to do?

If the goal is to get the output into a specific hex dump format, we may be able to do that with something like "xxd". For example:

% cat foo
This is a test of xxd dumps.
% xxd -o 768 -u -g1 foo | cut -c5-57
0300: 54 68 69 73 20 69 73 20 61 20 74 65 73 74 20 6F
0310: 66 20 78 78 64 20 64 75 6D 70 73 2E 0A

tmcintos commented 2 months ago

The intent was to extract the binaries from the disk image and load them into an Apple-1 replica or emulator via the console.

This would involve formatting the binaries as a hexdump in “Woz monitor” deposit command syntax, with the proper load address taken from the Auxtype field.

For an example of what this usually looks like, see the .txt files here:

https://linuxcoffee.com/apple1/software/notes.html

I suppose it may be possible to accomplish this with some scripting/xxd magic around cp2, but I was thinking it would be nice if there were a builtin converter for export that just did it without additional tooling. I would suspect it may be useful for Apple-1 enthusiasts.

fadden commented 2 months ago

There's a bit of variation in the files, e.g.:

All of them end with an execution command, either R or, for the BASIC programs, E2B3R and RUN. The files on https://linuxcoffee.com/apple1/software/ftp/ are pretty varied in format, often with longer lines, and generally lack the execution command.

So I'm a little unclear on what exactly needs to be output for any given file, or where the stuff at 004A comes from.

Mechanically, this would be easiest to do as an export format that isn't part of the "best" set. Something like the existing hex-dump conversion, which needs to be chosen by the user rather than selected automatically based on file characteristics. This would be a little awkward to use in the GUI -- you'd need to open the entry in the file viewer, select the format, and click Export to save it -- but easy in the CLI.

tmcintos commented 2 months ago

Thanks for looking into this!

There's a bit of variation in the files [...] So I'm a little unclear on what exactly needs to be output for any given file [...]

Right, because this is essentially just a script to be fed into the ROM monitor ("wozmon"), which has a fairly flexible command syntax (similar to Apple II, but different and more minimalist). Once you've specified the initial memory location in a deposit command, it auto-increments, so you don't need to specify it in subsequent deposit commands, which may simply begin with a colon. Taking advantage of this and omitting unnecessary addresses would improve performance when loading via the Apple-1 console.
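A minimal Python sketch of the auto-increment idea described above: only the first deposit line carries an address, and every later line starts with a bare colon. The function name `wozmon_deposit` and its parameters are hypothetical, not part of cp2 or any existing tool.

```python
def wozmon_deposit(data: bytes, load_addr: int, bytes_per_line: int = 8) -> list[str]:
    """Format data as Woz monitor deposit commands.

    Only the first line names the address; later lines begin with ':' and
    rely on the monitor's auto-incrementing store address.
    """
    lines = []
    for offset in range(0, len(data), bytes_per_line):
        chunk = data[offset:offset + bytes_per_line]
        hex_bytes = " ".join(f"{b:02X}" for b in chunk)
        prefix = f"{load_addr:04X}:" if offset == 0 else ":"
        lines.append(f"{prefix} {hex_bytes}")
    return lines


for line in wozmon_deposit(b"This is a test", 0x0300):
    print(line)
```

Dropping the repeated address saves four characters per line, which is where the loading-time improvement mentioned above would come from.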

You can find an overview of the command syntax on pages 3-4 of the Apple-1 manual (though keep in mind there are a few minor errors in that manual):

http://apple1.chez.com/Apple1project/Docs/pdf/AppleI_Manual.pdf

or on the SB-projects page I linked above (I'm sure there are other good references, but these are my "go to" ones):

https://www.sbprojects.net/projects/apple1/wozmon.php

I believe the hex values for each byte need to be separated by at least one space. The command length is limited only by the size of the key input buffer, which I believe is 128 bytes ($0200..$027F), IIRC. That said, I think the convention is 8 bytes per line. I have seen files with 16 bytes per line, but that exceeds the width of the 40-character display and causes wrapping, so it's probably best to stick to 8 bytes per line.
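The line-width argument can be checked with a little arithmetic. This sketch assumes a four-digit address, a colon, and one space plus two hex digits per byte; the 40-column and 128-byte figures are the ones recalled in the comment above, not independently verified.

```python
DISPLAY_COLUMNS = 40       # display width cited above
INPUT_BUFFER_BYTES = 128   # $0200..$027F, as recalled above


def deposit_line_length(bytes_per_line: int, with_address: bool = True) -> int:
    """Characters in one deposit line: 'AAAA:' or ':' prefix, then ' XX' per byte."""
    prefix = 5 if with_address else 1
    return prefix + 3 * bytes_per_line


for n in (8, 16):
    length = deposit_line_length(n)
    verdict = "fits" if length <= DISPLAY_COLUMNS else "wraps"
    print(f"{n} bytes/line -> {length} characters ({verdict})")
```

Under these assumptions an 8-byte line is 29 characters and a 16-byte line is 53, so the longer form wraps on the display while still staying well inside the input buffer.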

All of them end with an execution command, either R or, for the BASIC programs, E2B3R and RUN.

Yes, though if you look at files distributed elsewhere on the Internet, this is not always the case. It is mostly a user convenience, except in the case of a machine language program with an entry point not equal to the load address, which I've never found an example of. In my emulator, I've implemented heuristics to automatically append the appropriate run command if it is missing.

or where the stuff at 004A comes from.

As you've surmised, these are dumps of BASIC programs already loaded in memory. Apple-1 BASIC apparently stores some state in the zero page starting at $4A, with the program typically starting at $0800, so when people manually dump these it's typical to save $4A..$FF and $0800 to HIMEM.

From https://www.sbprojects.net/projects/apple1/a1basic.php:

Now you must save two address ranges, one from $4A to $FF, and one from $0800 to $0FFF.

This is essentially LOMEM..HIMEM, as detailed here: http://jefftranter.blogspot.com/2012/03/lomem-himem-and-saving-basic-programs.html
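As a rough illustration of the two-range save described above, the following sketch slices both ranges out of a flat memory image. It assumes index 0 of the image corresponds to address $0000 and uses the $0800..$0FFF range from the quoted page as a default; `basic_save_ranges` is a hypothetical helper.

```python
ZERO_PAGE_STATE = (0x004A, 0x00FF)   # BASIC state in the zero page, inclusive
DEFAULT_PROGRAM = (0x0800, 0x0FFF)   # program range quoted above, inclusive


def basic_save_ranges(memory: bytes, program=DEFAULT_PROGRAM):
    """Return [(start_addr, data), ...] for the two ranges of a saved BASIC program.

    `memory` is assumed to be a flat image whose index 0 is address $0000.
    """
    ranges = []
    for start, end in (ZERO_PAGE_STATE, program):
        if end >= len(memory):
            raise ValueError(f"image too short for ${start:04X}..${end:04X}")
        ranges.append((start, memory[start:end + 1]))
    return ranges


image = bytes(0x1000)  # placeholder 4 KiB image filled with zeros
for start, data in basic_save_ranges(image):
    print(f"${start:04X}: {len(data)} bytes")
```

A different LOMEM..HIMEM setting would be passed via the `program` argument.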

However, for extracting the binary images from the disk image referenced above, I have assumed that these details of dumping BASIC programs do not matter.

My assumption has been that the files of type BIN, $F1 and $F8 were all essentially stored as a raw binary memory image, with load address specified in the AuxType field and the file type only distinguishing what sort of run command was needed to start the program (typically R command at load address for machine language programs, or E2B3R [BASIC warm start] and RUN for BASIC programs, as you noted above).
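Under that assumption, picking the trailing run command could look like the sketch below. The mapping is only the working hypothesis stated above, `run_commands` is a hypothetical name, and only BIN and $F1 are handled because the thread does not establish a run command for other types.

```python
def run_commands(file_type: str, aux_type: int) -> list[str]:
    """Lines to append after the deposit commands, under the assumption above."""
    if file_type == "BIN":
        # Machine language: run from the load address.
        return [f"{aux_type:04X}R"]
    if file_type == "$F1":
        # BASIC: warm-start the interpreter (E2B3R, as given earlier in the thread), then RUN.
        return ["E2B3R", "RUN"]
    raise ValueError(f"no run command known for file type {file_type}")


print(run_commands("BIN", 0x0280))
print(run_commands("$F1", 0x0800))
```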

However, perhaps there's more to it and further reverse engineering may be required, at least in the case of BASIC images. There is only one file of type $F8 and I'm not sure how that differs from type $F1. Also, it seems the (presumed) load address + file size exceeds 32KiB (maximum RAM for an Apple-1) in some cases (e.g. BASIC:FACTORIALS), so I don't know whether that is an error, or a subtlety that I don't yet understand.
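The size concern can be made concrete with a quick check against the catalog listing. This sketch treats RAM as a single contiguous 32 KiB region starting at $0000, which is a simplification: files with an aux type of $E000 would also be flagged by it. The three entries are copied from the listing; the helper name is hypothetical.

```python
RAM_LIMIT = 0x8000  # 32 KiB, the Apple-1 maximum cited above

# (name, aux type, length) copied from the catalog listing
ENTRIES = [
    ("BASIC:FACTORIALS", 0x0200, 32767),
    ("BASIC:CHECKERS", 0x02BC, 16196),
    ("MCODE:YUM", 0x0280, 17072),
]


def overruns_ram(aux_type: int, length: int, limit: int = RAM_LIMIT) -> bool:
    """True if a file loaded at aux_type would extend past the RAM limit."""
    return aux_type + length > limit


for name, aux, length in ENTRIES:
    last = aux + length - 1
    flag = "OVERRUN" if overruns_ram(aux, length) else "ok"
    print(f"{name}: ${aux:04X}..${last:04X} {flag}")
```

For BASIC:FACTORIALS, $0200 plus 32767 bytes ends 511 bytes past the 32 KiB mark, so either the aux type is not a load address or the stored length is not the real program size.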

Mechanically, this would be easiest to do as an export format that isn't part of the "best" set. Something like the existing hex-dump conversion, which needs to be chosen by the user rather than selected automatically based on file characteristics. This would be a little awkward to use in the GUI -- you'd need to open the entry in the file viewer, select the format, and click Export to save it -- but easy in the CLI.

That's along the lines of what I was thinking. I'm using the CLI on a Mac, so I don't know how this works in the GUI (Having a Mac GUI would be awesome, BTW ;).

fadden commented 2 months ago

This seems straightforward to implement. However...

All of the $F1 files I've looked at appear to be captured from memory offset $0000. If you look at the hex dumps you'll notice e.g.:

0000f0: 00 01 7c 01 00 00 ff ff 00 00 00 00 00 20 00 ed ··|·········· ·m

That "20 00 ed" at the end also appears at $00fd in some of the BASIC pieces that start at $004a. The file also has a large stretch of uninitialized memory (alternating 00 00 ff ff) up to $1000, so it's clearly a memory dump. However, the stuff at $0200 is ASCII text, not executable code, so the $0200 aux type doesn't seem to indicate a load address or a start address.

If you change the file type of the $F8 file to $FC (BAS), you will get a formatted Applesoft BASIC program... almost. The keywords aren't the same, so it looks a bit screwy:

 10  DRAW  <<< LEMONADE STAND >>>
 15  DRAW 
 20  DRAW  FROM AN ORIGINAL PROGRAM
[...]
620  ROT= "(YOUR MOTHER QUIT GIVING YOU FREE SUGAR)"

tmcintos commented 2 months ago

However, the stuff at $0200 is ASCII text, not executable code, so the $0200 aux type doesn't seem to indicate a load address or a start address.

Good observation. $0200 is the key input buffer, so ASCII makes sense there, but not sure about the AuxType value's meaning. In most cases, I thought it looked like the LOMEM value for BASIC files. Perhaps LOMEM was set incorrectly?

If you change the file type of the $F8 file to $FC (BAS), you will get a formatted Applesoft BASIC program... almost. The keywords aren't the same, so it looks a bit screwy:

Is it tokenized Integer BASIC? I think the Apple(-1) BASIC tokens are the same, but I forget.

fadden commented 2 months ago

Integer BASIC has a different structure (see https://github.com/fadden/CiderPress2/blob/main/FileConv/Code/BASIC-notes.md). According to https://retrocomputing.stackexchange.com/a/395/56 there was an early version of Applesoft that lacked the hi-res commands, but the ASOFT/APPLESOFT binary on the disk image doesn't have the lo-res commands either. I'm guessing it's an earlier version of Microsoft BASIC, since it has the same file format and appears to call $00B1 to fetch data, but the work on Applesoft didn't start until the Apple II was released (https://www.apple2history.org/history/ah16/#04).

tmcintos commented 2 months ago

Oh right, I had momentarily forgotten that file was in the ASOFT folder, sorry.

The AppleSoft version for Apple-1 is a modern port called AppleSoft Lite. Source code is available here: https://cowgod.org/replica1/applesoft/. The author’s notes confirm your findings:

Tokenized Applesoft files from an Apple II will not work, nor will Applesoft Lite files work on an Apple II. If you'd like to try and load an Apple II program, use the source listing rather than a SAVEd file. Unsupported tokens will result in syntax errors.

Tokenized Integer files will not work either, but the source code may work to some degree.

fadden commented 2 months ago

Here's what I'm thinking:

While this can be automated, it's sounding less and less like something that should be done by a CP2 file converter. It's more like a python script that takes some arguments about program type and location, and you run it a few times until everything looks right.
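A sketch of what the command-line surface of such a script might look like, using Python's standard `argparse`. Every option name here (`--addr`, `--kind`, `--bytes-per-line`) is made up for illustration, and the file name in the example is a placeholder; nothing below exists in cp2.

```python
import argparse


def parse_address(text: str) -> int:
    """Accept '$0300', '0x300' or bare hex '0300', matching the catalog's $ notation."""
    cleaned = text.strip().lower().removeprefix("$").removeprefix("0x")
    value = int(cleaned, 16)  # a ValueError here is reported by argparse as a usage error
    if not 0 <= value <= 0xFFFF:
        raise argparse.ArgumentTypeError(f"address out of range: {text}")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical wozmon export helper")
    parser.add_argument("file", help="binary extracted from the disk image")
    parser.add_argument("--addr", type=parse_address, required=True,
                        help="load address, e.g. '$0300'")
    parser.add_argument("--kind", choices=("mcode", "basic"), default="mcode",
                        help="program type, which decides the run command")
    parser.add_argument("--bytes-per-line", type=int, default=8)
    return parser


# Example invocation with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["LIFE.bin", "--addr", "$2000"])
print(args.file, hex(args.addr), args.kind, args.bytes_per_line)
```

On a real shell the `$` form needs single quotes (`--addr '$2000'`) so that it is not expanded as a variable.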

I'm also a little concerned about whether people will understand what to do with the output. When I generate a text listing or PNG file, it's pretty obvious. An Apple I monitor hex dump requires some background, and if I start getting questions I won't know the answers.

tmcintos commented 2 months ago

Makes sense. Along the lines of what you are thinking, for the $F1 file type I still think the AuxType field represents the LOMEM value, i.e., where to start loading data after skipping the stack page and the key input buffer at $0200-$027F. Data between there and LOMEM is presumably unused, so there is no need to copy it in.

For the BIN file type I'm certain that the AuxType field represents the load and run address, based on several known programs that I recognize.

So to summarize, I suspect it's this:

Maybe I'll try to script something if/when I find the time. Thanks for looking into this.