Closed tmcintos closed 6 months ago
Can you tell me more about what it is you're trying to do?
If the goal is to get the output into a specific hex dump format, we may be able to do that with something like "xxd". For example:
% cat foo
This is a test of xxd dumps.
% xxd -o 768 -u -g1 foo | cut -c5-57
0300: 54 68 69 73 20 69 73 20 61 20 74 65 73 74 20 6F
0310: 66 20 78 78 64 20 64 75 6D 70 73 2E 0A
The intent was to extract the binaries from the disk image and load them into an Apple-1 replica or emulator via the console.
This would involve formatting the binaries as a hexdump in “Woz monitor” deposit command syntax, with the proper load address taken from the Auxtype field.
For an example of what this usually looks like, see the .txt files here:
https://linuxcoffee.com/apple1/software/notes.html
I suppose it may be possible to accomplish this with some scripting/xxd magic around cp2, but I was thinking it would be nice if there were a builtin converter for export that just did it without additional tooling. I would suspect it may be useful for Apple-1 enthusiasts.
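For reference, the conversion being asked for could be sketched roughly like this in Python. This is only an illustration under stated assumptions: `to_wozmon` is a hypothetical helper, not anything in cp2, and it assumes the file contents and the load address (from the aux type) have already been obtained some other way. It writes the address on every line, as the .txt files linked above usually do.

```python
def to_wozmon(data: bytes, load_addr: int, per_line: int = 8) -> str:
    """Format raw bytes as Woz monitor deposit commands (hypothetical helper).

    Each line has the form "AAAA: XX XX ...", with the address repeated
    on every line and uppercase hex throughout.
    """
    lines = []
    for off in range(0, len(data), per_line):
        chunk = data[off:off + per_line]
        hexbytes = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"{load_addr + off:04X}: {hexbytes}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Same bytes as the start of the xxd example, loaded at $0300.
    print(to_wozmon(b"This is a test", 0x0300))
```

No run command is appended here; that would need to be decided separately per file type.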
There's a bit of variation in the files, e.g.:
All of them end with an execution command, either R or, for the BASIC programs, E2B3R and RUN. The files on https://linuxcoffee.com/apple1/software/ftp/ are pretty varied in format, often with longer lines, and generally lack the execution command.
So I'm a little unclear on what exactly needs to be output for any given file, or where the stuff at 004A comes from.
Mechanically, this would be easiest to do as an export format that isn't part of the "best" set. Something like the existing hex-dump conversion, which needs to be chosen by the user rather than selected automatically based on file characteristics. This would be a little awkward to use in the GUI -- you'd need to open the entry in the file viewer, select the format, and click Export to save it -- but easy in the CLI.
Thanks for looking into this!
There's a bit of variation in the files [...] So I'm a little unclear on what exactly needs to be output for any given file [...]
Right, because this is essentially just a script to be fed into the ROM monitor ("wozmon"), which has a fairly flexible command syntax (similar to Apple II, but different and more minimalist). Once you've specified the initial memory location in a deposit command, it auto-increments, so you don't need to specify it in subsequent deposit commands, which may simply begin with a colon. Taking advantage of this and omitting unnecessary addresses would improve performance when loading via the Apple-1 console.
You can find an overview of the command syntax on pages 3-4 of the Apple-1 manual (though keep in mind there are a few minor errors in that manual):
http://apple1.chez.com/Apple1project/Docs/pdf/AppleI_Manual.pdf
or on the SB-projects page I linked above (I'm sure there are other good references, but these are my "go to" ones):
https://www.sbprojects.net/projects/apple1/wozmon.php
I believe the hex values for each byte need to be separated by at least one space, and the command length is limited only by the size of the key input buffer, which I believe is 128 bytes ($0200..$027F), IIRC. I think it is conventional to have only 8 bytes per line; I have seen files with 16 bytes per line, but this would exceed the width of the 40-character-wide display and cause wrapping, so it's probably best to stick to 8 bytes per line.
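To make the auto-increment point concrete, here is a rough Python sketch of the compact form: the address appears only on the first deposit command and every later line starts with a bare colon. `deposit_script` and `widest_line` are hypothetical names used just for this illustration, and the width arithmetic assumes the "AAAA: XX XX ..." layout shown in the .txt files.

```python
def deposit_script(data: bytes, load_addr: int, per_line: int = 8) -> str:
    """Emit deposit commands with the address on the first line only.

    Later lines begin with ":" and rely on the monitor auto-incrementing
    the store address after each byte.
    """
    lines = []
    for off in range(0, len(data), per_line):
        hexbytes = " ".join(f"{b:02X}" for b in data[off:off + per_line])
        prefix = f"{load_addr:04X}:" if off == 0 else ":"
        lines.append(f"{prefix} {hexbytes}")
    return "\n".join(lines)


def widest_line(per_line: int) -> int:
    """Width of a full line that carries an address: "AAAA:" plus " XX" per byte."""
    return 5 + 3 * per_line


if __name__ == "__main__":
    print(deposit_script(bytes(range(10)), 0x0300))
    print(widest_line(8), widest_line(16))
```

With this layout, 8 bytes per line gives 29 characters, which fits the 40-column display, while 16 bytes per line gives 53 and would wrap.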
All of them end with an execution command, either R or, for the BASIC programs, E2B3R and RUN.
Yes, though if you look at files distributed elsewhere on the Internet, this is not always the case. It is mostly a user convenience, except in the case of a machine language program with an entry point not equal to the load address, which I've never found an example of. In my emulator, I've implemented heuristics to automatically append the appropriate run command if it is missing.
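A heuristic along these lines might look like the following Python sketch. It is not the code from my emulator, just an illustration: `ensure_run_command` is a hypothetical helper, and it assumes the caller already knows the load address and whether the file is a BASIC program.

```python
import re

# A bare monitor run command: optional hex address followed by "R".
RUN_CMD = re.compile(r"^[0-9A-F]{0,4}R$")


def ensure_run_command(script: str, load_addr: int, is_basic: bool) -> str:
    """Append a run command if the script does not already end with one."""
    lines = [ln.strip().upper() for ln in script.splitlines() if ln.strip()]
    if lines and (lines[-1] == "RUN" or RUN_CMD.match(lines[-1])):
        return script  # already ends with a run command
    if script and not script.endswith("\n"):
        script += "\n"
    if is_basic:
        # BASIC warm start, then RUN, as in the files discussed above.
        return script + "E2B3R\nRUN\n"
    # Machine language: assume the entry point equals the load address.
    return script + f"{load_addr:04X}R\n"
```

The machine-language branch would be wrong for a program whose entry point differs from its load address, which is the one case where the run command is more than a convenience.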
or where the stuff at 004A comes from.
As you've surmised, these are dumps of BASIC programs already loaded in memory. Apple-1 BASIC apparently stores some state in the zero page starting at $4A, with the program typically starting at $0800, so when people manually dump these it's typical to save $4A..$FF and $0800 to HIMEM.
From https://www.sbprojects.net/projects/apple1/a1basic.php:
Now you must save two address ranges, one from $4A to $FF, and one from $0800 to $0FFF.
This is essentially LOMEM..HIMEM, as detailed here: http://jefftranter.blogspot.com/2012/03/lomem-himem-and-saving-basic-programs.html
However, for extracting the binary images from the disk image referenced above, I have assumed that these details of dumping BASIC programs do not matter.
My assumption has been that the files of type BIN, $F1 and $F8 were all essentially stored as a raw binary memory image, with load address specified in the AuxType field and the file type only distinguishing what sort of run command was needed to start the program (typically R command at load address for machine language programs, or E2B3R [BASIC warm start] and RUN for BASIC programs, as you noted above).
However, perhaps there's more to it and further reverse engineering may be required, at least in the case of BASIC images. There is only one file of type $F8 and I'm not sure how that differs from type $F1. Also, it seems the (presumed) load address + file size exceeds 32KiB (maximum RAM for an Apple-1) in some cases (e.g. BASIC:FACTORIALS), so I don't know whether that is an error, or a subtlety that I don't yet understand.
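The range check I'm describing is just the following arithmetic, sketched in Python. The function name is hypothetical, and the numbers in the usage note are made up for illustration; they are not the actual values for any file on the disk image.

```python
RAM_TOP = 0x8000  # 32 KiB, the RAM limit mentioned above


def fits_in_ram(load_addr: int, size: int, ram_top: int = RAM_TOP) -> bool:
    """True if an image of `size` bytes loaded at `load_addr` ends at or below ram_top."""
    return load_addr + size <= ram_top
```

For example, a hypothetical 0x7800-byte image loaded at $0800 ends exactly at $8000 and fits, while one byte more does not.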
Mechanically, this would be easiest to do as an export format that isn't part of the "best" set. Something like the existing hex-dump conversion, which needs to be chosen by the user rather than selected automatically based on file characteristics. This would be a little awkward to use in the GUI -- you'd need to open the entry in the file viewer, select the format, and click Export to save it -- but easy in the CLI.
That's along the lines of what I was thinking. I'm using the CLI on a Mac, so I don't know how this works in the GUI (Having a Mac GUI would be awesome, BTW ;).
This seems straightforward to implement. However...
All of the $F1 files I've looked at appear to be captured from memory offset $0000. If you look at the hex dumps you'll notice e.g.:
0000f0: 00 01 7c 01 00 00 ff ff 00 00 00 00 00 20 00 ed ··|·········· ·m
That "20 00 ed" at the end also appears at $00fd in some of the BASIC pieces that start at $004a. The file also has a large stretch of uninitialized memory (alternating 00 00 ff ff) up to $1000, so it's clearly a memory dump. However, the stuff at $0200 is ASCII text, not executable code, so the $0200 aux type doesn't seem to indicate a load address or a start address.
If you change the file type of the $F8 file to $FC (BAS), you will get a formatted Applesoft BASIC program... almost. The keywords aren't the same, so it looks a bit screwy:
10 DRAW <<< LEMONADE STAND >>>
15 DRAW
20 DRAW FROM AN ORIGINAL PROGRAM
[...]
620 ROT= "(YOUR MOTHER QUIT GIVING YOU FREE SUGAR)"
However, the stuff at $0200 is ASCII text, not executable code, so the $0200 aux type doesn't seem to indicate a load address or a start address.
Good observation. $0200 is the key input buffer, so ASCII makes sense there, but not sure about the AuxType value's meaning. In most cases, I thought it looked like the LOMEM value for BASIC files. Perhaps LOMEM was set incorrectly?
If you change the file type of the $F8 file to $FC (BAS), you will get a formatted Applesoft BASIC program... almost. The keywords aren't the same, so it looks a bit screwy:
Is it tokenized Integer BASIC? I think the Apple(-1) BASIC tokens are the same, but I forget.
Integer BASIC has a different structure (see https://github.com/fadden/CiderPress2/blob/main/FileConv/Code/BASIC-notes.md). According to https://retrocomputing.stackexchange.com/a/395/56 there was an early version of Applesoft that lacked the hi-res commands, but the ASOFT/APPLESOFT binary on the disk image doesn't have the lo-res commands either. I'm guessing it's an earlier version of Microsoft BASIC, since it has the same file format and appears to call $00B1 to fetch data, but the work on Applesoft didn't start until the Apple II was released (https://www.apple2history.org/history/ah16/#04).
Oh right, I had momentarily forgotten that file was in the ASOFT folder, sorry.
The AppleSoft version for Apple-1 is a modern port called AppleSoft Lite. Source code is available here: https://cowgod.org/replica1/applesoft/. The author’s notes confirm your findings:
Tokenized Applesoft files from an Apple II will not work, nor will Applesoft Lite files work on an Apple II. If you'd like to try and load an Apple II program, use the source listing rather than a SAVEd file. Unsupported tokens will result in syntax errors.
Tokenized Integer files will not work either, but the source code may work to some degree.
Here's what I'm thinking:
While this can be automated, it's sounding less and less like something that should be done by a CP2 file converter. It's more like a python script that takes some arguments about program type and location, and you run it a few times until everything looks right.
I'm also a little concerned about whether people will understand what to do with the output. When I generate a text listing or PNG file, it's pretty obvious. An Apple I monitor hex dump requires some background, and if I start getting questions I won't know the answers.
Makes sense. Along the lines of what you are thinking, for the $F1 file type, I still think the AuxType field represents the LOMEM value. I.e., where to start loading data after skipping the stack page and key input buffer at $0200-$027F. Data between there and LOMEM is presumably unused, so not useful or necessary to copy in.
For the BIN file type I'm certain that the AuxType field represents the load and run address, based on several known programs that I recognize.
So to summarize, I suspect it's this:
[...] AuxType.
[...] $0000 to HIMEM, with LOMEM given in AuxType. Load from $4A..$FF then from LOMEM to EOF.
Maybe I'll try to script something if/when I find the time. Thanks for looking into this.
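As a starting point for such a script, the $F1 case suspected above could be sketched like this in Python. It rests entirely on the guesses in this thread: that the file is a memory image starting at $0000 and that the aux type holds LOMEM. `f1_load_ranges` is a hypothetical name.

```python
def f1_load_ranges(image: bytes, lomem: int):
    """Split a $0000-based memory image into the two ranges worth loading.

    Returns (address, bytes) pairs: the zero-page state at $4A..$FF,
    then everything from LOMEM to the end of the file.
    """
    if lomem < 0x0100 or lomem > len(image):
        raise ValueError("LOMEM lies outside the image")
    return [
        (0x004A, image[0x004A:0x0100]),
        (lomem, image[lomem:]),
    ]
```

Each returned pair could then be formatted as deposit commands starting at its address, followed by E2B3R and RUN.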
It would be great if there were a way to export the files in the CFFA1 disk image using the hex input format of the Apple-1's Woz monitor:
http://retro.hansotten.nl/uploads/apple1/ULTIMATE%20APPLE1%20CFFA%203.3.po
It seems that this disk image contains binary images of the programs, with the load address specified by the Auxtyp field in the catalog listing in the case of file types $F1 and BIN: