Open david-schmidt opened 2 years ago
Hi, I would like to look at this file format. If the data are similar to Brother's WP-1 format (I made a converter for it), then maybe we could try to interpret and convert them...
Cool, I'll see if I can get the spreadsheet program rolling on a Brother locally (the data I have isn't mine to share), so hopefully we can take a deeper look.
I could probably rip this open on short notice if you skip me some files to dissect.
ExampleSD2Files.zip Oh man, I forgot to update here... I got permission to share files, but didn't upload them. Here they are for anyone that wants to take a look!
Do you have a photo of the spreadsheet on the screen? It would help in understanding the data.
It seems the upper block holds the definitions of the table, and the data for the cells come at the end.
Alas, no - I have no connection to the real thing. That would make it too easy - to have the box top to the puzzle!
I'll see what I can do.
this isn't that hard. the first 2600 bytes are a fixed length structure. the rest is cells and cell meta. footering all cell structures is 4 bytes that look suspiciously like a checksum.
update. 1: fairly sure the first 4 bytes after 2600 are a checksum for the preceding section, which by the looks of it consists of basically only memory references. do we know the brother's memory map? 2: not sure what the following four bytes are for. still fumbling over "newlines" and so on. leading into 3: it looks like this is all laid out left to right, top to bottom. 4: the 32 bytes following #2 look to be cell field meta. 5: cells seem to be made up of meta information (that can contain flags and some memory reference stuff) preceding the actual text in cells. 6: no two "memory references" are identical, as far as I have seen.
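To make the layout guessed at above easier to poke at, here is a minimal sketch that slices a file into the regions described so far. The region names, the `Sd2Regions` type and `split_regions` are made up for illustration, and the roles of the checksum and field-meta regions are still unconfirmed guesses from this thread. Note that 2600 + 4 + 4 + 32 = 2640 = 0x0a50.

```python
from typing import NamedTuple

# Sizes as reported in this thread; the interpretation of each region
# is a working assumption, not a confirmed part of the format.
HEADER_LEN = 2600      # fixed-length block at the start of the file
CHECKSUM_LEN = 4       # suspected checksum of the header (unconfirmed)
UNKNOWN_LEN = 4        # purpose not known yet
FIELD_META_LEN = 32    # suspected cell field meta (unconfirmed)


class Sd2Regions(NamedTuple):
    """Hypothetical container for the raw bytes of each region."""
    header: bytes
    checksum: bytes
    unknown: bytes
    field_meta: bytes
    cells: bytes


def split_regions(data: bytes) -> Sd2Regions:
    """Slice raw file contents into the regions guessed at so far."""
    sizes = [HEADER_LEN, CHECKSUM_LEN, UNKNOWN_LEN, FIELD_META_LEN]
    if len(data) < sum(sizes):
        raise ValueError("file is shorter than the assumed fixed regions")
    parts = []
    pos = 0
    for size in sizes:
        parts.append(data[pos:pos + size])
        pos += size
    # Everything after the fixed regions is treated as cell data.
    return Sd2Regions(*parts, data[pos:])
```

Usage would be something like `split_regions(open(path, "rb").read())`, after which each region can be hex-dumped and compared across files on its own.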
Cells have a 1-byte length prefix followed by data, and the length counts the length byte itself... so you can traverse the structure by following the lengths, starting at offset 0x0a50 into a file.
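As a rough sketch of that traversal (not anyone's actual tool): `walk_records` and `CELL_DATA_OFFSET` are hypothetical names, and stopping on a zero length byte or on a record that runs past the end of the data is my own assumption, since we don't yet know how the file marks its end.

```python
from typing import Iterator, Tuple

CELL_DATA_OFFSET = 0x0A50  # start of the length-prefixed records, per this thread


def walk_records(data: bytes, start: int = CELL_DATA_OFFSET) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, payload) for each length-prefixed record.

    The length byte counts itself, so a record of length n carries
    n - 1 payload bytes and the next record starts at offset + n.
    """
    pos = start
    while pos < len(data):
        length = data[pos]
        if length == 0:
            # Assumption: a zero length cannot advance, so treat it as
            # padding or end of data and stop.
            break
        end = pos + length
        if end > len(data):
            # Assumption: a record overrunning the file is truncated or trash.
            break
        yield pos, data[pos + 1:end]
        pos = end
```

If the lengths are read correctly, the last record should end exactly at the end of the useful data, which is a handy consistency check for a file.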
SD2Dumped.zip Given what we know so far, I wrote a dumb dumper routine to help visualize the contents of a file. So we get some lines like this:
(0x07) 0x00 0x00 0x00 0x00 0x04 0x00
(0x07) 0x00 0x00 0x00 0x00 0x04 0x00
(0x08) 0x52 0x45 0x41 0x53 0x4f 0x4e 0x00 [REASON]
The leading byte in parentheses is the length of that run, followed by the hex values that are in it. If it looks like text (i.e. the first byte isn't zero), the text is dumped in brackets. I found some problems in the earlier file I sent, so I'm adding some files in the zip that are all internally consistent, with vectors that all add up to the end of the file.
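For anyone who wants to reproduce lines in that shape without the zip, a minimal sketch of the formatting step follows. This is not the dumper from SD2Dumped.zip; `format_record` is a hypothetical name, and cutting the text at the first NUL and decoding it as ASCII are assumptions, since the machine's character set isn't established here.

```python
def format_record(record: bytes) -> str:
    """Format one record (length byte included) as a dump line."""
    length, payload = record[0], record[1:]
    line = f"(0x{length:02x}) " + " ".join(f"0x{b:02x}" for b in payload)
    # Heuristic from the thread: a non-zero first byte suggests text.
    if payload and payload[0] != 0:
        # Assumptions: text ends at the first NUL and is ASCII-compatible.
        text = payload.split(b"\x00", 1)[0].decode("ascii", errors="replace")
        line += f" [{text}]"
    return line


print(format_record(bytes([0x08]) + b"REASON\x00"))
# (0x08) 0x52 0x45 0x41 0x53 0x4f 0x4e 0x00 [REASON]
```

Non-ASCII bytes show up as replacement characters rather than raising, which keeps the dump going on records that only look like text.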
Is it possible to get the whole image file?
Because it looks to me like the data at the end of the file is trash. Or cut off.
From WP-1 files I know there are "end of data/document" flag bytes. Any bytes after them are just trash that fills up the block, and they get overwritten each time the document is saved to disk. Maybe there are similar bytes here.
So we should find the "end of document" flag, or maybe the file length is recorded in the directory area.
The table positions/tab stops may be word-type values in twips, as in WP-1 too.
I've got a Brother 240k disk here that has spreadsheet files on it; they all seem to have the ".SD2" file suffix. Do you have any leads on interpreting spreadsheet files? Eyeballing it, by the time you're looking at user data it looks like a vector sort of format: a 1-byte length followed by n-1 bytes of data. Anyway, before digging too much deeper, have you seen any work on these yet?