davidgiven / fluxengine

PSOC5 floppy disk imaging interface
MIT License
Brother 240k spreadsheet files? #342

Open david-schmidt opened 2 years ago

david-schmidt commented 2 years ago

I've got a Brother 240k disk here that has spreadsheet files on it; they all seem to have a ".SD2" file suffix. Do you have any leads on interpreting spreadsheet files? Eyeballing it, by the time you're looking at user data it looks like a vector sort of format: a 1-byte length n followed by n-1 bytes of data. Anyway, before digging too much deeper, have you seen any work on these yet?

mathe1 commented 2 years ago

Hi, I would like to look at this file format. If the data is similar to Brother's WP1 format (I made a converter for it), then we could try to interpret and convert it.

david-schmidt commented 2 years ago

Cool, I'll see if I can get the spreadsheet program rolling on a Brother locally (the data I have isn't mine to share), so hopefully we can take a deeper look.

CheetahPixie commented 2 years ago

I could probably rip this open on short notice if you skip me some files to dissect.

david-schmidt commented 1 year ago

ExampleSD2Files.zip Oh man, I forgot to update here... I got permission to share files, but didn't upload them. Here they are for anyone that wants to take a look!

mathe1 commented 1 year ago

Do you have a photo of the spreadsheet on screen? It would help to understand the data.

It seems the upper block holds the definitions of the table, and the data for the cells is at the end.

david-schmidt commented 1 year ago

Alas, no - I have no connection to the real thing. That would make it too easy - to have the box top to the puzzle!

CheetahPixie commented 1 year ago

I'll see what I can do.

CheetahPixie commented 1 year ago

this isn't that hard. the first 2600 bytes are a fixed length structure. the rest is cells and cell meta. footering all cell structures is 4 bytes that look suspiciously like a checksum.

CheetahPixie commented 1 year ago

update. 1: fairly sure the first 4 bytes after 2600 are a checksum for the preceding section, which seems to consist of basically only memory references by the looks. do we know the brother's memory map? 2: not sure what the following four bytes are for. still fumbling over "newlines" and such. leading into 3: it looks like this is all laid out left to right, top to bottom. 4: the 32 bytes following #2 look to be cell field meta. 5: cells seem to be made up of meta information (that can contain flags and some memory reference stuff) preceding the actual text in cells. 6: no two "memory references" are identical, as far as I have seen.

david-schmidt commented 1 year ago

Cells have a 1-byte length prefix followed by data, and the length counts the length byte itself... so you can traverse the structure by following the lengths, starting at offset 0x0a50 into a file.
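For anyone following along, here is a minimal Python sketch of that traversal. It assumes only what has been said in this thread: records start at 0x0a50, and each begins with a length byte that counts itself. That offset is 2640 decimal, which lines up with the 2600 + 4 + 4 + 32 bytes described earlier. The name `iter_records` and the stop-on-zero rule are made up for illustration and are not confirmed behaviour of the format.

```python
CELL_DATA_OFFSET = 0x0A50  # 2640; start of the length-prefixed records per this thread


def iter_records(data: bytes, start: int = CELL_DATA_OFFSET):
    """Yield (offset, payload) for each length-prefixed record.

    The length byte is assumed to count itself, so a record with
    length n has n - 1 payload bytes after the length byte.
    """
    pos = start
    while pos < len(data):
        length = data[pos]
        if length == 0:
            # Assumption: a zero length can never advance, so stop here.
            break
        end = pos + length
        if end > len(data):
            # The record runs past the end of the file (truncated or trash).
            break
        yield pos, data[pos + 1:end]
        pos = end
```

If the lengths are consistent, the last record ends exactly at the end of the file; stopping short of that is a hint that the tail is padding or the file is cut.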

david-schmidt commented 1 year ago

SD2Dumped.zip Given what we know so far, I wrote a dumb dumper routine to help visualize the contents of a file. So we get some lines like this:

(0x07) 0x00 0x00 0x00 0x00 0x04 0x00
(0x07) 0x00 0x00 0x00 0x00 0x04 0x00
(0x08) 0x52 0x45 0x41 0x53 0x4f 0x4e 0x00 [REASON]

The leading byte in parentheses is the length of that run, followed by the hex values that are in it. If it looks like text (i.e. the first byte isn't zero), the text is dumped in brackets. I found some problems in the earlier file I sent, so the zip file has some new ones that are all internally consistent: the vectors all add up to the end of the file.
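A rough Python sketch of how a line in that style could be produced, for anyone who wants to reproduce the dump. This is not the routine from the zip, just a reconstruction from the sample lines above; `format_record` is a hypothetical name, and cutting the text at the first zero byte is an assumption based on the REASON example.

```python
def format_record(record: bytes) -> str:
    """Format one record (length byte included) in the dump style above."""
    length, payload = record[0], record[1:]
    parts = [f"(0x{length:02x})"] + [f"0x{b:02x}" for b in payload]
    line = " ".join(parts)
    if payload and payload[0] != 0:
        # Looks like text: show everything up to the first zero byte.
        text = payload.split(b"\x00", 1)[0].decode("ascii", errors="replace")
        line += f" [{text}]"
    return line


print(format_record(bytes([0x08]) + b"REASON\x00"))
```

Feeding it each record in turn, length byte included, gives one line per record.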

mathe1 commented 1 year ago

Is it possible to get the whole image file?

Because to me it looks like the data at the end of the file is trash, or cut off.

From WP-1 files I know there are "end of data/document" flag bytes. Any bytes after that are only trash filling up the block, and they get overwritten each time the document is saved to disk. Maybe there are similar bytes here.

So we should find the "end of document" flag, or maybe the file length is noted in the directory area.

The table positions/tab stops may be word-sized values in twips, like in WP-1 too.