Open yamisama opened 7 years ago
This is some really cool info, thanks!
Unfortunately, it's been two years since I last touched this project, and I don't have much free time to work on it again. If you or someone else wants to submit a PR, I'd be more than happy to merge it.
Sounds great, thank you. I have to confess that I don't have much experience with Git/GitHub, so it may take me a while to get to the point where I don't cause more chaos than I clear up, but I'll definitely give it a go!
Thanks for your fantastic work!
A few more minor things I found:
The header field you called "unknown" seems to be the file type. The value 'Prsn' is apparently a "Persona Document"; 'BrAr' can, curiously, be either a brush archive ("Brush Archive"?) or a macro archive; 'Swth' is swatches; and assets (introduced in version 1.5) are 'AsAr' ("Asset Archive"?). Anything other than 'Prsn' currently triggers the assert in the Python code, though.
Documents are always named "doc.dat" internally, brushes are "brushes.dat", macros are "Macros.dat", swatches are "Swatches.dat", and assets are "Assets.dat". Asset files apparently contain lots of those little entries with slashes in their names ("d/a3" and the like).
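To make the type tags and internal names above a bit more concrete, here is a rough sketch of a lookup table that could stand in for the hard assert. The names `FILE_TYPES` and `describe_file_type` are made up for illustration and are not part of the existing Python code, and the table only contains what I have observed so far. It also assumes the tag has already been decoded into the four-character form used in this thread.

```python
# Hypothetical lookup table, built only from the observations in this thread.
# Keys are header type tags; values are (description, expected internal names).
FILE_TYPES = {
    "Prsn": ("Persona Document", ("doc.dat",)),
    "BrAr": ("Brush or Macro archive", ("brushes.dat", "Macros.dat")),
    "Swth": ("Swatches", ("Swatches.dat",)),
    "AsAr": ("Asset archive", ("Assets.dat",)),
}


def describe_file_type(tag):
    """Return (description, expected internal names) for a header type tag.

    Unknown tags are reported instead of asserted on, so a parser could
    keep going on files that are not documents.
    """
    return FILE_TYPES.get(tag, ("unknown type %r" % (tag,), ()))
```

Since 'BrAr' covers both brushes and macros, the internal file name would be needed to tell the two apart.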
Also, in recent versions, the FAT seems to have a signature of '#FT2' instead of '#FAT'. I'm not sure whether the binary structure has changed as well, but it seems likely.
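A parser could at least recognise both signatures and record which one it saw, without assuming anything about the layout behind '#FT2'. This is only a sketch: `KNOWN_FAT_SIGNATURES` and `fat_version` are hypothetical names, the version numbers 1 and 2 are my own labels, and I am assuming the four signature bytes are passed in the reading order used in this thread.

```python
# Hypothetical mapping from FAT signature to a label for the layout variant.
# The numbers are just labels; nothing here decodes the FAT itself.
KNOWN_FAT_SIGNATURES = {b"#FAT": 1, b"#FT2": 2}


def fat_version(signature):
    """Map a 4-byte FAT signature to a layout label, or raise ValueError."""
    try:
        return KNOWN_FAT_SIGNATURES[bytes(signature)]
    except KeyError:
        raise ValueError("unrecognised FAT signature: %r" % (signature,))
```

Raising on anything else keeps unknown variants visible instead of silently parsing them with the wrong layout.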
Swatches seem to be stored uncompressed, so they might be a good candidate for finding out more about the container format, since the scope of the data is much more limited. Curiously, the '#Fil' section for swatches seems to start with something that resembles the main file signature ("00 FF 4B 53" for swatches, "00 FF 4B 41" for the container file).
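Comparing the two four-byte sequences directly shows how close they are. The snippet below only works on the hex values quoted above; `common_prefix_len` is a small helper written for this comparison.

```python
# The two signatures as quoted in this thread.
container_sig = bytes.fromhex("00 FF 4B 41")
swatches_sig = bytes.fromhex("00 FF 4B 53")


def common_prefix_len(a, b):
    """Count how many leading bytes two byte strings share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


shared = common_prefix_len(container_sig, swatches_sig)  # 3
last_bytes = (container_sig[-1:], swatches_sig[-1:])     # (b"A", b"S")
```

So the first three bytes are identical and only the last one differs (ASCII 'A' for the container, 'S' for swatches). Whether those letters mean anything is a guess on my part.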
For swatches, the header field the current documentation calls "zlib_length" just seems to be the length of the data in the '#Fil' section of the file, whether it is compressed or not. However, this length excludes the '#Fil' marker and the FFFFFFFF marker at the end of the '#Fil' section. If I see this correctly, it would imply that there could only ever be one '#Fil' section, contrary to what the current documentation of the last header field suggests. However, I have definitely seen multiple '#Fil' sections in some documents. So could this be the cumulative size of all data files in the container before decompression, with the individual file sizes then determined by the FAT?
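The cumulative-size idea would be easy to test against real files once the '#Fil' sections have been cut out. Here is a minimal sketch of that check. It assumes each section is handed over as raw bytes that start with the '#Fil' marker and end with FFFFFFFF, with the markers in the reading order used in this thread and nothing else between the marker and the data. Both function names are hypothetical.

```python
FIL_MARKER = b"#Fil"
END_MARKER = b"\xff\xff\xff\xff"


def payload_length(section):
    """Length of one '#Fil' section without its leading and trailing markers."""
    if not section.startswith(FIL_MARKER) or not section.endswith(END_MARKER):
        raise ValueError("section is not framed by '#Fil' ... FFFFFFFF")
    return len(section) - len(FIL_MARKER) - len(END_MARKER)


def matches_header_length(header_length, sections):
    """Check whether the header field equals the summed payload lengths."""
    return header_length == sum(payload_length(s) for s in sections)
```

For a swatches file with a single section this reduces to the observation above; for documents with several sections it would confirm or rule out the cumulative reading.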
Moreover, saving a document incrementally can apparently produce a different file than doing a "Save as". The file format seems to be designed for fast incremental loads and saves, and "Save as" might be consolidating the data somehow. The files with slashes in their names might be related to that.