rejetto / hfs2

web based file server
https://rejetto.com/hfs
GNU General Public License v3.0

Could you add a wiki page on how the structure of the .vfs file works? #27

Open NewUserHa opened 2 years ago

NewUserHa commented 2 years ago

I wanted to update the .vfs file from Python and other languages for convenience.

I tried the code below with Python:

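import zlib
# a: raw bytes of the .vfs file, read beforehand in binary mode
for offset in range(900):
    try:
        # try to inflate a zlib stream starting at each offset
        print(zlib.decompressobj().decompress(a[offset:]))
    except zlib.error:
        pass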

But it failed to parse the compressed .vfs file. It seems that the data after the header part of the .vfs file is not compressed with zlib.

Would it be possible to add a wiki page on how the structure of the .vfs file works?

rejetto commented 2 years ago

The file format is unfortunately not documented, and I don't have the time to do it the right way at the moment, but I'll give you a rough description here. It's a TLV, type-length-value: blocks of information are packed with a 4-byte value (dword) identifying the nature of the block, another dword for the length of the payload, and then the payload. After a block you can have another block (sequence). The payload of a block can itself contain a sequence of blocks (nested).
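The description above can be sketched as a small reader. This is only an illustration, not code from HFS: `iter_blocks` and `pack_block` are hypothetical names, both dwords are assumed to be unsigned little-endian, and any file header that precedes the first block is not handled.

```python
import struct

def iter_blocks(data):
    """Yield (block_type, payload) for each block in a flat TLV sequence."""
    pos = 0
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated block header")
        # Assumption: type and length are unsigned little-endian dwords.
        block_type, length = struct.unpack_from("<II", data, pos)
        pos += 8
        if pos + length > len(data):
            raise ValueError("payload runs past the end of the data")
        yield block_type, data[pos:pos + length]
        pos += length  # the next block starts right after this payload

def pack_block(block_type, payload):
    """Build one block: type dword, length dword, then the payload bytes."""
    return struct.pack("<II", block_type, len(payload)) + payload
```

For example, `list(iter_blocks(pack_block(1, b"abc") + pack_block(2, b"")))` gives back the two blocks in order.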

In main.pas you'll find constants for the "type" part, all starting with FK_. The zlib one you were looking for is named FK_COMPRESSED_ZLIB (24), but you may want to take a look at the others too.
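A rough sketch of how the zlib block might be picked out of a sequence. Only the constant name and its value 24 come from this thread; `inflate_zlib_blocks` is a hypothetical helper, the dwords are assumed little-endian, and the payload is assumed to be a plain zlib stream.

```python
import struct
import zlib

FK_COMPRESSED_ZLIB = 24  # value quoted from main.pas in this thread

def inflate_zlib_blocks(data):
    """Return the inflated payload of every top-level FK_COMPRESSED_ZLIB block."""
    results = []
    pos = 0
    while pos + 8 <= len(data):
        # Assumption: type and length are unsigned little-endian dwords.
        block_type, length = struct.unpack_from("<II", data, pos)
        payload = data[pos + 8:pos + 8 + length]
        if block_type == FK_COMPRESSED_ZLIB:
            # Assumption: the payload is a plain zlib stream.
            results.append(zlib.decompress(payload))
        pos += 8 + length
    return results
```

If the inflated bytes are themselves a sequence of blocks, the same walk could be applied to them again.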

NewUserHa commented 2 years ago

How can a block's payload contain another nested block? If the length dword of the parent block includes the length of the child block, how do you make sure the parent's own data doesn't conflict with the type part (say 0-255)?

rejetto commented 2 years ago

I don't see any problem with nesting, and I didn't run into any while implementing it. The parent block knows nothing about its content (the inner blocks), so its length is the whole payload, including all children.

NewUserHa commented 2 years ago

Say payload A is 24 04 01 24 03 04 and payload B is 24 03 01 02. The nested result should be: 24 08 01 24 03 (24 03 01 02) 04. Might the parser mistake the first 24 for a header and then produce an error?

rejetto commented 2 years ago

your example scenario makes no sense: your block cannot contain both an arbitrary payload AND another block. One or the other. Block A will contain Block B (and maybe also block C just after), or arbitrary bytes.

rejetto commented 2 years ago

Consider it a tree, where arbitrary bytes are the leaves. https://www.google.com/search?q=tlv+file+format
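The tree view can be sketched like this. Which types hold child blocks and which hold arbitrary bytes is not stated in this thread, so `CONTAINER_TYPES` below is a made-up placeholder, `parse_tree` is a hypothetical name, and little-endian dwords are an assumption.

```python
import struct

# Hypothetical: the real set would have to come from the FK_ constants in main.pas.
CONTAINER_TYPES = {100}

def parse_tree(data, containers=CONTAINER_TYPES):
    """Parse a block sequence into a list of (type, children-or-bytes) nodes."""
    nodes = []
    pos = 0
    while pos < len(data):
        # Assumption: type and length are unsigned little-endian dwords.
        block_type, length = struct.unpack_from("<II", data, pos)
        payload = data[pos + 8:pos + 8 + length]
        if len(payload) != length:
            raise ValueError("payload runs past the end of the data")
        if block_type in containers:
            # Inner node: the payload is itself a sequence of blocks.
            nodes.append((block_type, parse_tree(payload, containers)))
        else:
            # Leaf: arbitrary bytes, never inspected for headers.
            nodes.append((block_type, payload))
        pos += 8 + length
    return nodes
```

A leaf's bytes are returned untouched, so a payload that happens to start with 24 is never read as a header.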

NewUserHa commented 2 years ago

I mean, don't the nested blocks need a delimiter in the byte stream to be safe?

rejetto commented 2 years ago

nope, this doesn't work with delimiters, you declare length in advance. I didn't invent this, you'll find plenty of information googling.
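A minimal writer sketch shows why no delimiter is needed: each length is computed from the finished payload before the header is written. `encode` is a hypothetical helper, and little-endian dwords are an assumption.

```python
import struct

def encode(block_type, content):
    """Encode one block; content is bytes (leaf) or a list of (type, content) children."""
    if isinstance(content, bytes):
        payload = content
    else:
        # Children are encoded first, so the parent's length covers all of them.
        payload = b"".join(encode(t, c) for t, c in content)
    # Assumption: type and length are unsigned little-endian dwords.
    return struct.pack("<II", block_type, len(payload)) + payload

# A parent (type 7) holding two leaves; the first leaf's data begins with byte 24.
blob = encode(7, [(1, b"\x18\x00\x00\x00"), (2, b"ab")])
```

A reader takes the 8-byte header, then skips or descends by exactly the declared length, so the bytes inside a leaf are never scanned for a type value.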

NewUserHa commented 2 years ago

ok.