loicfrance / mahoyo_tools

extract and reinject files into Witch of the Holy Night (Steam)

.mrg/.nam/.hed archives #2

Open Dimoks opened 3 months ago

Dimoks commented 3 months ago

Can you add support for .mrg/.nam/.hed archives? You can also look here.

loicfrance commented 3 months ago

I can try to add support for these file types, but I will have trouble testing as I don't have the original files to test with. I will see what I can do next week. If anyone wants to implement this, feel free to open a PR.

Dimoks commented 3 months ago

Here are some examples.

theAeon commented 3 months ago

I'm currently experimenting with a slightly modified copy of https://github.com/Cecropis/Mikadzuki (Chinese indexes changed to English, for fairly obvious reasons) and can get it to output the tsukihime-remake-style content json from the script_text.mrg file. At that point the issue becomes one of conversion, as the ctd scripts aren't quite the same.

Dimoks commented 3 months ago

script_text.mrg is an mzp archive.

theAeon commented 3 months ago

I'm aware. For text-only replacement, it seems to be enough to pull the ctd file using an mfa extractor (whether this one or https://github.com/rschlaikjer/mangetsu/), pull the script from the Switch version using script_text_to_content_json from Cecropis/Mikadzuki, use a Python script I just hacked together at https://gist.github.com/theAeon/891339de689578f48a4dc96cf39e5724 to insert the ctd file into the NewText field, and then use Mikadzuki's repack to get the mrg back.

As for the other archives, well, no clue.

Dimoks commented 3 months ago

script_text.mrg contains 5 txt files, each paired with a table holding the start offset of each line. The offset of the last line is duplicated, and the table ends with 4 0xFF bytes. The files in script_text.mrg are ordered as follows: table, txt. A small piece of the deepLuna code:

        for offset in range(max_offset + 1):
            # There are a handful of null strings that mark EOF
            # Skip them: nothing is written to either table for these
            if not offset_to_string.get(offset, ''):
                continue

            # Write the offset of this string to the offset table
            offset_table.write(struct.pack(">I", string_table.tell()))

            # Write the string data to the string table
            string_table.write(
                offset_to_string.get(offset, '').encode('utf-8'))

        # Finalize the offset table by writing the final offset twice,
        # followed by 4 bytes of 0xFF
        offset_table.write(struct.pack(">I", string_table.tell()))
        offset_table.write(struct.pack(">I", string_table.tell()))
        offset_table.write(struct.pack(">I", 0xFFFFFFFF))

I've tweaked deepLuna itself, but I haven't forked it and I'm not sure it's worth doing. I can just upload it as an archive if needed.
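
To make the layout described above easier to check, here is a minimal round-trip sketch. It assumes exactly what the post and snippet state (big-endian 32-bit offsets, the final offset written twice, then a single 0xFFFFFFFF terminator). The names `build_tables` and `parse_tables` are hypothetical and are not part of deepLuna or mahoyo_tools.

```python
import io
import struct

TERMINATOR = 0xFFFFFFFF  # the 4 bytes of 0xFF that end the offset table


def build_tables(lines):
    """Build (offset_table, string_table) bytes from a list of strings.

    Hypothetical helper: mirrors the snippet above, skipping empty strings.
    """
    offset_table = io.BytesIO()
    string_table = io.BytesIO()
    for line in lines:
        if not line:
            continue  # null strings get no offset and no data
        offset_table.write(struct.pack(">I", string_table.tell()))
        string_table.write(line.encode("utf-8"))
    end = string_table.tell()
    # Final offset twice, then the terminator
    offset_table.write(struct.pack(">I", end))
    offset_table.write(struct.pack(">I", end))
    offset_table.write(struct.pack(">I", TERMINATOR))
    return offset_table.getvalue(), string_table.getvalue()


def parse_tables(offset_table, string_table):
    """Recover the strings from the two tables (inverse of build_tables)."""
    offsets = []
    for (value,) in struct.iter_unpack(">I", offset_table):
        if value == TERMINATOR:
            break
        offsets.append(value)
    # Drop the duplicated final offset, then slice between neighbours
    bounds = offsets[:-1]
    return [
        string_table[start:end].decode("utf-8")
        for start, end in zip(bounds, bounds[1:])
    ]


offsets, strings = build_tables(["Hello\n", "", "世界\n"])
print(offsets.hex(" ", 4))
print(parse_tables(offsets, strings))
```

The parser only relies on the terminator and the duplicated last offset, so it should be checked against a real table before being trusted on game data.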

Dimoks commented 3 months ago

I made a fork after all.

Dimoks commented 2 months ago

I didn’t notice that one important class was broken when moving changes from the test folder to the repository folder... I fixed it.

loicfrance commented 2 months ago

Thanks a lot for following up on this issue. I haven't been able to work on this for the past few weeks, but I should be able to get to it this week. If I understand correctly, I can look at mangetsu and/or your fork of deepLuna to learn how .mrg/.nam/.hed files work and implement the unpack/repack algorithms. Please tell me if I got anything wrong. Otherwise, I'll start working on it soon.

Dimoks commented 2 months ago

deepLuna works only with mzp archives. So you need to look only at the tools mentioned in the first post.

Dimoks commented 2 months ago

readHED is missing a file.seek(0). In readNAM, entries starting with 0x00 need to be dropped entirely. For example, like this:

        for _ in range(0, nb_entries) :
            buffer = file.read(NAM_ENTRY_LEN)
            if buffer[0] == 0x00:
                break
            name = buffer.decode('utf-8').rstrip('\r\n\x00')
            fileNames.append(name)
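
A self-contained variant of the loop above, for trying the idea on an in-memory buffer. `read_nam` and its `entry_len` parameter are hypothetical; the real entry length (NAM_ENTRY_LEN) is whatever the library defines. This sketch cuts each name at its first NUL byte instead of stripping trailing NULs, which is a swapped-in choice and not what the snippet above does.

```python
import io


def read_nam(file, nb_entries, entry_len):
    """Read up to nb_entries fixed-size names, stopping at the first empty one.

    Hypothetical helper; entry_len stands in for the library's NAM_ENTRY_LEN.
    """
    names = []
    for _ in range(nb_entries):
        buffer = file.read(entry_len)
        # A short read or an entry starting with 0x00 marks the end of the names
        if len(buffer) < entry_len or buffer[0] == 0x00:
            break
        # Keep only the bytes before the first NUL, then drop CR/LF padding
        name = buffer.split(b"\x00", 1)[0].decode("utf-8").rstrip("\r\n")
        names.append(name)
    return names


# Four 16-byte entries: two names, one all-zero entry, one name that is ignored
data = (
    b"a.png\r\n".ljust(16, b"\x00")
    + b"b.png".ljust(16, b"\x00")
    + bytes(16)
    + b"c.png".ljust(16, b"\x00")
)
print(read_nam(io.BytesIO(data), 4, 16))
```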

Dimoks commented 2 months ago

Can you add the creation of new mrg and mzp archives from specified files?

loicfrance commented 2 months ago

> readHED is missing a file.seek(0). In readNAM, entries starting with 0x00 need to be dropped entirely.

Thanks for the feedback. I haven't tested it yet, it might still be full of bugs.

> Can you add the creation of new mrg and mzp archives from specified files?

Seems feasible. I didn't implement it at first (when working on the PC release), as I did not see any use case for it. Is there really a need to add new assets and not just replace existing ones?

Dimoks commented 2 months ago

Sometimes it is necessary. It would be nice to pick up NXX from mangetsu as well. All these formats are used in the switch version. And mangetsu is built for Linux...

Dimoks commented 2 months ago

When inserting files, the mrg is reset to zero bytes as soon as the archive is opened like this: with MrgArchive(archive, 'w') as arc:
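
For context, this is the behaviour you get from Python's built-in open() when a file is opened in a 'w' mode. Whether MrgArchive passes its mode straight through to open() is an assumption here, not something confirmed in this thread. The sketch below only shows the standard-library behaviour, with a hypothetical `size_after_open` helper.

```python
import os
import tempfile


def size_after_open(path, mode):
    """Open and immediately close path in the given mode, return its size."""
    with open(path, mode):
        pass
    return os.path.getsize(path)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.mrg")
    with open(path, "wb") as f:
        f.write(b"\x00" * 2048)  # stand-in archive contents

    size_rplus = size_after_open(path, "r+b")  # read/write, keeps contents
    size_w = size_after_open(path, "wb")       # write, truncates on open

print(size_rplus, size_w)
```

If the archive does use open() this way, an 'r+b'-style mode would keep the existing data in place until the rebuilt file is written.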

loicfrance commented 2 months ago

I implemented the first stage of nxx decompression (gzip / deflate), and added support for it in the extract method of MrgEntry. I still have to implement BNTX decoding.
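
As a rough illustration of that first stage, here is a sketch that inflates a payload with the standard library only. It deliberately leaves out the nxx header, whose layout is not described in this thread, and assumes the caller has already sliced out the compressed bytes. `inflate_payload` is a hypothetical name, and trying zlib-wrapped before raw deflate is a guess.

```python
import gzip
import zlib


def inflate_payload(payload):
    """Decompress a gzip, zlib-wrapped or raw deflate payload.

    Hypothetical helper: the nxx header must already have been removed.
    """
    if payload[:2] == b"\x1f\x8b":  # gzip magic bytes
        return gzip.decompress(payload)
    try:
        return zlib.decompress(payload)  # zlib-wrapped deflate
    except zlib.error:
        return zlib.decompress(payload, wbits=-15)  # raw deflate stream


sample = b"mahoyo" * 100
raw = zlib.compressobj(wbits=-15)
raw_deflate = raw.compress(sample) + raw.flush()

for blob in (gzip.compress(sample), zlib.compress(sample), raw_deflate):
    assert inflate_payload(blob) == sample
```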

> When inserting files, the mrg is reset to zero.

Sorry about that. I haven't implemented the reconstruction of the .mrg file yet, so please avoid this for now. I should be able to fix this soon, I'll keep you updated.

Dimoks commented 2 months ago

nxx compression is also needed. There is a Python utility for bntx (my fork).

Dimoks commented 1 month ago

The mrg_info, mrg_pack and mrg_replace commands from mangetsu are very much needed.

Dimoks commented 1 month ago

I was able to build mrg_info from the attributes already available in the library.

def mrg_info(archive):
    # archive is the argument list: ["csv", path] for CSV output, [path] otherwise.
    # Assumes MrgArchive has already been imported from mahoyo_tools.
    if archive[0] == "csv":
        with MrgArchive(archive[1]) as arc:
            for i, entry in enumerate(arc):
                # Single quotes inside the f-string keep this valid before Python 3.12
                print(f"{i},{entry._offset_sectors:#010x},{entry._size_comp_sectors:#010x},"
                    f"{entry._size_decomp_sectors:#010x},"
                    f"{entry._name if entry._name is not None else ''}")
    else:
        with MrgArchive(archive[0]) as arc:
            for i, entry in enumerate(arc):
                # Only append the name when the archive has a .nam entry for it
                name = f", Name: {entry._name}" if entry._name is not None else ""
                print(f"Entry {i:>8}: Offset {entry._offset_sectors:#010x}, "
                    f"Size {entry._size_comp_sectors:#010x} sectors, "
                    f"Uncompressed size {entry._size_decomp_sectors:#010x} sectors"
                    f"{name}")