inwc3 / JMPQ3

Native Java mpq archive library
Apache License 2.0

Allow building mpqs without a listfile #17

Closed Frotty closed 3 years ago

Frotty commented 7 years ago

This should be possible if all blocks can be identified by the provided listfiles or similar. For now I added a log message that tells the user that the DefaultListFile was used and that rebuilding isn't supported.

DrSuperGood commented 7 years ago

Technically an MPQ without a list file is kind of malformed, as the list file is required for full file system support, similar to the attribute file.

As long as offset-based encryption is not used, one can freely move file sector data around. If all blocks referenced by the hash table fulfil this requirement, the MPQ could be rebuilt. This would require a special program path, as it would need to move the stored file sectors directly rather than the file data.
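A minimal sketch of that check, assuming the block flags have already been read from the block table. The two flag values are the ones published in StormLib's headers; the class and method names are hypothetical and are not part of JMPQ3.

```java
/** Sketch: can a block's stored sectors be copied to a new offset without decrypting them? */
final class BlockRelocation {
    // Flag values as published in StormLib's headers.
    static final int MPQ_FILE_ENCRYPTED = 0x00010000; // sectors are encrypted
    static final int MPQ_FILE_FIX_KEY   = 0x00020000; // key is adjusted by the block offset

    /** True when the decryption key depends on where the block sits in the archive. */
    static boolean usesOffsetBasedKey(int flags) {
        return (flags & MPQ_FILE_ENCRYPTED) != 0 && (flags & MPQ_FILE_FIX_KEY) != 0;
    }

    /** Raw sectors can be moved as-is only if the key does not depend on the offset. */
    static boolean canMoveRawSectors(int flags) {
        return !usesOffsetBasedKey(flags);
    }

    /** A name-less rebuild is only safe if every referenced block can be moved raw. */
    static boolean canRebuildWithoutNames(int[] blockFlags) {
        for (int flags : blockFlags) {
            if (!canMoveRawSectors(flags)) {
                return false;
            }
        }
        return true;
    }
}
```

A block that fails this check would have to be decrypted and re-encrypted for its new offset, which in turn needs the file name that the key is derived from.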

Additionally, a complete list file is required to enlarge the hash table. This is only needed if a significant number of files are being added, such that the hash table suffers from high utilization or even runs out of bucket space. One can technically expand the hash table without one by duplicating entries; however, this is not efficient.
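The duplication trick can be illustrated with a simplified model. This is a sketch under stated assumptions, not JMPQ3 code: the table is an open-addressing array whose size is a power of two, the start slot is `indexHash & (size - 1)`, each slot holds one `long` standing in for the stored name-hash pair, and deleted-entry markers are ignored. All names are hypothetical.

```java
import java.util.Arrays;

/** Sketch: doubling a power-of-two hash table without knowing any file names. */
final class HashTableDoubling {
    static final long EMPTY = -1L; // marks a free slot; name keys must differ from it

    static long[] newTable(int size) {
        long[] table = new long[size];
        Arrays.fill(table, EMPTY);
        return table;
    }

    /** Linear probing insert, starting at indexHash masked to the table size. */
    static void insert(long[] table, int indexHash, long nameKey) {
        int mask = table.length - 1;
        int i = indexHash & mask;
        for (int n = 0; n < table.length; n++) {
            if (table[i] == EMPTY) {
                table[i] = nameKey;
                return;
            }
            i = (i + 1) & mask;
        }
        throw new IllegalStateException("hash table is full");
    }

    /** Returns the slot holding nameKey, or -1 if an empty slot is hit first. */
    static int lookup(long[] table, int indexHash, long nameKey) {
        int mask = table.length - 1;
        int i = indexHash & mask;
        for (int n = 0; n < table.length; n++) {
            if (table[i] == EMPTY) return -1;
            if (table[i] == nameKey) return i;
            i = (i + 1) & mask;
        }
        return -1;
    }

    /** Copies the whole table into both halves of a table twice the size. */
    static long[] expandByDuplication(long[] table) {
        long[] doubled = Arrays.copyOf(table, table.length * 2);
        System.arraycopy(table, 0, doubled, table.length, table.length);
        return doubled;
    }
}
```

Without the file name, the extra bit of the index hash is unknown, so each entry has to appear in both halves. Lookups keep working, but every old entry now occupies two slots, so utilization stays the same. That matches the remark that this approach is not efficient.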

The current problem is that modifying existing archives is not really supported. There are times when one might want to make a change without rebuilding the entire archive. One can do this to archives with incomplete or missing list files, as you are only adding to them and not touching existing file data. Rebuilding an archive should only be done when archive size is important, e.g. for a release-ready wc3 map file.

A good fallback for now might be new methods that rebuild an archive using all files known to exist. Any files not in the detected list file are then lost. Methods could exist to detect file paths and add them to the detected list file, in effect allowing third-party list files to be used with automatic removal of non-existent files. Testing whether a file exists is done by checking whether its path maps to a block in the hash table.
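The filtering step could look roughly like the sketch below. The existence test is passed in as a `Predicate<String>` because the exact lookup call is an assumption here; in practice it would be whatever method reports that a path resolves to a block in the hash table. `ListfileFilter` and `keepExisting` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Predicate;

/** Sketch: reduce a third-party list file to the paths that actually exist in an archive. */
final class ListfileFilter {
    static List<String> keepExisting(Iterable<String> candidatePaths,
                                     Predicate<String> existsInArchive) {
        List<String> kept = new ArrayList<>();
        Set<String> seen = new HashSet<>(); // MPQ paths are case-insensitive
        for (String raw : candidatePaths) {
            String path = raw.trim();
            if (path.isEmpty()) {
                continue; // skip blank lines in the list file
            }
            String key = path.toLowerCase(Locale.ROOT);
            if (seen.add(key) && existsInArchive.test(path)) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```

Blocks whose paths never appear in any supplied list remain unidentified, so a rebuild based on the filtered list would still drop them.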

zach-cloud commented 4 years ago

I'm wondering if this is planned to be supported. Currently, I am using JMPQ3 and have a set of files I'd like to extract and import to a map (such as war3map.j). Even though I can extract, I can't import since protected WC3 maps do not contain a (listfile). It seems (listfile) is removed during map optimization as WC3 does not need it to play.

It seems that other MPQ editors like MPQMaster or Ladik's MPQ Editor will allow the user to select an auxiliary listfile and then use that in place of the (listfile). I don't know the internal details, but that would be great if it could be done.

If possible, it'd be nice if modifying an existing archive could be supported rather than rebuilding the archive. Rebuilding an archive also runs into problems when your auxiliary listfile doesn't contain many of the files in the map.

I might take a try at coding this in the future. I have some other high priority issues to write and there are lots of other MPQ editors available. But it would be the best user experience if my application can import and export MPQ files as well, to make it all-in-one.

Frotty commented 4 years ago

No it's not, I don't really care about rebuilding "protected/corrupted" mpqs. If you want to PR the feature, be my guest.

zach-cloud commented 4 years ago

Sorry, I didn't exactly mean corrupted MPQs. That is a whole different thing. Just maps without (listfile) included in the archive, which I believe is all wc3 maps that have been optimized by Vexorian's Optimizer. I haven't found any map with a (listfile) included, even the ones that are openable with basic MPQ editors like MPQMaster.

I'm still working on understanding things. But it seems like MPQ Editors will add their own (listfile) to the archive, based on the listfile selected by the user. I'm wondering how difficult that would be to implement, with some sort of addExternalListfile method.

Anyways I am going to try it myself. But in general it seems like we should be able to edit MPQs without having to rebuild the entire thing (rebuild typically leads to file loss), but I definitely don't know enough to attempt that. I wonder if we can just rewrite the specific sections of the block table that correspond to that file and leave the rest alone.

Frotty commented 4 years ago

No, read the posts above or read up on the mpq format. All proper maps built by the world editor have a listfile. As per the specification, a missing listfile is a corruption. The game does not need a listfile to play a map, because if you only read from it then you don't need to know all stored blocks. If you want to modify it, you need the listfile to identify all blocks. This is exactly the same in ladik's mpq editor, where you either have to supply a full listfile or extract what you know and rebuild the mpq. The reason jmpq always does a full rebuild on save is mostly legacy. The project was started as a read-only extractor and was then messily enhanced to do what was needed to modify mpqs as well. The current structure makes it hard to implement changes.

zach-cloud commented 4 years ago

I understand that theoretically, any MPQ missing (listfile) does not conform to the MPQ specs. However practically, nearly every published WC3 map is missing (listfile). In these cases where none of the actual data follows the original spec.. I mean you can say those are all invalid, but it's moreso the spec we're working off of being outdated/inaccurate.

Anyways I have created a pull request to resolve this issue and make JMPQ3 able to write MPQs without (listfile) as long as consumer code links their own listfile.

And yes it does result in data loss if the listfile specified doesn't contain all block entries but this is unavoidable in the current program setup. This is resolvable easily enough by performing static analysis on the .w3u/w3a/j/etc files to dynamically create a listfile. Something I plan on doing in the future. The best resolution would be to make JMPQ3 able to save without full rebuild but as you said, it's challenging.

https://github.com/inwc3/JMPQ3/pull/36

Frotty commented 4 years ago

> I understand that theoretically, any MPQ missing (listfile) does not conform to the MPQ specs. However practically, nearly every published WC3 map is missing (listfile). In these cases where none of the actual data follows the original spec.. I mean you can say those are all invalid, but it's moreso the spec we're working off of being outdated/inaccurate.

No, you are missing the point. Most uploaded maps don't contain listfiles because they have been removed on purpose, so-called "protection". The spec is completely unaffected by this. Jmpq was not made to open protected maps and therefore it's not my concern.

> This is resolvable easily enough by performing static analysis on the .w3u/w3a/j/etc files to dynamically create a listfile.

This isn't "easily resolvable". Paths don't have to be in the mpq, and even if they are, 1 concatenation in the .j file can screw you over. The only "real" way to go about this was ladik's live scanner, which intercepts all calls to storm and therefore gets access to all resource paths used by a map.
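The concatenation problem can be shown with a small sketch. It assumes the simplest form of static analysis, collecting double-quoted string literals from the script text with a regular expression. `ScriptPathScanner` is a hypothetical name, and the script lines in the usage note are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: naive path discovery by collecting string literals from a script. */
final class ScriptPathScanner {
    // A double-quoted literal; backslash escapes inside it are allowed.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    static List<String> literalPaths(String script) {
        List<String> found = new ArrayList<>();
        Matcher m = STRING_LITERAL.matcher(script);
        while (m.find()) {
            // Turn the escaped "\\" of the script source into a single backslash.
            String value = m.group(1).replace("\\\\", "\\");
            if (value.contains(".")) { // crude "looks like a file name" filter
                found.add(value);
            }
        }
        return found;
    }
}
```

For a script line such as `call Foo("war3mapImported\\hero.blp")` the scanner recovers the full path. For `set s = "war3mapImported\\icon" + I2S(i) + ".blp"` it only sees the fragments on either side of the `+`, so the paths that exist at runtime never show up in the result.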

I will check the PR.

zach-cloud commented 4 years ago

Well if a file is not used, then it really doesn't matter if it's lost. Almost all files are going to be imports which can be resolved by looking at the war3map object files.