libretro / flycast

Flycast is a multiplatform Sega Dreamcast emulator. NOTE: No longer actively developed, use upstream repo for libretro from now on - https://github.com/flyinghead/flycast
http://reicast.com
GNU General Public License v2.0

Feature request: support for zipped "gdi+bin" files #421

Open barbudreadmon opened 5 years ago

barbudreadmon commented 5 years ago

Original topic on the forum : https://forums.libretro.com/t/reicast-zip-support/19200

These dumps are the equivalent of cue+bin+wav dumps for other systems. This is the official format in redump releases (the optical media equivalent of No-Intro).

It might be worth mentioning that there may be some usable code at https://github.com/libretro-mirrors/Kronos/blob/extui-align/yabause/src/cdbase.c, since the Kronos core is able to do something similar (reading zipped cue+bin+wav in this specific case).

Papermanzero commented 5 years ago

+1 for this feature

blisstik commented 5 years ago

@barbudreadmon - this has been added, can you test and close as necessary?

barbudreadmon commented 5 years ago

> this has been added

No, it wasn't.

blisstik commented 5 years ago

I think zip or 7z support has been added for BIN files. For GDI+BIN, there's CHD.

barbudreadmon commented 5 years ago

> For GDI+BIN, there's CHD.

The purpose of this request is to avoid forcing users to convert their gdi+bin sets to chd just to save space. 90% of people don't want to convert them because it takes a bit of reading and technical knowledge. Telling people to "convert your gdi+bin to chd" is totally against the global libretro policy, which is to leverage user experience.

ghost commented 5 years ago

Then there need to be tools to make it easier to convert images to CHD, like *.bat scripts for Windows users.

CHD was designed for easy streaming of compressed CD sectors. ZIP/RAR/7z container formats are not, especially when RAR/7z archives are compressed using solid mode.
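
As a rough illustration of the kind of helper script suggested above, here is a minimal Python sketch that walks a folder and converts every .gdi it finds. It assumes MAME's chdman is installed and on the PATH, and uses its `createcd -i <input> -o <output>` form; the function names `chdman_command` and `convert_all` are made up for this example.

```python
from pathlib import Path
import subprocess


def chdman_command(gdi_path, chdman="chdman"):
    """Build the chdman call that writes a .chd next to the given .gdi."""
    chd_path = gdi_path.with_suffix(".chd")
    return [chdman, "createcd", "-i", str(gdi_path), "-o", str(chd_path)]


def convert_all(root, dry_run=True):
    """Find every .gdi under root; run chdman on each unless dry_run is set."""
    commands = [chdman_command(p) for p in sorted(Path(root).rglob("*.gdi"))]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failed conversion
            subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_all("D:/dumps")` only lists the commands it would run; passing `dry_run=False` actually invokes chdman. The original gdi+bin files are left in place either way.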

Papermanzero commented 5 years ago

The question that is still unanswered is whether chd is 100% lossless. That is, if I create a gdi out of a chd file, is it equal to the original dump? Is it even possible to return to the original files?

If this is the case, then chd can be used as a compressed archive format. In that scenario, what is missing is an easy UI-based tool to manage chds.

barbudreadmon commented 5 years ago

> Then there need to be tools to make it easier to convert images to CHD, like *.bat scripts for Windows users.

It will still need a bit of learning though, and some people just aren't interested in converting their reference iso sets to CHD.

> The question that is still unanswered is whether chd is 100% lossless. That is, if I create a gdi out of a chd file, is it equal to the original dump? Is it even possible to return to the original files?

I think mame's chdman is able to do that; however, I'm not sure the CRCs will be a 1:1 match with what they were originally, which is one of the reasons some people aren't interested in converting their reference iso sets to CHD (they can't manage their library with clrmamepro anymore).

Papermanzero commented 5 years ago

And this is the reason why many people, myself included, do not want to switch to CHDs. If CHD cannot ensure a 1:1 dump, then it is not relevant for my archives.

p1pkin commented 5 years ago

@Papermanzero

> if chd is 100% lossless

Yes and no: the actual image data will remain 1:1 the same, but track files might/will be named differently after extraction, so the .gdi file will not be exactly the same as the original (it contains the track file names).
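
To make that point concrete, the Python sketch below parses two .gdi descriptors and compares them while ignoring the track file names. It assumes the commonly seen .gdi layout (a track count on the first line, then one line per track with track number, start LBA, type, sector size, file name and offset); `GdiTrack`, `parse_gdi` and `same_layout` are names invented for this example.

```python
import shlex
from typing import NamedTuple


class GdiTrack(NamedTuple):
    number: int
    start_lba: int
    track_type: int   # assumed convention: 0 = audio, 4 = data
    sector_size: int
    filename: str
    offset: int


def parse_gdi(text):
    """Parse .gdi text into a list of GdiTrack entries."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    count = int(lines[0])
    tracks = []
    for line in lines[1:1 + count]:
        # shlex keeps quoted file names with spaces as a single field
        num, lba, ttype, size, name, offset = shlex.split(line)
        tracks.append(GdiTrack(int(num), int(lba), int(ttype), int(size), name, int(offset)))
    return tracks


def same_layout(a, b):
    """True when two track lists match in everything except file names."""
    def strip(track):
        return track._replace(filename="")
    return [strip(t) for t in a] == [strip(t) for t in b]
```

Two descriptors that differ only in how the track files are named compare as unequal byte for byte, but `same_layout` reports them as the same disc layout.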

barbudreadmon commented 5 years ago

@Papermanzero At the very least, I can confirm there is no simple way to convert chd to cue+bin+wav with chdman; it seems to convert back only to cue+bin. I also heard chdman removes "garbage" from the bin file when converting to chd, so it might be impossible to get it back? While chd is a convenient way to save space, it doesn't seem preservation-friendly to me.

i30817 commented 5 years ago

> @Papermanzero At the very least, I can confirm there is no simple way to convert chd to cue+bin+wav with chdman; it seems to convert back only to cue+bin. I also heard chdman removes "garbage" from the bin file when converting to chd, so it might be impossible to get it back? While chd is a convenient way to save space, it doesn't seem preservation-friendly to me.

You heard misinformation then. For gdi files it extracts exactly the same bytes+gdi if you bother to output as a gdi. For files that were originally a cue/bin, it extracts them 'glued' into a single bin file + toc, unlike the redump habit. But the bin bytes are the same. And in both redump and tosec you can download the cues/gdi without redownloading the game if you want them back (they're not copyrighted, but part of the dumping program).

CHD is a faaaaaar superior format to all those zip dumps, simply because it has a single unique internal checksum and metadata attachment capabilities, not to mention the child-parent softpatching (that no one but mame implements) and the support for hard-drive disk images (through copy on write) that is poised to revolutionize dumps of older computer games to be ready to play without ruining checksums.

Do I want chd to output redump-style 'divided dumps'? Of course, especially since that is needed to apply redump-targeted hack patches to chds, but that's just a question of having a switch (or recording in the header) that the image is to be extracted with one file per track.

Every dumping project needs to start recording the unique chd 'data sha1' checksums post-haste to bring some sanity to the mess of platform specific code on the application end. 'Apply buggy heuristic to figure out which platform and cd image format this cd image is from before applying parser to figure out serial or before figuring out which track to checksum' is not a recipe for good code.

Speaking of errors, I find it hilarious that people are complaining about 'false positives' on CHD compared to the dumping projects that created them in the first place: redump by dividing the game tracks and thus creating the possibility of using the 'wrong one', and both redump and tosec by not bothering to salt the cues, thus creating duplicates (redump can have cue crc32 duplicates on different platform releases with the same filename and number of tracks; TOSEC has the problem that most of their gdi files have the same contents and the same crc because they 'standardized' track filenames).

Granted, checksumming cues is the wrong approach 90% of the time (doesn't work for hacks, not resistant to malicious file modification), but a little effort would make it work for that 10%, instead of shooting themselves in the foot.

barbudreadmon commented 5 years ago

@i30817 you are totally missing the point:

  1. we are not interested in converting our reference sets
  2. it's completely against the libretro policy, which is to leverage user experience; forcing people to read technical documentation to convert their dumps is not leveraging their experience

i30817 commented 5 years ago

Sure, talk to me about 'libretro policy', like there was any and you read the contract.

Zip files are already supposed to be extracted (without streaming) to /tmp by retroarch when it wants to scan them or play them. That was part of the whole rant above, you're asking for something that already happens (unfortunately), and if it doesn't, it's a bug.

Also, in the extremely unlikely event you're actually in a decision-making capacity on the redump project, adding a checksum to a dat doesn't require converting 'reference sets'. It just takes a script that reads cues/bins and outputs a sha1 of the concatenation of all tracks in order, run over a set to record the resulting sha1s.
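
A minimal Python sketch of such a script is shown below. It hashes the track files in the order given, as if they were one concatenated file; the function name `combined_sha1` is hypothetical, the caller is assumed to supply the track paths in disc order (for example, as read from the cue or gdi), and whether the result matches the 'data sha1' that chdman reports for a given image is not verified here.

```python
import hashlib


def combined_sha1(track_paths, chunk_size=1 << 20):
    """Return the SHA-1 of all track files concatenated in the given order."""
    digest = hashlib.sha1()
    for path in track_paths:
        with open(path, "rb") as fh:
            # Read in chunks so a large track is never held in memory at once
            while chunk := fh.read(chunk_size):
                digest.update(chunk)
    return digest.hexdigest()
```

Because only file contents feed the hash, renaming the track files does not change the result, while reordering them does.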

barbudreadmon commented 5 years ago

> Sure, talk to me about 'libretro policy', like there was any and you read the contract.

With @twinaphex making a fuss on discord every time he thinks something is too complex (I understand his point, but we even had an argument when I decided to require the bios for kronos...), I can safely say that the project wants to avoid the whole "ask the user to read technical documentation and use command line tools" approach whenever possible.

> Zip files are already supposed to be extracted (without streaming) to /tmp by retroarch when it wants to scan them or play them. That was part of the whole rant above, you're asking for something that already happens (unfortunately), and if it doesn't, it's a bug.

What I'm asking for is streaming, not extraction. Anyway, extraction never worked in RA when there are multiple tracks in the zip file. So no, what I'm asking for is not "already happening".

i30817 commented 5 years ago

Streaming standard zips is impossible. That's also a 'point' of that rant.

Let me explain (ugh). Compression is a process that depends on previous bytes and a dictionary. A 'normal' zip file initializes its decompression state at the start of the file you want to extract and builds the final result from there, first byte to last. You can't just jump from the start to the middle without decoding everything in between, or from the middle to any previous position without starting again.

CHD divides the compression stream's 'start points' at a finer granularity than 'start of file', and can thus simulate 'going backwards or forwards in the file' (as you want in a random-access file format like cd images) better than zip, which must start from the start of the file, at the cost of a slightly lower compression ratio.
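
The idea of finer-grained 'start points' can be sketched generically in Python with zlib. This is not CHD's actual on-disk format or codec; it only illustrates that compressing fixed-size chunks independently lets a reader decompress just the chunks that cover the requested range. The chunk size and the names `compress_chunked` and `read_at` are assumptions made for the example.

```python
import zlib

CHUNK = 4096  # hypothetical chunk size, chosen only for illustration


def compress_chunked(data, chunk_size=CHUNK):
    """Compress each fixed-size chunk independently of the others."""
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def read_at(chunks, offset, length, chunk_size=CHUNK):
    """Read `length` bytes at `offset`, decompressing only the chunks needed."""
    out = bytearray()
    while length > 0:
        index, start = divmod(offset, chunk_size)
        if index >= len(chunks):
            break  # past the end of the data
        piece = zlib.decompress(chunks[index])[start:start + length]
        if not piece:
            break  # offset lies beyond the last (short) chunk
        out += piece
        offset += len(piece)
        length -= len(piece)
    return bytes(out)
```

A read near the end of the data touches one or two chunks instead of the whole stream. The trade-off is the one described above: each chunk starts with an empty dictionary, so the overall compression ratio is somewhat lower.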

BTW, rar can be even worse if you choose to create a 'solid' rar. You need to start from byte 0 of the rar every time you reset the decompression stream.

barbudreadmon commented 5 years ago

> Streaming zips is impossible

Actually that's what is done in kronos, it's also what is done in reicast for naomi roms...

i30817 commented 5 years ago

Oh you're talking about dumping the whole file in memory? Good luck with that for dreamcast isos (which are 1.2 gb? Something like that) without hitting swap at least, which would be a shitshow. It's possible but something you should ask on the upstream retroarch project. And will probably get shot down with 'we're working on the VFS, ask later'.

And even if the main retroarch project has bugs with its unzip implementation or you want to add features like 'unzip to memory instead of filesystem', that's no reason to ask for a workaround in particular cores...

barbudreadmon commented 5 years ago

> Oh you're talking about dumping the whole file in memory?

To be more precise, I'm talking about "reading zip files as blocks". I'm not saying chd isn't a more elegant way to do this; it's most likely faster and uses less memory, but that's not the point of my request.

i30817 commented 5 years ago

Well, I can't see the point of your request then. If you're not asking to dump them to memory or disk, you end up in a situation where, when you want to read bytes, it's hilariously certain that you'll sometimes end up with an O(n²) operation in the number of bytes in the compressed file (a game reading texture bytes backwards, for example, would need to restart from 0 occasionally or every time, depending on the implementation).
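
The quadratic blow-up can be shown with a small counting model in Python. It assumes a stream that can only be decoded forwards and has to restart from byte 0 whenever a read lands before the current position; the function name `bytes_decoded_forward_only` is invented for this sketch, and no real decompressor is involved.

```python
def bytes_decoded_forward_only(block_offsets, block_size):
    """Count bytes a forward-only stream decodes to serve the reads in order."""
    position = 0   # how far the stream has been decoded so far
    decoded = 0
    for offset in block_offsets:
        if offset < position:
            position = 0  # cannot seek backwards: restart from the beginning
        end = offset + block_size
        decoded += end - position
        position = end
    return decoded
```

Reading n blocks front to back decodes each byte once, n × block_size in total. Reading the same blocks back to front restarts for every block after the first and decodes block_size × n(n+1)/2 bytes, which is where the O(n²) comes from.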

There is a reason no one thinks zip is a great format for random access and a reason why retroarch unzips games every time, even at the cost of disk lifetime. If you want to sabotage your i/o performance... well, don't expect coders to agree.

Now I'm wondering if the chd streaming implementation stores different decompression cursors for each file. It would be useful for io that reads two files at once, like binary data and music data tracks on a redump dump, although I expect most games don't want to kill performance by doing that on original hardware (a cdrom, no less), so it might not be as much of an issue as I expected with 1 cursor...

barbudreadmon commented 5 years ago

> If you're not asking to dump them to memory

Afaik, reading files as blocks does indeed mean dumping them into memory, so yeah, it consumes memory; however, the choice should be up to the end user. Recommending chd is valid, especially for low-end hardware, but at this point chd is simply forced on anyone who wants to save some space. I'm just asking for an alternative, so please stop being so negative about it...

i30817 commented 5 years ago

You're still asking this in the wrong place.

barbudreadmon commented 5 years ago

> You're still asking this in the wrong place.

I don't think so. Kronos handles this in the core, and reicast already has the facilities to do that (thanks to the mame roms reading implementation).

i30817 commented 5 years ago

You are aware that NAOMI and NAOMI GD-ROM are different things for mame, right? Mame doesn't compress cd images except as chd as far as I know, so the fact that the 'mame roms reading implementation' in reicast reads zipped NAOMI roms might very well mean nothing for 'reading zipped gdi' if the NAOMI roms mentioned are of the sort that isn't a gd-rom.

This is the list of naomi gd-rom games: https://segaretro.org/Sega_NAOMI#NAOMI_GD-ROM

If you can find one of those in your collection that reicast upstream reads 'zipped', you can mention it. Be aware that some games have two versions, gd-rom and not, but it should be relatively easy to check by opening the zip of the rom you think could be a cd image. If it has more than one file in the zip, it isn't (but if it has 1 file I don't know if it's a cd image, though as mentioned I believe it's mame policy to put cd images unzipped in subdirs as chd).

barbudreadmon commented 5 years ago

I'm not talking about gdrom, just plain "zip reading as block"; in kronos those facilities are shared between reading zipped cue+bin+wav and reading mame stv roms.

i30817 commented 5 years ago

Sigh. That implementation is dumping the whole zip uncompressed to memory. It's something you should ask in the main retroarch project because it's what handles zip decompression in retroarch - duplicating code is not cool. Besides, reicast has nothing to do with kronos.

https://github.com/libretro-mirrors/Kronos/commit/8eec52d8b9ca199c52aaf6c1c780a3619336cc97

> Add support of zipped rom loading. The zip should contain a valid cue file and all the binary files inside, like redump zip files. It is consuming a lot of RAM since game shall be deflated in RAM. if it is failling, then just deflate the game and launch from the deflated directory
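
For readers unfamiliar with the approach that commit message describes, a minimal Python sketch of "decompress the whole zip into RAM" could look like the following. It is not the Kronos code (which is C) and makes no claim about how that core does it; `load_zip_into_memory` and `find_descriptor` are hypothetical names.

```python
import zipfile


def load_zip_into_memory(zip_source):
    """Decompress every file in the archive into a {name: bytes} dict."""
    with zipfile.ZipFile(zip_source) as archive:
        return {info.filename: archive.read(info)
                for info in archive.infolist()
                if not info.is_dir()}


def find_descriptor(files, extensions=(".gdi", ".cue")):
    """Return the name of the first track-list file in the set, or None."""
    for name in files:
        if name.lower().endswith(extensions):
            return name
    return None
```

Once loaded, any track can be read at any offset with plain slicing, which is why this works for random access. The cost is that memory use equals the full uncompressed size of the dump.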

barbudreadmon commented 5 years ago

> That implementation is dumping the whole zip uncompressed to memory

That's only an issue if you don't have the memory (which is unlikely on a computer meeting the current requirements), and you still have the possibility to unzip your iso anyway. Feel free to provide a PR for chd support; I'm all for alternatives as long as they don't break vanilla format support.

> It's something you should ask in the main retroarch project because it's what handles zip decompression in retroarch - duplicating code is not cool.

As long as this kind of stuff is declared: https://github.com/libretro/reicast-emulator/blob/844e83e4cf24c9a8915286331895b588d1d51a55/core/libretro/libretro.cpp#L1940 unzipping is not RA's job anymore (same code in kronos).

> Besides, reicast has nothing to do with kronos.

That's not the point; they actually have a lot in common, though.

Rex000 commented 3 months ago

+1 from my side too