saramibreak / DiscImageCreator

This is the disc (CD, GD, DVD, HD-DVD, BD, GC/Wii, XBOX, XBOX 360) and disk (Floppy, MO, USB etc) image creation tool
http://forum.redump.org/topic/10483/discimagecreator/
Apache License 2.0
513 stars 45 forks source link

Bug - DVD mode - /raw mode error handling issues (with resuming) #206

Closed ehw closed 1 year ago

ehw commented 1 year ago

Version 20230504T032226 (from #202 )

Describe the bug DiscImageCreator seems to have an odd issue with resuming raw dumps that have errors. I'm unsure if this is partly due to an oversight or circumstances with dumping with raw mode in certain drives.

Here are some bugs that I've found 1.) When an error occurs on the drive, the drive will return 00s for the sector and these bytes get added to the .raw file. However, when you go to resume or descramble the dump, DIC fails to do both because the sector's data doesn't exist which can cause issues. When you resume, for instance, DIC will use the .raw file to find out where to seek to start resuming. However, if one of the sectors gets blown out and replaced with 00s, the drive can't seek properly because it sets the LBA/PSN to 0 and will return more null data by the drive. As a result, the .raw file gets completely overwritten with 00s and will indefinitely grow to a large size without stopping.

This is what happens in the command prompt window when you try to resume a raw dump that experienced read errors that were substituted with 00s:

Start resuming -> Last sector num: 1

Expected sector num: d1, Got sector num: 202e30 [0/16][0/1] Reread 1. LBA: 1912371 Creating raw(LBA) 1912387/1913936 Expected sector num: e1, Got sector num: 202e50 [0/16][0/1] Reread 2. LBA: 1912403 Creating raw(LBA) 1912419/1913936 Expected sector num: f1, Got sector num: 202e70 [0/16][0/1] Reread 3. LBA: 1912435 Creating raw(LBA) 1912451/1913936 Expected sector num: 101, Got sector num: 202e90 [0/16][0/1] Reread 4. LBA: 1912467 Creating raw(LBA) 1912483/1913936 Expected sector num: 111, Got sector num: 202eb0 [0/16][0/1] Reread 5. LBA: 1912499 Creating raw(LBA) 1912515/1913936 Expected sector num: 121, Got sector num: 202ed0 [0/16][0/1] Reread 6. LBA: 1912531 Creating raw(LBA) 1912547/1913936 Expected sector num: 131, Got sector num: 202ef0 [0/16][0/1] Reread 7. LBA: 1912563 Creating raw(LBA) 1912579/1913936 Expected sector num: 141, Got sector num: 202f10 [0/16][0/1] Reread 8. LBA: 1912595 Creating raw(LBA) 1912611/1913936 Expected sector num: 151, Got sector num: 202f30 [0/16][0/1] Reread 9. LBA: 1912627 Creating raw(LBA) 1912643/1913936 Expected sector num: 161, Got sector num: 202f50 [0/16][0/1] Reread 10. LBA: 1912659 Creating raw(LBA) 1912675/1913936 Expected sector num: 161, Got sector num: 202f60 [0/16][0/1] Reread 11. LBA: 1912675 Creating raw(LBA) 1912691/1913936 Expected sector num: 161, Got sector num: 202f70 [0/16][0/1] Reread 12. LBA: 1912691 Creating raw(LBA) 1912707/1913936 Expected sector num: 161, Got sector num: 202f80 [0/16][0/1] Reread 13. LBA: 1912707 Creating raw(LBA) 1912723/1913936 Expected sector num: 161, Got sector num: 202f90 [0/16][0/1] Reread 14. LBA: 1912723 Creating raw(LBA) 1912739/1913936 Expected sector num: 161, Got sector num: 202fa0 [0/16][0/1] Reread 15. LBA: 1912739 Creating raw(LBA) 1912755/1913936 Expected sector num: 161, Got sector num: 202fb0 [0/16][0/1] Reread 16. LBA: 1912755 Creating raw(LBA) 1912771/1913936

If you dump the entire disc with errors in one go without stopping and resuming, it does this when it gets to the descrambling part:

D:\Sazpaimon\Downloads\DiscImageCreator_test\Release_ANSI\DiscImageCreator.exe dvd E "D:\Sazpaimon\Dumps\2023\SHADOW XBOX BETA 4\Dump 21\SHADOW XBOX BETA 4" 24 /raw AppVersion 32 bit, AnsiBuild, 20230504T032226 valid extension was omitted. -> set .raw CurrentDirectory D:\Sazpaimon\Dumps\2023\SHADOW XBOX BETA 4 WorkingPath Argument: D:\Sazpaimon\Dumps\2023\SHADOW XBOX BETA 4\Dump 21\SHADOW XBOX BETA 4.raw FullPath: D:\Sazpaimon\Dumps\2023\SHADOW XBOX BETA 4\Dump 21\SHADOW XBOX BETA 4.raw Drive: D: Directory: \Sazpaimon\Dumps\2023\SHADOW XBOX BETA 4\Dump 21\ Filename: SHADOW XBOX BETA 4 Extension: .raw StartTime: 2023-05-04T01:49:10-0400 Set the drive speed: 0KB/sec DiskSize of [D:\Sazpaimon\Dumps\2023\SHADOW XBOX BETA 4\Dump 21] Total: 14000516493312 bytes Used: 11889064329216 bytes

    Space:  2111452164096 bytes
     => There is enough disk space for dumping

Rawdump command [0]:0x3c [1]:0x01 [2]:0x01 Creating raw(LBA) 1913920/1913936 GOD/WOD unscrambler 0.4.1 (xt5@ingenieria-inversa.cl)

This program is distributed under GPL license, see the LICENSE file for more info.

caching seed 0080 caching seed 0005 caching seed 0100 caching seed 000a caching seed 0200 caching seed 0014 caching seed 0400 caching seed 0028 caching seed 0801 caching seed 0050 caching seed 1002 caching seed 00a0 caching seed 2004 caching seed 0140 caching seed 4008 caching seed 0280 caching seed 0000 error: bad edc (a8ef4f0d) must be 82def389 error unscrambling recording frame 116013. time elapsed: 68.00 seconds. ret = 116013 Hashing: SHADOW XBOX BETA 4.iso EndTime: 2023-05-04T05:09:06-0400

I'm not entirely sure what to do about this. I thought the drive should be returning everything read even if there was an error, but it seems that some drives won't bother putting any data on the buffer/cache if there was an error, so the drive returns 00 and DIC will just add the null bytes to the .raw file? I'm not sure if this is entirely the drive's fault or if maybe DIC might not be properly including data even on reads with errors.

But regardless, I think DIC should be able to identify the sectors that are either filled or have mismatched EDC and start rereading/resuming from there instead. So if I have a disc that experienced an error while raw dumping, if I use /re, DIC should first retry those bad sectors first before resuming where it left off. Otherwise, the descrambling will never work. Can this be added? Descrambling, in my opinion, should only be done when DIC itself knows that all the EDC will pass.

2.) When the drive encounters an error in /raw mode, the entire block does not appear in the rawReadable log causing a skip in the log. The log should include every sector, or at least mark that the sector/block was bad, no?

LBA[1856207, 0x1c52cf]: SectorInfo[00]{Layer 0, read-only, Data Zone, greater than 40%}, SectorNumber[1f52cf ( 2052815)], IED[4ac8], CPR_MAI[000000000000], EDC[62a0f0bf] LBA[1856224, 0x1c52e0]: SectorInfo[00]{Layer 0, read-only, Data Zone, greater than 40%}, SectorNumber[1f52e0 ( 2052832)], IED[3b96], CPR_MAI[000000000000], EDC[6d639f88] LBA[1856225, 0x1c52e1]: SectorInfo[00]{Layer 0, read-only, Data Zone, greater than 40%}, SectorNumber[1f52e1 ( 2052833)], IED[3894], CPR_MAI[000000000000], EDC[8e56f255]

I know that errors will propagate in the mainError log now but these gaps in the rawReadable log are hard to find as they are when troubleshooting.

3.) Sometimes even though DIC's rawReadable log doesn't mention a sector was read within a block that experienced an error, sector data for something within that 'skipped' 16 sector block can still exist in the .raw file even though there's no mention of it in rawReadble. For example:

image The above screenshot I believe is data for LBA 1856221 (because $107C3B100 divided by 2384). However, according to rawReadable, it seems that 1856221 was never read. However, there exists EDC data (82DEF389 is the value, located at $107C3B28A in the raw).

So it seems that DIC can read sectors within a 16-sector block that's been reported bad, but some of the sectors are still read within that bad block and are added to the .raw. I don't know if this means that the sectors that returned data are good within the 'skipped' block, but it's just something that's not really mentioned anywhere.

Log file Here are the log files that describe all the issues listed above. The log was generated by using a Lite-On drive with 2384 raw reading capabilities. A disc with errors was read from start to finish without stopping or resuming. When the dump was finished, DIC tries to unscramble the raw file but failed giving the error message seen above. https://file.io/QrXhGblPCdFI

saramibreak commented 1 year ago

When an error occurs on the drive, the drive will return 00s for the sector and these bytes get added to the .raw file.

DiscImageCreator_test.zip

ehw commented 1 year ago

When an error occurs on the drive, the drive will return 00s for the sector and these bytes get added to the .raw file.

DiscImageCreator_test.zip

* changed: Non 0xE7 drive is thrown, not write 00 bytes.

Tested. I'll break everything down by bug:

Bug 1.) So when using non 0xE7 drives like the ASUS/Lite-On drives, DIC just quits dumping when it hits an error now? So there really is no other data on the cache/buffer when reads fail? :( I was hoping that the drives would keep even the rawish data from a failed read on the cache/buffer so DIC could read it but I guess that's not the case? Are you saying 0xE7 drives can keep failed read data in memory?

When we tried dumping the bad DVD-R again, DIC stopped dumping when it encountered this error:

LBA[1848320, 0x1c3400]: [F:ReadDVDRaw][L:914] Opcode: 0xa8 ScsiStatus: 0x02 = CHECK_CONDITION SenseData Key-Asc-Ascq: 03-11-05 = MEDIUM_ERROR - L-EC UNCORRECTABLE ERROR

We resumed the dump and it seemed to seek back to this LBA again, but it kept returning the same error and quitting. Is this the intended behavior now? We did check the .raw before and after resuming with /re and it seems that the .raw file is fine and isn't being modified with 00 bytes.

The one thing we did notice is that /rr doesn't work with /raw mode. Is this intentional? It makes dumping bad discs in /raw mode harder because we have to constantly /re back to where it left off just to retry once. Can you make /rr work with /raw so we can set retry counts? If /rr support is added for /raw, should it be required for drives that read data off the cache to clear it before each retry? Just in case the same data accidentally gets returned by the drive?

Bug 2.) Because DIC is no longer processing the dump after failed reads, this second bug is no longer an issue. Bug 3.) Because DIC is no longer processing the dump after failed reads, this third bug is no longer an issue.

So ultimately I think DIC needs to get /rr working for /raw mode and I think this will be okay...

saramibreak commented 1 year ago

DiscImageCreator_test.zip

ehw commented 1 year ago

DiscImageCreator_test.zip

  • re-changed: Only 0xE7 drives write 00 bytes when scsi errors occur.

I don't have a 0xE7 drive to test on hand, but why would you write the 00 bytes to the raw file when SCSI errors occur for these drives but not any other drives? Do the 0xE7 drives return more data when errors occur or something?

I think what's needed is support for /rr for raw reads, as it doesn't seem to be working. I think it'd be best for DIC to stop dumping raw if the drive returns 00 for any drive providing it's a drive that's supposed to be returning scrambled sectors. DIC stops when doing READ12 dumps if SCSI errors occur, so maybe it should be doing the same thing for /raws? We just need /rr to work for /raw.

saramibreak commented 1 year ago

Would you try the latest test version? I want to know if ASUS can get the cache when SCSI error occurs like plextor.

We just need /rr to work for /raw.

It's another issue.

ehw commented 1 year ago

Would you try the latest test version? I want to know if ASUS can get the cache when SCSI error occurs like plextor.

We just need /rr to work for /raw.

It's another issue.

I'll test. :)

Can you merge the fixes for #204 and #205 and prepare a new test version with the changes for this issue?

saramibreak commented 1 year ago

Can you merge the fixes for https://github.com/saramibreak/DiscImageCreator/issues/204 and https://github.com/saramibreak/DiscImageCreator/issues/205 and prepare a new test version with the changes for this issue?

Yes. Latest is here. https://github.com/saramibreak/DiscImageCreator/issues/204#issuecomment-1537449246

ehw commented 1 year ago

Can you merge the fixes for #204 and #205 and prepare a new test version with the changes for this issue?

Yes. Latest is here. #204 (comment)

Okay, I tested this version out by doing the following.

I took a PlayStation 2 magazine demo disc that I have (http://redump.org/disc/14073/) that was pretty scratched up and dumped it with the ASUS 3.02 firmware with /raw. The dump went perfectly with no errors during and after unscrambling. So what I did was I took a piece of clear tape and put a tiny pit on the surface of the disc and ran /raw again on a new dump, and this is what happened:

1.) The drive first encountered an error at LBA 165808. The drive returned a TIMEOUT ON LOGICAL UNIT. It did this 5 times, but then seemed to start reading again starting at 165888. 2.) However, because DIC still expects sector number 587b0, none of the reads at this point are being added to the .raw 3.) DIC has a hard limit of 40 retries before the dump process quits. 4.) After DIC stopped dumping, I opened up the .raw and I noticed that the last LBA that was recorded with data I believe was 165807. None of the other sectors were ever recorded, not even the ones that had TIMEOUT ON LOGICAL UNIT errors. There are no 00 bytes either. 5.) I used /re to resume the dump to see what would happen. It still couldn't read 165808, so the same thing happened as I said before. 6.) While I didn't record this, I did use /re a few times and it was able to read a few more sectors after 165808. However, as soon as the drive returned an error, DIC still expects to read the sector that first returned the error and so never records any data in the .raw file.

So in a nut shell, as of right now DIC doesn't provide any additional data from the cache even on failed attempts. As soon as an error occurs, DIC pretty much stops accepting new data from the drive until you stop the dump and resume it again.

Here is the log file. I included the .raw this time for you to take a look at.

https://www.mediafire.com/file/f2s8usofby2k40w/bug+-+206.7z/file

Hope this helps!

saramibreak commented 1 year ago

It seems that ASUS doesn't cache when scsi error occurred.

DiscImageCreator_test.zip How about plextor? I want to know If plextor also doesn't cache when dvd is dumped. Try it by this test version.

ehw commented 1 year ago

It seems that ASUS doesn't cache when scsi error occurred.

DiscImageCreator_test.zip How about plextor? I want to know If plextor also doesn't cache when dvd is dumped. Try it by this test version.

I think the same thing happens on Plextor but not entirely sure?

1.) The drive first encountered an error at LBA 165728. The drive returned a L-EC UNCORRECTABLE ERROR. However, it looks like it read a few sectors after 165728 as it calculated and read EDC checksums for 165728, 165731, 165732, and 165741.. 2.) However, once DIC tries to read the sectors after that, it seems to always expect 804890 but reads something else. After 40 retries, the dump stopped. An .iso file was created as well. 4.) After DIC stopped dumping, I opened up the .raw and I noticed that the last LBA that was recorded with data I believe was 165728. None of the other sectors after this one were ever recorded, not even the ones that had reported mismatched EDC. There are no 00 bytes either. 5.) I used /re to resume the dump to see what would happen. It could read a few more sectors, appending them to the .raw file. However, it replaced all the bytes in the .iso file with whatever sectors it began to read after resuming. When errors occurred, the same thing happened, an L-EC UNCORRECTABLE ERROR occurs followed by a few mismatched EDCs, but then gets stuck on reading all the blocks after that because it's expecting a certain sector to be returned. After 40 retries, DIC quits,

I added all the files that were generated plus the log files for both the initial attempt and retry in separate files. I did not include the refinements themselves that were made after running retry.

Here are the files: https://www.mediafire.com/file/2vndzii1gp0bx8t/bug+-+206+plextor.7z/file

Hope this helps!

saramibreak commented 1 year ago

DiscImageCreator_test.zip

                else if (!(pDevice->byPlxtrDrive && bNintendoDisc)) {
                    nLBA -= transferAndMemSize;
                    nRereadNum++;
                    if (nRereadNum == 40) {
                        break;
                    }
                    else {
                        continue;
                    }
                }
ehw commented 1 year ago

DiscImageCreator_test.zip

  • added: reread code (NOT TEST)
              else if (!(pDevice->byPlxtrDrive && bNintendoDisc)) {
                  nLBA -= transferAndMemSize;
                  nRereadNum++;
                  if (nRereadNum == 40) {
                      break;
                  }
                  else {
                      continue;
                  }
              }

I wasn't sure what to test with this one, but I just started a new raw dump with the Plextor using the same commands as I've always done and got this:

https://www.mediafire.com/file/0vvyg2b6kfp310y/bug+-+206+plextor+2.7z/file

It looks like instead of counting the total amount of errors that occurred throughout the entire dump it only does the reread count of 40 per sector instead? There weren't any messages for when a block/sector were dumped successfully though, so I didn't immediately notice it successfully retrying to read sectors...

saramibreak commented 1 year ago

It seems that rereading sector is correct. e.g. 166032 reread 7 times and matched the EDC.

LBA[166032, 0x28890]: Track[00]: CalcEDC: c62af5b1, SectorEDC: c62af5b1

I'll add the command-line argument for the reread value like the following. DiscImageCreator.exe dvd f xxx 0 /raw /rr 1000

saramibreak commented 1 year ago

We just need /rr to work for /raw.

DiscImageCreator_test.zip

ehw commented 1 year ago

We just need /rr to work for /raw.

DiscImageCreator_test.zip

  • added: support /rr for raw dumping

Tested, seems to work fine especially with resuming! I tested a dump using /rr 50 and it retries each failed sector 50 times, and without /rr it only tried reading the sector once. Upon total failure, it ends the dump.

So I take it that the buffer on a Plextor doesn't return the data from the buffer on failed sectors? I guess that's that then.... :(

Log here: https://www.mediafire.com/file/zxjacdm3ptwnfbt/bug+206+-+plextor+3.7z/file

There are a few nitpicks I noticed though:

1.) In both the command prompt and mainError log, the retries are noted but not when they fail/pass. DIC, at least for normal READ12 dumps, used to print "LBA xxxxx is retry OK" when the retry returns a good sector, but from the log in the file I provided above it doesn't look like it prints that message anymore (example: in mainError, LBA 166320 seems to have been read correctly but the log file doesn't indicate that clearly). Can you have DIC print a message underneath each retry attempt whether or not the retry "passed" or "failed"? Otherwise, the log makes it seem like someone either quit the dump halfway reading the disc or the dump moved on from the sector.

2.) DIC doesn't honor /ps # in /raw mode. I know this is probably because before it was assumed data would just be returned from the drive's cache or buffer but now that DIC runs /raw mode pretty similarly to normal READ12 dumps I suppose it might be of some people's interests to by pass when rereads fail and move on to dumping the rest of the disc that might be good. This means giving the user the ability to substitute completely failed sectors reads with a substitute byte. One thing though, I don't believe DIC records when it has to fill bytes for the sector in a log file anywhere. If /ps is used, can DIC report that upon failing to read the LBA that "LBA xxxx replaced with x" after the message that says the LBA failed?

3.) Reading the DVD in reverse with /r doesn't work, at least in /raw mode. Trying to use /r just returns the following

H:\Tools\DiscImageCreator\test2>DiscImageCreator.exe dvd D "PlayStation Magazine - Issue 53 (USA) (RAW) (TAPE)" 24 /raw /re /r AppVersion 32 bit, AnsiBuild, 20230509T211603

There might be a scenario where people want to try /raw dumps for discs with copy protection that is best read with /r. Is this possible to add?

4.) When DIC fails to read or retry a LBA in /raw mode and begins to quit, it plays the sound tone when the dump was completed successfully, not when the dump completes with failure. I think this is because DIC outputs an .iso file successfully, so it plays the sound for that process instead. DIC should always play the fail sound tone in this instance because the dump wasn't able to be dumped successfully.

saramibreak commented 1 year ago

So I take it that the buffer on a Plextor doesn't return the data from the buffer on failed sectors?

If plextor can cache the data from the buffer on failed sectors, it's probably corrupted.

DiscImageCreator_test.zip 1.) Added, but not test 2.) Not yet 3.) Not yet 4.) Fixed, but not test

ehw commented 1 year ago

So I take it that the buffer on a Plextor doesn't return the data from the buffer on failed sectors?

If plextor can cache the data from the buffer on failed sectors, it's probably corrupted.

DiscImageCreator_test.zip 1.) Added, but not test 2.) Not yet 3.) Not yet 4.) Fixed, but not test

1 & 4 are working. Thanks! Let me know if you implement something for the other two.

Let me know if you need anything else testing. Are there any raw read methods besides whatever the Lite-On/ASUS/Plextor uses that DIC supports that dump raw? I don't have any 0xE7 drives to test however...