iscsi-osx / iSCSIInitiator

iSCSI Initiator for macOS
BSD 2-Clause "Simplified" License

Implement advanced error recovery #3

Open nsinenian opened 9 years ago

nsinenian commented 9 years ago

Hooks have been left to perform advanced error recovery. Instances where the iSCSI error recovery level should be considered and acted upon are:

A notification mechanism has been set up to allow the kernel to notify the daemon that it should perform some recovery action.
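As a rough illustration of how the negotiated error recovery level could drive the daemon's choice of action, here is a minimal sketch in Python. It follows the general definitions in the iSCSI RFC (level 0 is session recovery only, level 1 adds digest failure recovery, level 2 adds connection recovery). The names `Failure`, `Action`, and `choose_recovery_action` are hypothetical and are not part of this project's code.

```python
from enum import Enum


class Failure(Enum):
    """Hypothetical failure kinds the kernel might report to the daemon."""
    DIGEST_ERROR = "digest_error"        # header or data digest mismatch
    CONNECTION_LOST = "connection_lost"  # the underlying TCP connection failed


class Action(Enum):
    """Hypothetical recovery actions the daemon could take."""
    RETRANSMIT = "retransmit"                    # within-connection recovery
    CONNECTION_RECOVERY = "connection_recovery"  # move tasks to a new connection
    SESSION_RECOVERY = "session_recovery"        # tear down and log in again


def choose_recovery_action(erl: int, failure: Failure) -> Action:
    """Pick the least disruptive action allowed by the negotiated level.

    Each level includes the capabilities of the levels below it, so
    anything a level cannot handle falls back to session recovery.
    """
    if erl not in (0, 1, 2):
        raise ValueError(f"invalid ErrorRecoveryLevel: {erl}")
    if failure is Failure.DIGEST_ERROR and erl >= 1:
        return Action.RETRANSMIT
    if failure is Failure.CONNECTION_LOST and erl >= 2:
        return Action.CONNECTION_RECOVERY
    return Action.SESSION_RECOVERY
```

For example, `choose_recovery_action(1, Failure.CONNECTION_LOST)` falls back to `Action.SESSION_RECOVERY`, because connection recovery is only available at level 2.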

gonzoleeman commented 8 years ago

I might be interested in helping

ryansyah commented 5 years ago

Hello @nsinenian. I ran into an issue today: I booted up an old system that was configured to auto-login to an iSCSI target that was actively mounted from another machine. As a result, neither machine can use the target now, and it presents itself as uninitialized.

[Screenshot taken 2018-12-19 at 12:26:31 PM]

I would be glad to hear what options exist for data recovery, and I would be a willing tester in this process. Is there a way to re-create the partition information to support data recovery, or to fix the LUN for reuse, without destroying the data on the LUN?

nsinenian commented 5 years ago

Hi Ryan - as a point of clarification, the error recovery referenced in this thread has to do with iSCSI session recovery (e.g., when there is a connectivity issue during normal use and a file isn't fully written). This is different from the issue you cite.

iSCSI is by design not a sharing protocol and hence not meant to be accessed from multiple sources without any kind of semaphore. Hence the behavior you have observed is expected. I'm a bit surprised that the target didn't lock it out -- I suggest configuring the target in the future to only allow a single connection per LUN, if possible.

Basically, it's like having a hard drive exposed to two different computers that are trying to write to it at the same time and leads to corruption.

My guess is that the data is still there and perhaps the MBR/GPT is corrupt. However, the fact that the LUN doesn't show up properly (Zero KB) is discouraging, as it suggests corruption at a lower level. What kind of file system was formatted onto the drive?

I assume the LUN is backed by a file on the server side. I'd suggest making a backup of that file before you try to do anything in terms of recovery.
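A minimal sketch of such a backup in Python, assuming the target is stopped (or the LUN is offline) so the file is not changing during the copy. The function names are hypothetical, and the paths are whatever your target uses. It refuses to overwrite an existing backup and verifies the copy with a SHA-256 checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_lun_file(source, destination):
    """Copy the LUN backing file and verify that the copy matches."""
    source, destination = Path(source), Path(destination)
    if destination.exists():
        # Never overwrite an earlier backup by accident.
        raise FileExistsError(f"refusing to overwrite {destination}")
    shutil.copyfile(source, destination)
    if sha256_of(source) != sha256_of(destination):
        raise IOError("backup does not match the source file")
    return destination
```

All recovery attempts can then be run against the copy, leaving the original file untouched.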

For recovery, you can mine that file to recover data. I would also suggest, if your target allows for it, creating a new iSCSI target without a LUN and then moving this LUN into that target.
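As a first read-only step when mining the file, it may help to check whether the partition table signatures are still present. The sketch below (Python, with a hypothetical function name) assumes the backing file is a raw disk image with 512-byte sectors. It looks for the MBR boot signature at the end of sector 0, the GPT header signature at LBA 1, and the backup GPT header in the last sector. Run it against the backup copy, not the original.

```python
import os

MBR_SIGNATURE = b"\x55\xaa"   # last two bytes of the first 512 bytes
GPT_SIGNATURE = b"EFI PART"   # first eight bytes of a GPT header


def inspect_partition_table(image_path, sector_size=512):
    """Report which partition table signatures survive in a raw image."""
    size = os.path.getsize(image_path)
    with open(image_path, "rb") as f:
        sector0 = f.read(sector_size)   # LBA 0: MBR or protective MBR
        sector1 = f.read(sector_size)   # LBA 1: primary GPT header
        last = b""
        if size >= 3 * sector_size:
            f.seek(size - sector_size)  # last LBA: backup GPT header
            last = f.read(sector_size)
    return {
        "mbr_signature": sector0[510:512] == MBR_SIGNATURE,
        "gpt_header": sector1[:8] == GPT_SIGNATURE,
        "backup_gpt_header": last[:8] == GPT_SIGNATURE,
    }
```

If only the backup GPT header is found, the primary table may be recoverable from it with a partitioning tool. If none of the signatures are present, the damage likely extends beyond the partition table.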