darrenldl / blockyarchive

Blocky archive - multithreaded archiver offering bit rot protection and sector level recoverability
MIT License

Different Sector-Sizes when migrating ArchiveFiles - an issue? #278

Open derTHOMMESisses opened 3 years ago

derTHOMMESisses commented 3 years ago

First, congrats on that nice piece of software!

I have one question: I am wondering what happens if I do not know the file system where the Blockyarchive file is stored in the end. Example: I build an archive and Blockyarchive it to my NAS. From there the archive file is moved (copied/deleted) to a robotic tape archive (Intel Tivoli Storage Management, which is run by my company). In the future it will be migrated to System XY. If the sector size differs from that of the system where the Blockyarchive was created, is that an issue?

I am not an IT expert, so please excuse me if this question is irrelevant...

regards Tom

darrenldl commented 3 years ago

Hi Tom,

First, congrats on that nice piece of software!

Thanks : D !

I'm still glad/surprised that no one has raised any issues since the last release (I know someone who uses this for TBs of data).

I have one question: I am wondering what happens if I do not know the file system where the Blockyarchive file is stored in the end. Example: I build an archive and Blockyarchive it to my NAS. From there the archive file is moved (copied/deleted) to a robotic tape archive (Intel Tivoli Storage Management, which is run by my company). In the future it will be migrated to System XY. If the sector size differs from that of the system where the Blockyarchive was created, is that an issue?

I am not an IT expert, so please excuse me if this question is irrelevant...

It is indeed a relevant question, but the accurate answer depends on your goals/use.

General consideration

There are two aspects to the default format (or any configuration of the ECSBX format):

  1. Facilitation of archive discovery during data recovery (property inherited from SBX format)
  2. Facilitation of data repair (forward error correction provided in ECSBX format)

Property 1 is indeed sensitive to the storage sector size used, which means taking both the file system and the hard disk sector sizes into account. However, since basically all modern file systems and hard disks currently use a sector size that is a multiple of 512 bytes, modern configurations should work with ECSBX v17 (or SBX v1 if you don't need data repair), both of which use 512-byte blocks.

Marco (the original author of SeqBox/SBX) has compiled a list of FS that should work: https://github.com/MarcoPon/SeqBox/#tests

In other words, as long as the storage medium in question only breaks data apart at 512-byte intervals at the raw/byte level, the blocks stay intact and it's fine. The sector size of the system where the archive was created does not matter; only the sector size of the system storing the archive does.
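To make the 512-byte rule concrete, here is a minimal Python sketch of the check described above. The function name `blocks_stay_intact` is made up for illustration and is not part of blockyarchive; the 512-byte default assumes ECSBX v17 or SBX v1.

```python
def blocks_stay_intact(sector_size: int, block_size: int = 512) -> bool:
    """Return True if a medium that only splits data at sector boundaries
    can never cut a block of block_size bytes in half.

    This holds when the sector size is a whole multiple of the block size.
    """
    if sector_size <= 0 or block_size <= 0:
        raise ValueError("sizes must be positive")
    return sector_size % block_size == 0


# Illustrative sector sizes only; check what your own storage actually uses.
for size in (512, 4096, 520, 256):
    print(size, blocks_stay_intact(size))
```

Note that the check only looks at the storing system's sector size, which matches the point above that the creating system's sector size plays no role.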

Property 2 is somewhat independent of the storage sector size, and would still allow you to recover from bit rot, corruption during transport, etc. But obviously, if you cannot gather a sufficient number of blocks to begin with during the disaster recovery phase, you cannot begin the repair.

Specific answers

I am wondering what happens if I do not know the file system where the Blockyarchive file is stored in the end. Example: I build an archive and Blockyarchive it to my NAS. From there the archive file is moved (copied/deleted) to a robotic tape archive (Intel Tivoli Storage Management, which is run by my company).

Unfortunately I am not familiar with tape systems, but my impression is that they don't have any block alignment requirement, so the file system sector size would be the major deciding factor. If the file system is backed up as is, then you'll need to check the sector size of the system housing the ECSBX file. If the backup is done at file level, then it depends on whether a file system is used on the tape at all (in which case its sector size is the deciding factor), or whether the files are all simply concatenated. The latter complicates the data rescue process slightly, but you can ask blockyarchive to scan from an offset (see the wiki entry), so it's not too big of a deal.
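To illustrate why an offset helps in the concatenated case, here is a rough Python sketch of scanning a raw stream for block signatures from a chosen starting offset. It is not how blockyarchive itself is implemented. It assumes that each SBX/ECSBX block starts with the 3-byte signature `SBx` (my understanding of the SBX format) and that blocks are 512 bytes; the function name `find_block_offsets` is hypothetical.

```python
import io

SBX_SIGNATURE = b"SBx"  # assumed 3-byte signature at the start of each block
BLOCK_SIZE = 512        # assumed block size (SBX v1 / ECSBX v17)


def find_block_offsets(stream, start_offset=0, step=BLOCK_SIZE):
    """Yield byte offsets where a block signature is found, checking only
    positions start_offset, start_offset + step, start_offset + 2*step, ..."""
    offset = start_offset
    while True:
        stream.seek(offset)
        head = stream.read(len(SBX_SIGNATURE))
        if len(head) < len(SBX_SIGNATURE):
            break  # reached the end of the stream
        if head == SBX_SIGNATURE:
            yield offset
        offset += step


# Fake "tape" image: 100 bytes of unrelated data, then two dummy blocks.
image = b"\x00" * 100 + (SBX_SIGNATURE + b"\x00" * 509) * 2

print(list(find_block_offsets(io.BytesIO(image), start_offset=0)))
print(list(find_block_offsets(io.BytesIO(image), start_offset=100)))
```

With the default offset the scan steps over positions that never line up with the blocks, while starting at the point where the archive begins lets every step land on a block boundary. A real scanner would also verify each block's checksum rather than trusting the signature alone.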

In the future it will be migrated to System XY. If the sector size differs from that of the system where the Blockyarchive was created, is that an issue?

That depends somewhat on the migration details, but normally only the sector size of System XY would matter.