zmanda / amanda

Amanda Network Backup
https://www.zmanda.com/downloads/

Short write on tape device #257

Open cscottpnw opened 3 months ago

cscottpnw commented 3 months ago

I continue to get this error using Amanda 3.5.1 on FBSD 14.0 (GENERIC kernel). The error has occurred with 2 tape drives, and with multiple tapes.

hostname/ lev 0 partial taper: Error writing block: Short write on tape device: Tried 32768, got 27136. Is the drive using a block size smaller than 32768 bytes?

Tried is always 32768, however "got" varies a lot.

The associated amanda.conf:

org "MONTHLY"
mailto "dude@hostname.local"
dumpuser "amanda"
inparallel 10
dumpcycle 0
runspercycle 1
tapecycle 18
runtapes 1
tapedev "tape:/dev/nsa0"
tapetype LTO2-200
infofile "/usr/local/etc/amanda/monthly/curinfo"
logdir "/usr/local/etc/amanda/monthly/log"
indexdir "/usr/local/etc/amanda/monthly/index"
tapelist "/usr/local/etc/amanda/monthly/tapelist"
holdingdisk hd1 {
     comment "main holding disk"
     directory "/holddisk"
     use 338G
}

define tapetype LTO2-200 {
     comment "LTO Ultrium2 200/400"
     length 400000 mbytes
     filemark 0 kbytes
     speed 20570 kps
}

define dumptype full_compress {
     index yes
     strategy noinc
     program "GNUTAR"
     compress server best
     exclude list optional "/var/lib/amanda/exclude.gtar"
}

define dumptype full_nocompress {
     index yes
     strategy noinc
     program "GNUTAR"
     compress none
     exclude list optional "/var/lib/amanda/exclude.gtar"
}

define dumptype zwc-normal {
     program "DUMP"
     index yes
     strategy noinc
     compress none
}

LTO drive details:

Drive: sa0: <HP Ultrium 2-SCSI S65D> Serial Number: XXXXXXXX

Mode      Density      Blocksize   bpi      Compression
Current:  0x42:LTO-2   variable    187909   enabled (0x1)

Current Driver State: at rest.

Partition: 0       Calc File Number: 6       Calc Record Number: 0
Residual: 32768    Reported File Number: 6   Reported Record Number: 264535
Flags: None

Tape I/O parameters:
  Maximum I/O size allowed by driver and controller (maxio): 65536 bytes
  Maximum I/O size reported by controller (cpi_maxio): 0 bytes
  Maximum block size supported by tape drive and media (max_blk): 16777215 bytes
  Minimum block size supported by tape drive and media (min_blk): 1 bytes
  Block granularity supported by tape drive and media (blk_gran): 0 bytes
  Maximum possible I/O size (max_effective_iosize): 65536 bytes

I've run out of avenues of troubleshooting on this. Any suggestions?

exuvo commented 2 months ago

The setting you want to modify is blocksize in the tapetype { } definition, as it defaults to 32KB. Try setting it lower, maybe? I use 1MB on my LTO5, but LTO2 is tiny in comparison.

define tapetype LTO2 {
  blocksize 24 kbytes
}

cscottpnw commented 2 months ago

I've tried a few different blocksize settings. None had an impact. The config I provided above just happened to not contain a blocksize.

Despite the history, I just tried a blocksize of 64 kbytes. Same problem.

define tapetype LTO2-200 {
     comment "LTO Ultrium2 200/400"
     length 400000 mbytes
     blocksize 64 kbytes
     speed 20570 kps
}

Error:

FAILURE DUMP SUMMARY:
  hostname / lev 0  partial taper: Error writing block: Short write on tape device: Tried 65536, got 9216.  Is the drive using a block size smaller than 65536 bytes?

Thanks anyway.
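For comparing several of these reports side by side, here is a minimal Python sketch that pulls the "Tried" and "got" figures out of a taper message. The function name `parse_short_write` and the extra fields are hypothetical, chosen only for illustration; nothing here is part of Amanda. The 512-byte check is an observation about the two values quoted in this thread, not a diagnosis.

```python
import re

# Matches the numeric part of the taper message quoted in this thread.
SHORT_WRITE = re.compile(r"Tried (\d+), got (\d+)")


def parse_short_write(message):
    """Return the tried/got byte counts from a short-write message, or None."""
    match = SHORT_WRITE.search(message)
    if match is None:
        return None
    tried, got = int(match.group(1)), int(match.group(2))
    return {
        "tried": tried,
        "got": got,
        "missing": tried - got,
        # Assumption for illustration: 512 is just a convenient unit to test.
        "got_multiple_of_512": got % 512 == 0,
    }


reports = [
    "Short write on tape device: Tried 32768, got 27136.",
    "Short write on tape device: Tried 65536, got 9216.",
]
for line in reports:
    print(parse_short_write(line))
```

Both "got" values reported so far (27136 and 9216) are whole multiples of 512 bytes. Whether that means anything would need more samples.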

calhariz commented 2 months ago

Could it be you have a problem with the hardware or BSD? Have you tried to write using tar to the tape device?

exuvo commented 2 months ago

Oh yeah, if you haven't tested manually: position the drive with mt, write some data to it (e.g. with tar or dd; it does not need to be a tape-aware program for writes smaller than a full tape), rewind with mt, then read and verify the written data.
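The manual test described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not how Amanda writes to tape: the helper names (`rewind`, `write_blocks`, `read_blocks`, `verify`) are hypothetical, the 32768-byte block size is taken from the error message in this thread, and rewinding shells out to `mt -f <device> rewind`. Each block is written with a single `os.write` call so that a short write shows up as a byte count smaller than the block size.

```python
import hashlib
import os
import subprocess

BLOCK_SIZE = 32768  # block size from the error message in this thread


def rewind(device):
    """Rewind the tape using mt(1)."""
    subprocess.run(["mt", "-f", device, "rewind"], check=True)


def write_blocks(device, n_blocks, block_size=BLOCK_SIZE):
    """Write n_blocks random blocks, one write() each; return their SHA-256."""
    digest = hashlib.sha256()
    fd = os.open(device, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for i in range(n_blocks):
            block = os.urandom(block_size)
            written = os.write(fd, block)
            if written != block_size:
                raise OSError(
                    f"short write at block {i}: tried {block_size}, got {written}")
            digest.update(block)
    finally:
        os.close(fd)
    return digest.hexdigest()


def read_blocks(device, n_blocks, block_size=BLOCK_SIZE):
    """Read n_blocks blocks back, one read() each; return their SHA-256."""
    digest = hashlib.sha256()
    fd = os.open(device, os.O_RDONLY)
    try:
        for i in range(n_blocks):
            block = os.read(fd, block_size)
            if len(block) != block_size:
                raise OSError(
                    f"short read at block {i}: wanted {block_size}, got {len(block)}")
            digest.update(block)
    finally:
        os.close(fd)
    return digest.hexdigest()


def verify(device, n_blocks=64, block_size=BLOCK_SIZE, is_tape=True):
    """Rewind, write, rewind, read back, and compare digests."""
    if is_tape:
        rewind(device)
    expected = write_blocks(device, n_blocks, block_size)
    if is_tape:
        rewind(device)
    return expected == read_blocks(device, n_blocks, block_size)
```

On the setup in this thread it would be called as `verify("/dev/nsa0")`, which overwrites whatever is at the start of the loaded tape, so use a scratch tape. Passing `is_tape=False` with an ordinary file path exercises the same logic without a drive.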

cscottpnw commented 2 months ago

It could be FBSD, but the issue still seems specific to AMANDA. Prior to opening this Issue, I successfully wrote/retrieved via dd, dump, gnutar, and bsdtar. The LTO drive was replaced, and the replacement produces the exact same error. I briefly considered the SCSI card, but there's zero evidence it's having a problem. I've been successfully using tar as a mitigation measure since this issue began.

Nothing in the system reports problems. There's never anything relevant in dmesg or any system logs. My current hunch is that the OS is causing this. The same setup ran successfully on FBSD 12.x. This seems to have started with FBSD 14.x. I haven't received a response from the FBSD community. There don't seem to be many AMANDA users in that community. FWIW, I've been using AMANDA for over 25 years, and have never encountered a problem like this.