Piping from stdin - Githubissues

ar1a commented 6 years ago

Hi! Is it possible to pipe from stdin to create par files? (Say, backing up a drive and piping it from dd)

animetosho commented 6 years ago

In theory, assuming you're only processing one file, it's possible if recovery can be computed in one pass (i.e. all recovery data can fit within specified memory limits). If all recovery data cannot fit within memory, then multiple read passes are required, so reading data from stdin won't work.

Some alternative ideas I've thought about:

- Allow specifying a process from which stdout is read. This is slightly different from reading stdin in that ParPar can execute the process multiple times for subsequent read passes if necessary.
- Some sort of 'tee'-like functionality where the input is copied onto stdout. This is useful if you don't actually need to process from a pipe, but merely want to avoid needing another program to re-read the file off disk.

Note that there's also a minor complication of needing to know the file size ahead of time, which a pure pipe cannot give, and PAR2 also requires file names, so these would need to be specified alongside the pipe. I'm considering something similar to what I do in Nyuu for piping data in.
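
To illustrate the first idea above (re-running a producer process for each read pass), here is a minimal Python sketch. It is not ParPar code: `read_passes` and its parameters are hypothetical names, and it simply assumes the command produces identical output every time it is run.

```python
import subprocess
import sys


def read_passes(command, passes, chunk_size=64 * 1024):
    """Run `command` once per pass and yield (pass_index, chunk) from its stdout.

    Hypothetical helper: assumes the command emits the same bytes on every run,
    which is what makes a second read pass possible without a seekable input.
    """
    for pass_index in range(passes):
        with subprocess.Popen(command, stdout=subprocess.PIPE) as proc:
            while True:
                chunk = proc.stdout.read(chunk_size)
                if not chunk:  # EOF: the producer closed its stdout
                    break
                yield pass_index, chunk
        # Leaving the `with` block waits for the process, so returncode is set.
        if proc.returncode != 0:
            raise RuntimeError(f"{command!r} exited with status {proc.returncode}")


if __name__ == "__main__":
    # Stand-in producer; in practice this might be something like a dd invocation.
    producer = [sys.executable, "-c", "import sys; sys.stdout.write('abc')"]
    for index, data in read_passes(producer, passes=2):
        print(index, data)
```

A plain stdin pipe cannot be restarted this way, which is why it only works for the single-pass case.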

ar1a commented 6 years ago

By "recovery data can fit within specified memory limits", do you mean that something like -s 1M -r 64 would require 64MB, or do you mean the entire file would have to fit in RAM (in which case that wouldn't work at all for backing up a block device)?

I presume there could maybe be some special dd functionality because you can seek via blocksize/seek/count, but this is getting beyond my realm of knowledge

animetosho commented 6 years ago

Only the recovery data needs to fit in RAM, not the input data - so, in other words, your first example (64MB).
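
As a rough back-of-the-envelope check, the arithmetic can be sketched in Python. This is a simplified model, not ParPar's actual accounting: the function names are hypothetical, 1M is taken to mean 1 MiB, and any extra buffers the tool may allocate are ignored.

```python
import math


def recovery_buffer_bytes(slice_size, recovery_slices):
    """Memory needed to hold every recovery slice at once."""
    return slice_size * recovery_slices


def read_passes_needed(slice_size, recovery_slices, memory_limit):
    """Estimate read passes over the input for a given memory limit.

    Assumption: each pass computes as many whole recovery slices as fit in
    memory, so a result of 1 means the input only has to be read once.
    """
    slices_per_pass = memory_limit // slice_size
    if slices_per_pass < 1:
        raise ValueError("memory limit is smaller than a single slice")
    return math.ceil(recovery_slices / slices_per_pass)


if __name__ == "__main__":
    MiB = 1024 * 1024
    print(recovery_buffer_bytes(1 * MiB, 64) // MiB)    # size in MiB
    print(read_passes_needed(1 * MiB, 64, 64 * MiB))    # everything fits
    print(read_passes_needed(1 * MiB, 64, 32 * MiB))    # only half fits
```

Under this model, piping from stdin is only viable when the estimate comes out as a single pass.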

Devices are interesting. I don't think they report their size, so you'd still need to tell ParPar that info, but the rest seems doable, yes.

ar1a commented 6 years ago

maybe you could use sfdisk?


```
~ ❯❯❯ sudo sfdisk -d /dev/sda
label: gpt
label-id: C2F5CB07-xxx
device: /dev/sda
unit: sectors
first-lba: 2048
last-lba: 234441614

/dev/sda1 : start=        2048, size=      204800, type=C12A7328-xxx, uuid=CB3D66E5-xxx
/dev/sda2 : start=      206848, size=   234234767, type=0FC63DAF-xxx, uuid=F3F65DF1-xxx
```

animetosho commented 6 years ago

I didn't want to get too platform specific, and I don't think it works if there's no partition table. It's probably better to get the user to put $(cat /sys/block/sda/size) into the command line somewhere.
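
One detail worth keeping in mind when passing that value along: on Linux, `/sys/block/<device>/size` is a count of 512-byte sectors, not bytes. A minimal Python sketch of the conversion follows; the helper names are hypothetical, and how ParPar would accept the resulting size is not something this thread specifies.

```python
from pathlib import Path

# Linux sysfs reports block device sizes in 512-byte units,
# independent of the device's physical sector size.
SYSFS_SECTOR_BYTES = 512


def sysfs_size_to_bytes(size_text):
    """Convert the text content of a sysfs `size` file to a byte count."""
    return int(size_text.strip()) * SYSFS_SECTOR_BYTES


def block_device_bytes(device, sysfs_root="/sys/block"):
    """Read the size of e.g. 'sda' from sysfs and return it in bytes (Linux only)."""
    size_file = Path(sysfs_root) / device / "size"
    return sysfs_size_to_bytes(size_file.read_text())


if __name__ == "__main__":
    # Parsing example with a made-up sector count, so it runs anywhere.
    print(sysfs_size_to_bytes("2048\n"))
```

On a Linux machine, `block_device_bytes("sda")` would return the size of `/dev/sda` in bytes without needing a partition table.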

Thanks for the suggestion nonetheless.

Thinking about it a bit, maybe I can bypass the need for a known size if there's only one input file. The PAR2 format requires it for file ordering, which is then used in recovery computation, but if there's only one file, there's nothing to order, which means the computation could go ahead without a size... The code, as it is, does assume a known size in places, so I will have to check whether this is doable.
It will still be up to the user to select an appropriate slice size so that there are at most 32768 input slices in total.
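
For picking a slice size that respects that limit, a small Python sketch may help. It assumes a single input of known size and rounds up to a multiple of 4 bytes, as PAR2 requires of slice sizes. The function names are hypothetical and this is not how ParPar itself chooses a size.

```python
MAX_INPUT_SLICES = 32768  # limit on input slices mentioned above
SLICE_ALIGNMENT = 4       # PAR2 slice sizes must be a multiple of 4 bytes


def ceil_div(numerator, denominator):
    """Integer division rounding up, avoiding floating point."""
    return -(-numerator // denominator)


def min_slice_size(total_bytes):
    """Smallest valid slice size keeping one input within MAX_INPUT_SLICES."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    raw = ceil_div(total_bytes, MAX_INPUT_SLICES)
    return ceil_div(raw, SLICE_ALIGNMENT) * SLICE_ALIGNMENT


def input_slices(total_bytes, slice_size):
    """Number of slices a single input of total_bytes occupies."""
    return ceil_div(total_bytes, slice_size)


if __name__ == "__main__":
    size = 10**12  # a hypothetical 1 TB device
    slice_size = min_slice_size(size)
    print(slice_size, input_slices(size, slice_size))
```

Any larger multiple of 4 also satisfies the limit. A larger slice size means fewer input slices, but each recovery slice then takes more memory.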

animetosho / ParPar

Piping from stdin #13