animetosho / ParPar

High performance PAR2 create client for NodeJS
Guidance on how to pick slices and max/min inputs? #30

Closed (wodano closed this issue 3 years ago)

wodano commented 3 years ago

As a noob, I don't really know the ideal way of picking the correct number of slices. Is there a guide on how to pick? So far I've just been using random numbers with:

~/ParPar-0.3.2/bin/parpar.js -s random$M -r12% -o input ~/input.mkv

Sorry if this is really basic, but I haven't seen other people ask this question on Reddit or elsewhere.

Edit: I also meant to ask how slices should be adjusted according to the size and number of the file(s).

animetosho commented 3 years ago

Yeah, it's unfortunate that it's a required parameter that can't be auto determined, and picking an appropriate size isn't always straightforward.

At a basic level, PAR2 splits the input file into slices, which is what recovery is computed from (if there's more than one input file, each file is broken into slices). The slice size allows you to adjust how the splitting is performed - if you choose to have a lot of slices, then each slice must be smaller than if you chose to have fewer slices.
Note that I refer to the above as "input slices", as it's slicing the input data. PAR2 creation takes these input slices and generates a number of recovery slices (the slice size for input and recovery slices is identical), which are written to the output PAR2 files.
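
To put rough numbers on the relationship between slice size and slice count, here is a minimal Python sketch. It is not ParPar's code. It assumes each input file is sliced on its own, that a trailing partial slice counts as a whole slice, and that a recovery percentage such as 12% is taken relative to the input slice count and rounded up. The function names, the 1.5 GB file size, and the rounding rule are all assumptions made for illustration.

```python
def input_slice_count(file_sizes, slice_size):
    """Total input slices when each file is sliced separately.

    Assumption: a file's trailing partial slice counts as one full slice.
    """
    # Integer ceiling division avoids float rounding on large sizes.
    return sum((size + slice_size - 1) // slice_size for size in file_sizes)


def recovery_slice_count(input_slices, percent):
    """Recovery slices for an integer percentage of the input slice count.

    Assumption: rounded up; the real tool may round differently.
    """
    return (input_slices * percent + 99) // 100


if __name__ == "__main__":
    file_sizes = [1_500_000_000]  # hypothetical 1.5 GB input file
    for slice_size in (256 * 1024, 1024 * 1024, 4 * 1024 * 1024):
        n_in = input_slice_count(file_sizes, slice_size)
        n_rec = recovery_slice_count(n_in, 12)
        print(f"slice={slice_size:>8} B  input slices={n_in:>5}  "
              f"recovery slices={n_rec:>4}  recovery data={n_rec * slice_size} B")
```

Running it shows that the amount of recovery data stays roughly constant for a fixed percentage, while the number of slices (and so the granularity of repair) changes with the slice size.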

Hopefully the following points give you some idea of how to pick a size:

To use a rather bad analogy: consider some giant pizza that can be sliced into pieces, but once done, cannot be further sliced/divided (say, the knife is taken away for some reason). Now, if some bird poop lands on a slice, that slice has to be completely thrown away. Smaller slices mean less wastage when such an event occurs (though this is slightly counterbalanced by larger slices having a slightly higher probability of more than one poop landing on the same slice).
Cutting a giant pizza takes time though, particularly if you want a lot of small slices, so you may not have the patience to cut it up finely (and your guests may not like having to deal with lots of tiny slices). Also, it's rather pointless to slice it up so fine that a bird dropping affects multiple slices.
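
The wastage trade-off in the analogy can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: damage is modelled as a single contiguous byte range, padding of the final slice is ignored, and the helper names and the 100 KB damage size are hypothetical.

```python
def slices_hit(offset, length, slice_size):
    """Number of slices touched by damage covering [offset, offset + length)."""
    if length <= 0:
        return 0
    first = offset // slice_size
    last = (offset + length - 1) // slice_size
    return last - first + 1


def wasted_bytes(offset, length, slice_size):
    """Intact bytes discarded because whole slices must be thrown away."""
    return slices_hit(offset, length, slice_size) * slice_size - length


if __name__ == "__main__":
    damage_offset = 1_000_000   # hypothetical position of the damage
    damage_length = 100 * 1024  # hypothetical 100 KB of damaged data
    for slice_size in (16 * 1024, 128 * 1024, 1024 * 1024, 8 * 1024 * 1024):
        hit = slices_hit(damage_offset, damage_length, slice_size)
        waste = wasted_bytes(damage_offset, damage_length, slice_size)
        print(f"slice={slice_size:>8} B  slices lost={hit:>2}  wasted={waste} B")
```

Large slices throw away a lot of good data for one small damaged region, while very small slices lose several slices per region without saving much, which is the "bird dropping affects multiple slices" case.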

Back to reality: to sum it up, consider the following:

wodano commented 3 years ago

Thank you so much for taking the time to write all of that!

"where PAR2 is being used may play a role too."

I'm so glad I asked now; my slice sizes have been larger than 700KB, which I see is an issue. Anyway, thanks for the help, and the pizza analogy was a nice way to describe it. I think I'll be going with a middle ground of speed vs recoverability. Thanks!