jamesmeneghello / pynab

Newznab-compliant Usenet Indexer written in Python, using PostgreSQL/MySQL-like.

release size estimation is off #284

Open ghost opened 8 years ago

ghost commented 8 years ago

I don't know how pynab is estimating file size but it's off for me. I almost wonder if it's due to the fact that I'm running on an LZ4 compressed ZFS filesystem.

Consider the attached screenshot. My indexer is "A My PyNab" (I was hoping alphabetical sorting would affect search order).

[Screenshot: screen shot 2016-05-18 at 16 12 01]

For the exact same release NZBs.org says it's 1.9GB while PyNab says it's 1.6GB.

gkoh commented 8 years ago

pynab's unrar module has had trouble in the past, and initially I thought this was related to #280, so I had a look at the RAR headers for the aforementioned release. According to unrar, they look like this:

 unrar va -kb 12.Monkeys.S02E04.1080p.WEB-DL.DD5.1.H.264-VietHD.part01.rar 

UNRAR 5.30 beta 6 freeware      Copyright (c) 1993-2015 Alexander Roshal

Archive: 12.Monkeys.S02E04.1080p.WEB-DL.DD5.1.H.264-VietHD.part01.rar
Details: RAR 4, volume

 Attributes      Size    Packed Ratio    Date    Time   Checksum  Name
----------- ---------  -------- ----- ---------- -----  --------  ----
    ..A.... 1761846484  24999870  -->  2016-05-11 01:17  259FE182  12.Monkeys.S02E04.1080p.WEB-DL.DD5.1.H.264-VietHD.mkv
----------- ---------  -------- ----- ---------- -----  --------  ----
           1761846484  24999870   1%  volume 1                 

At first glance the size matches neither NZBs.org nor pynab. However, if we assume binary sizing (1 GiB = 2^30 bytes), 1761846484 / 2^30 = 1.64 GiB, which matches pynab's figure.
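As a quick check of that conversion, here is a minimal sketch. The helper names `to_gib` and `to_gb` are made up for illustration and are not part of pynab:

```python
# Unpacked size of the .mkv as reported by unrar, in bytes.
SIZE_BYTES = 1761846484


def to_gib(num_bytes):
    """Convert bytes to binary gigabytes (GiB, 2**30 bytes)."""
    return num_bytes / 2**30


def to_gb(num_bytes):
    """Convert bytes to decimal gigabytes (GB, 10**9 bytes)."""
    return num_bytes / 10**9


print(f"{to_gib(SIZE_BYTES):.2f} GiB")  # 1.64 GiB
print(f"{to_gb(SIZE_BYTES):.2f} GB")    # 1.76 GB
```

The binary figure rounds to the 1.6GB that pynab shows, while the decimal figure does not match either indexer.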

Going further, the other parts of the release include par2 data, which adds 225+ MB or so. Add that to the RAR files and you get around 1.9GB, matching NZBs.org.

So ... pynab reports the size of the uncompressed RAR content only; a look at lib/rars.py:check_release_files() seems to confirm this.

I guess the question is what is wanted?

  1. uncompressed RAR on disk file size (pynab current behaviour)
  2. actual download size (RAR + PAR2) (newznabplus behaviour)
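
To make the two options concrete, here is a rough sketch of how each size could be computed. This is not pynab's code: the function names and data layout are hypothetical, the RAR volumes are assumed to total roughly the unpacked size (as for a stored, uncompressed archive), and the par2 figure is just the "225 MB or so" estimate from above.

```python
def content_size(rar_entries):
    """Option 1: sum the unpacked sizes listed in the RAR headers."""
    return sum(rar_entries.values())


def download_size(posted_files):
    """Option 2: sum every posted file (RAR volumes plus PAR2)."""
    return sum(size for _, size in posted_files)


# Unpacked size taken from the unrar listing above.
rar_entries = {
    "12.Monkeys.S02E04.1080p.WEB-DL.DD5.1.H.264-VietHD.mkv": 1761846484,
}

# Illustrative totals only (assumptions, not measured values).
posted_files = [
    ("rar volumes, total (assumed ~= unpacked size)", 1761846484),
    ("par2 files, total (assumed ~225 MiB)", 225 * 2**20),
]

print(f"option 1: {content_size(rar_entries) / 2**30:.2f} GiB")   # 1.64 GiB
print(f"option 2: {download_size(posted_files) / 2**30:.2f} GiB")  # 1.86 GiB
```

Under these assumptions option 2 lands near the 1.9GB that NZBs.org reports, and option 1 reproduces pynab's 1.6GB.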

ghost commented 8 years ago

nn+ behavior is probably warranted even if it's technically wrong. Doesn't Couchpotato also consider release size in its weighting system? Then again, I see real value in the increased accuracy...