w3c / epub-tests

Test repository for EPUB3, maintained by the EPUB3 WG
https://w3c.github.io/epub-tests/

Test ocf-zip-comp - OCF ZIP archives should support the "store" and "deflate" methods. #281

Closed larscwallin closed 1 year ago

larscwallin commented 1 year ago

According to the spec, "store" is also allowed. https://www.w3.org/TR/epub-33/#sec-zip-container-zipreqs

As noted in the Audiobook spec, audio and video files should preferably be archived using the "store" method to facilitate streaming.

https://www.w3.org/TR/lpf/#sec-compression

iherman commented 1 year ago

I realize that, but I am not sure this "MAY" thing should be part of the basic test. What we could do is create a duplicate of this test, flag it as a "MAY" test for that feature, and do what you think is better. Do you think you can provide such an alternate test?

larscwallin commented 1 year ago

I definitely think we should update the existing test to reflect the spec. If you pass the current test, you are not EPUB compliant, as this part of the spec is a MUST. Indeed, the mimetype file in the OCF MUST use the "store" method.
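
For reference, the "store" requirement on the mimetype entry can be checked with Python's standard zipfile module. This is only a minimal sketch: the helper name mimetype_is_stored and the in-memory sample archive are made up for illustration and are not part of the test suite.

```python
import io
import zipfile


def mimetype_is_stored(epub):
    """Return True if the first ZIP entry is 'mimetype' and it is stored (uncompressed)."""
    with zipfile.ZipFile(epub) as zf:
        entries = zf.infolist()
        if not entries:
            return False
        first = entries[0]
        return first.filename == "mimetype" and first.compress_type == zipfile.ZIP_STORED


# Hypothetical sample: a tiny archive built in memory, not a real test file.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
    zf.writestr("EPUB/ch01.xhtml", "<html/>" * 50, compress_type=zipfile.ZIP_DEFLATED)

print(mimetype_is_stored(sample))
```

The function accepts either a path or a file-like object, so it could be pointed at an .epub file on disk in the same way.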

Regarding the reference to LPF and the streaming capabilities of reading systems, I do not think this needs to be tested, if that is what you were referring to with the duplicate test.

iherman commented 1 year ago

@Jeffxz I may need your help.

I wonder whether, in fact, we have made a mistake at some point with this test. I would expect the idea was to create an epub file, i.e., a zip file, that would not be properly generated. However, due to some manipulations in the past few months, we (well, probably I) just re-generated the zip files with a script that generated the "right" (zip) epub file. Indeed, at this moment the ocf-zip-comp.epub file simply loads into, say, Thorium without any further ado.

At present, the script uses the following two python lines:

os.system("cd %s; zip -X ../%s.epub mimetype" % (dname,dname))
os.system("cd %s; zip -rDX9 ../%s.epub * -x mimetype -x \*.DS_STORE" % (dname,dname))

where 'dname' is the directory name. But I am not even sure what alternative flags I would have to use to make this test correct. After all, as @larscwallin says, a 'store', i.e., no compression (which would mean replacing the '9' with '0' in the script), is still correct. Should the argument -Z bzip2 be used in the second line to force bzip2 compression? Would that make the test all right?

(Clearly, I am not really familiar with the intricacies of zip...)
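
One possible way to produce a deliberately non-conforming file without depending on how the local zip binary was built is Python's zipfile module, which can write bzip2-compressed entries (ZIP method 12) when the bz2 module is available. The sketch below is an assumption about how this could be done, not the script the test suite uses; the function name build_bzip2_epub and the throwaway demo directory are hypothetical.

```python
import os
import tempfile
import zipfile


def build_bzip2_epub(src_dir, out_path):
    """Write 'mimetype' first and stored; bzip2-compress every other file."""
    with zipfile.ZipFile(out_path, "w") as zf:
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # deterministic traversal order
            for name in sorted(files):
                full = os.path.join(root, name)
                arc = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arc == "mimetype" or name == ".DS_Store":
                    continue  # mimetype already added; skip macOS metadata
                # Only files are added, so no directory entries are written.
                zf.write(full, arc, compress_type=zipfile.ZIP_BZIP2)


# Demo on a temporary directory standing in for 'dname' (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "ocf-zip-comp")
    os.makedirs(os.path.join(src, "EPUB"))
    with open(os.path.join(src, "mimetype"), "w") as f:
        f.write("application/epub+zip")
    with open(os.path.join(src, "EPUB", "ch01.xhtml"), "w") as f:
        f.write("<html/>" * 100)
    out = os.path.join(tmp, "ocf-zip-comp.epub")
    build_bzip2_epub(src, out)
    with zipfile.ZipFile(out) as zf:
        methods = {info.filename: info.compress_type for info in zf.infolist()}

print(methods)
```

Keeping the mimetype entry first and stored means the only thing "wrong" with the resulting archive is the compression method of the remaining entries, which is presumably what the test is meant to exercise.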

Jeffxz commented 1 year ago

Hey, @iherman I only have Ubuntu but I think the zip command should be similar to OSX. I did some experiments and I think using -rDX0 in the second zip command should be good enough.

The following uses -rDX0; as you can see from the log, all the files are "stored":

$ zip -rDX0 ../accessible.epub * -x mimetype 
  adding: EPUB/ch03s05.xhtml (stored 0%)
  adding: EPUB/pr01s04.xhtml (stored 0%)
  ...

When using deflate mode, the output looks like this:

$ zip -rDX9 ../accessible.epub * -x mimetype 
updating: EPUB/ch03s05.xhtml (deflated 68%)
updating: EPUB/pr01s04.xhtml (deflated 56%)
...

iherman commented 1 year ago

@Jeffxz yes but... in my understanding, using -rDX0 means using 'store' without any compression. As @larscwallin noted, per spec, that is also allowed for EPUB, and your test's description says that the readers should fail or raise an error message.

Jeffxz commented 1 year ago

ohhhh, I see. So it's not about the store or the deflate but the test case itself. I used bzip2 to create the epub file: https://github.com/w3c/epub-tests/commit/24ab2a5171951d0334c2b085beb1f40eb2309e01 According to Wikipedia, bzip2 is a different compression algorithm from Deflate: https://en.wikipedia.org/wiki/ZIP_(file_format)#:~:text=The%20.ZIP%20File%20Format%20Specification%20documents%20the%20following%20compression%20methods,IBM%20z%2FOS%20CMPSC%20instruction.
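
To confirm which compression method each entry in a given .epub actually uses, something along these lines might help. It is a rough sketch using only Python's standard zipfile module; the name disallowed_entries and the in-memory sample are illustrative assumptions, not existing tooling in this repository.

```python
import io
import zipfile

# Methods permitted for OCF ZIP entries: store (0) and deflate (8).
ALLOWED = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}


def disallowed_entries(epub):
    """Return {filename: method id} for entries that are neither stored nor deflated."""
    with zipfile.ZipFile(epub) as zf:
        return {
            info.filename: info.compress_type
            for info in zf.infolist()
            if info.compress_type not in ALLOWED
        }


# Hypothetical sample: one bzip2 entry (method 12) next to a stored mimetype.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
    zf.writestr("EPUB/ch01.xhtml", "<html/>" * 50, compress_type=zipfile.ZIP_BZIP2)

print(disallowed_entries(sample))
```

An empty result would mean the file uses only store and deflate, i.e. it was regenerated as a "right" archive and no longer exercises the test.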

iherman commented 1 year ago

Ah. So indeed, I made a mistake at some point, by regenerating the wrong file.

Unfortunately, https://github.com/w3c/epub-tests/commit/24ab2a5171951d0334c2b085beb1f40eb2309e01 does not include the (erroneous) epub file itself. Just to be on the safe side, could you simply send it to me by email, so that I could refresh the test suite?

Jeffxz commented 1 year ago

oops, sorry about that. I will re-generate it and send it over shortly.