Nandaka / PixivUtil2

Download images from Pixiv and more!
http://nandaka.devnull.zone/
BSD 2-Clause "Simplified" License

Discussion: Ugoira animations and metadata #69

Closed by reyaz006 9 years ago

reyaz006 commented 9 years ago

Recently, an image viewer for Windows named HoneyView received support for viewing the Ugoira animated format. It was added per a feature request and is not yet present in the latest final build. Details here.

Right now, actual requirements in HoneyView for the file to be processed as Ugoira animation are:

Metadata is currently located in plain text on the image page on Pixiv, and that likely won't change any time soon. There won't be any official way to save Ugoira either as a zip file or as an animation, and metadata will stay in plain HTML text. So, if there is enough demand from several 3rd party tools like downloaders and viewers, maybe we can come to an agreement on an unofficial way of keeping Ugoira animations as files.

Converting Ugoira to anything else is quite a problem (additional quality loss, playback difficulties). I'd rather see it as a proper animation format - the "frames-in-zip" idea already feels quite simple.

I'd like to ask everyone interested to discuss this and possibly formulate the requirements for future use of these animations as files, hopefully in more tools.

There are several issues for now:

  1. animation.json file requirement originates from here - this is a collection of bookmarklets that allow saving Ugoira as files in various formats. The first DL_zip bookmarklet links to the javascript that contains the code for converting metadata into the file and putting it into the original .zip:

    zip.file("animation.json", JSON.stringify(pixiv.context));

    This is by no means an official "way" of keeping the metadata (the said javascript is not made by Pixiv). The said javascript can be changed or wiped in future, so it may not produce animation.json anymore.

  2. Including the metadata as a file inside the .zip package invalidates the file: its size no longer matches the copy on the server. AFAIK, PixivUtil2 can compare the local file size with the server when the 'alwayscheckfilesize' option is enabled. With files modified in this way, that check no longer makes sense.
  3. Files that are already downloaded will need to be either re-downloaded (in case PixivUtil2 learns to include metadata inside the .zip) or processed with some script (to insert .js into .zip and rename .zip to .ugoira for each file).
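
To make the first two issues concrete, here is a minimal Python sketch of the same packaging step as the bookmarklet's zip.file(...) call, done on a copy so that the downloaded .zip keeps its original size. The function name make_ugoira and the way the metadata is passed in are assumptions for illustration only; this is not how PixivUtil2 or HoneyView is known to do it.

```python
import json
import os
import shutil
import zipfile


def make_ugoira(zip_path, metadata):
    """Copy zip_path to a sibling .ugoira file and add animation.json to the copy.

    `metadata` is whatever dict was scraped from the image page (hypothetical input).
    The original .zip is left untouched, so a size check against the server still works.
    """
    ugoira_path = os.path.splitext(zip_path)[0] + ".ugoira"
    shutil.copyfile(zip_path, ugoira_path)
    # Append without compression, matching the "frames-in-zip" idea of stored entries.
    with zipfile.ZipFile(ugoira_path, "a", zipfile.ZIP_STORED) as archive:
        archive.writestr("animation.json", json.dumps(metadata))
    return ugoira_path
```

The cost of this approach is disk space: the frames exist twice until the .zip is removed.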

I think there are several options here that can be applied individually or together, to improve the situation:

EDIT: There may be another option (although highly unlikely):

reyaz006 commented 9 years ago

Ugoira support is now included in the latest HoneyView build, 5.11 - changelog.

Nandaka commented 9 years ago

I think I can add an extra field in the animation.json for the original zip size.

Nandaka commented 9 years ago

try http://www.mediafire.com/download/x593yibkh74mu53/pixivutil20150315-beta1.7z

reyaz006 commented 9 years ago

Thanks!

I've set this before running:

writeugoirainfo = False
createugoira = True

So far I can see that it now creates .zip and .ugoira. The .ugoira itself works fine.

As you already mentioned on the blog, modifying the original .zip is not a very good way, and I agree. But for it to be a proper image format, keeping all data inside 1 file would be a valid requirement.

What I'd like to know:

  1. What happens when the original file gets modified and the user has 'alwayscheckfilesize' enabled? Do both .zip and .ugoira get overwritten?
  2. Is the .zip kept only for the 'alwayscheckfilesize'=true case? If so, wouldn't it be better not to keep it when 'createugoira'=true? If only the file size gets checked, maybe it's better to keep that elsewhere? Right now both files are mostly dupes, and each Ugoira may be 1~30 MB. Here are some examples I can think of:
    • keeping it inside the database may be bad, so keep it in a separate file near the .ugoira, e.g. in a .txt.
    • since we already modified the original .zip, we might as well add that info inside the .ugoira as a .txt file or just as a file name (e.g. an empty file named 'zipsize=1234567'). Then check the size by reading the .txt contents or the name.
    • there are methods of checking the contents of a .zip archive without downloading the actual file. After we get only its header, we can see the sizes of all files inside. It may take more time to fetch that for each .zip though - each header may be 1-10 KB.
    • finally, there are modification dates on the server that can be fetched in the same way as file sizes. I'm sure you are familiar with that and with all the difficulties it involves - it may be a bad option.
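
As a rough illustration of the third idea, the sketch below parses the central directory from the last bytes of a .zip and lists the uncompressed size of each entry. It assumes those tail bytes were already fetched somehow (for example with an HTTP Range request, which is not shown and is not confirmed to work against Pixiv's servers). It also assumes a plain archive without ZIP64 and without an archive comment. The function name list_sizes_from_tail is hypothetical.

```python
import struct

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory record signature
CDH_SIG = b"PK\x01\x02"   # central-directory file header signature
EOCD_FMT = "<4s4H2IH"      # fixed 22-byte part of the end record
CDH_FMT = "<4s6H3I5H2I"    # fixed 46-byte part of each directory header
CDH_LEN = struct.calcsize(CDH_FMT)


def list_sizes_from_tail(tail):
    """Return {entry name: uncompressed size} parsed from the tail bytes of a zip."""
    eocd_pos = tail.rfind(EOCD_SIG)
    if eocd_pos < 0:
        raise ValueError("end-of-central-directory record not found in tail")
    fields = struct.unpack_from(EOCD_FMT, tail, eocd_pos)
    total_entries, cd_size = fields[4], fields[5]
    # Assumption: the central directory sits directly before the end record.
    pos = eocd_pos - cd_size
    if pos < 0:
        raise ValueError("tail is too short to contain the central directory")
    sizes = {}
    for _ in range(total_entries):
        header = struct.unpack_from(CDH_FMT, tail, pos)
        if header[0] != CDH_SIG:
            raise ValueError("unexpected data in central directory")
        uncompressed_size = header[9]
        name_len, extra_len, comment_len = header[10], header[11], header[12]
        name = tail[pos + CDH_LEN:pos + CDH_LEN + name_len].decode("utf-8", "replace")
        sizes[name] = uncompressed_size
        pos += CDH_LEN + name_len + extra_len + comment_len
    return sizes
```

Whether the extra request per file is worth it compared with a single size check is exactly the trade-off raised in that bullet.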

Previously I thought that additionally allowing the user to change the default "animation.json" string via config might be a good idea. But not anymore: keeping it fixed and simple might be better for format adoption. What do you think?

Nandaka commented 9 years ago

  1. Yap.
  2. Kinda. There is no checking logic for .ugoira files yet, but I have added an additional field in animation.json called zipSize, which stores the zip file size. Refer to https://github.com/Nandaka/PixivUtil2/blob/master/PixivModel.py#L547

Still thinking how to implement the checking :P

reyaz006 commented 9 years ago

Well, since all files inside .ugoira are not compressed anyway, you may be able to read zipSize straight from the file.

Or are there difficulties with logic when user already has some ugoira downloaded as .zip?
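
For reference, reading zipSize back out of a .ugoira could look roughly like the sketch below. It assumes animation.json sits at the top level of the archive and that zipSize is a top-level key of its JSON object; the helper name read_zip_size is made up for this example and is not taken from PixivUtil2.

```python
import json
import zipfile


def read_zip_size(ugoira_path):
    """Return the zipSize recorded in animation.json, or None if it is missing."""
    with zipfile.ZipFile(ugoira_path) as archive:
        try:
            raw = archive.read("animation.json")
        except KeyError:
            # The archive has no animation.json entry at all.
            return None
    return json.loads(raw.decode("utf-8")).get("zipSize")
```

A None result would cover plain .zip files that were downloaded before the metadata was added.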

reyaz006 commented 9 years ago

Here is my batch script for converting existing pairs of .zip.js and .zip into .ugoira. It should do the same job as pixivutil20150315-beta1. The logic is less complex, but it should produce exactly the same result. If the functionality changes in later versions, I'll try to update it.

update 15.03.23: corrected error in get path

@echo off
::prepare vars
set jsname=%1
set jsname=%jsname:"=%
set zipname=%jsname:.zip.js=.zip%
set ugoiraname=%zipname:.zip=.ugoira%
::make sure to only accept .zip.js
echo %1 | findstr /C:".zip.js">nul && (goto :process) || (echo Not .zip.js! & pause & exit)
:process
::get path
for /f "delims=" %%F in (%1) do (set filepath=%%~dpF)
for /f %%F in (%1) do (set rawpath=%%~dpF)
::get filesize of .zip
for %%A in ("%zipname%") do set size=%%~zA
::change work directory to our base folder
cd /d "%~dp0"
::display all collected info
echo path:      %filepath%
echo raw path:  %rawpath%
echo .js path:  %jsname%
echo .zip path: %zipname%
echo .ugoira:   %ugoiraname%
echo zipSize:   %size%
::check if Windows can process the path properly (e.g. there are problems with U+3000)
if not "%filepath%" == "%rawpath%" (echo Path contains illegal characters! & pause & exit)
::remove this pause for silent processing
pause
::prepare animation.json
copy /Y "%jsname%" animation.json
::put zipSize info inside animation.json
fart animation.json ]} ],\"zipSize\":%size%}
::add animation.json into .zip
7za a -mx0 -tzip -mtc=off "%zipname%" animation.json
::rename .zip.js to .zip.js_processed
ren "%jsname%" *.js_processed
::rename .zip to .ugoira
ren "%zipname%" *.ugoira
::clean up
del animation.json

Usage:

Nandaka commented 9 years ago

Updated with http://www.mediafire.com/download/uy8ovvcp1luru51/pixivutil20150321-beta2.7z

reyaz006 commented 9 years ago

Before testing beta2 with alwayscheckfilesize=true, should I convert all my Ugoira .zip+js to .ugoira?

Nandaka commented 9 years ago

I think you need to.

It checks whether the local zip file exists and has the same size as the file on the server; if so, it skips the download and the ugoira creation.

see https://github.com/Nandaka/PixivUtil2/commit/a9b2a086cd7b7feacb747dcbc5cb0eeae0899e40#diff-7a612caa8f02f7c30e49d68a8c1ab43fR644

reyaz006 commented 9 years ago

Tested it, found a few issues:

  1. It still leaves .zip files next to the .ugoira. So I was deleting those by hand before testing with alwayscheckfilesize=true.
  2. When I already have the .ugoira downloaded and there is no corresponding .zip file: if the local zipSize saved inside the json is increased a bit, upon checking it says "Local is larger" and does nothing. Not sure if it's the same for usual images. I find it wrong not to get the new file if it's smaller than the previous version.

Other than that, it works well for me.
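
The behaviour observed so far can be summed up as a three-way comparison, sketched below. The function name and the returned labels are invented for illustration. "skip" for equal sizes and "keep-local" for a larger local size follow what was described and observed in this thread, while "download" for a missing or smaller local file is an assumption, not something confirmed from the PixivUtil2 source.

```python
def decide_action(local_size, remote_size):
    """Map a local/remote size pair to an action label (labels are hypothetical)."""
    if local_size is None:
        # Nothing stored locally (assumed case): fetch the file.
        return "download"
    if local_size == remote_size:
        # Same size as on the server: skip download and ugoira creation.
        return "skip"
    if local_size > remote_size:
        # Matches the "Local is larger" message: the local file is kept.
        return "keep-local"
    # Local file is smaller than the server copy (assumed: fetch again).
    return "download"
```

Here local_size would be either the size of the .zip on disk or the zipSize stored inside the .ugoira.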

Nandaka commented 9 years ago

  1. I haven't added an option to delete the original zip file :P
  2. It is by design. I think someone requested to keep the old file if the local file is larger.

reyaz006 commented 9 years ago

  1. I guess you will add it anyway, since filesize checking works properly with just .ugoira files.
  2. Thinking of myself many years ago, I too could have requested such a thing. Made this https://github.com/Nandaka/PixivUtil2/issues/71

reyaz006 commented 9 years ago

There is also a minor problem upon exiting now:

Traceback (most recent call last):
  File "PixivUtil2.py", line 1756, in <module>
  File "PixivUtil2.py", line 1752, in main
NameError: global name 'exit' is not defined

Nandaka commented 9 years ago

crap forgot to add os. :smile: