weizhenye / ass-compiler

Parses and compiles ASS subtitle format to easy-to-use data structure
https://ass.js.org/ass-compiler/
MIT License
107 stars 18 forks source link

Some additional parsing errors for these `ass` subtitles #10

Closed Deathspike closed 1 year ago

Deathspike commented 1 year ago

Thank your the fix in #9! I have identified three more with problematic subtitles on 0.1.5. These errors are in the online viewer:

TypeError: Cannot read properties of undefined (reading 'parsed')
    at compileDialogues (ass-compiler.js:813:26)
    at Object.compile (ass-compiler.js:972:18)
    at run ((index):132:32)
    at codemirror.js:2076:45
    at codemirror.js:2033:24
    at codemirror.js:2047:11
    at jn (codemirror.js:3787:15)
    at codemirror.js:3924:17
    at a (codemirror.js:6495:11)
    at n.onload (codemirror.js:6521:11)

And:

TypeError: Cannot read properties of null (reading '1')
    at parseTag (ass-compiler.js:79:22)
    at Array.map (<anonymous>)
    at parseTags (ass-compiler.js:174:17)
    at parseText (ass-compiler.js:184:18)
    at parseDialogue (ass-compiler.js:231:22)
    at parse (ass-compiler.js:313:49)
    at Object.compile (ass-compiler.js:960:16)
    at run ((index):132:32)
    at codemirror.js:2076:45
    at codemirror.js:2033:24

The problematic subs are from 3 different series/groups: subs.zip

Thank you for all your amazing work on this library 👍

weizhenye commented 1 year ago

In Bleach - S01E12.eng.ass, the format line of dialogues is

Format: Layer, Start, End, Style, Text

where the standard line is

Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text

This file cannot be opened by Aegisub:

image

Previous I aligned ass-compiler with Aegisub (ce31643398dce93a80af5afb983725516f2871d4), but it might be valid in ASS spec.

Do you know how this file is generated and in which video player it can be rendered?

Deathspike commented 1 year ago

Unfortunately, I don't know which software was used to edit the Bleach subtitle. It was released by AnimeRG release group.

I've tried to play these in mpv, Plex (which is mpv-based), and vlc. All of them work well.

I don't think fields are required according to the ass-spec. If I look at http://www.tcax.org/docs/ass-specs.htm, I don't see any specifications about requirements on the dialogue Format line. But I'm not too sure about that. All I know is that most video players seem to have no issues with the omission of some of those fields.

weizhenye commented 1 year ago

fixed in v0.1.6

Deathspike commented 1 year ago

Thanks once again! The changes in 0.1.6 parse now, but the result of a compile is unfortunately not 100% correct.

Using the earlier Bleach subtitles, the resulting style lines will be:

Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,,,,,,1,1,0,2,10,10,14.4,0

The values of StrikeOut, ScaleX, ScaleY, Spacing, and Angle are empty because they didn't exist in the original file. I think that empty is interpreted as 0 by asslib. For most styles that will work fine, but for ScaleX and ScaleY the value 0 means the subtitle will be scaled to 0%, and is therefore not visible. The above styles will not show anything in any of the aforementioned video players. By manually setting ScaleX and ScaleY to 100, everything works as expected. The result should therefore be:

Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,,100,100,,,1,1,0,2,10,10,14.4,0

Is this something you want to solve in ass-compiler, or is fixing this the responsibility of the application consuming the library?

weizhenye commented 1 year ago

What's asslib? How it uses the output of ass-compiler?

There are parse and compile modes in ass-compiler, in my option, parse should keep the original data from the ass file, while compile can set default values and do some transform as long as the decompiled ass file renders same.

Now the compile result lacks some Format fields, I'll fix it later, but I'd like to keep the current parse result.

Deathspike commented 1 year ago

Sorry, I meant libass! It reads an empty field in ScaleX/ScaleY as 0%, so nothing will be rendered.

Omitting the Format field entirely when there is no value would work, but that could make the result incompatible with Aegisub, which expects all of the fields...

If I look at the source code, I see that there is a set of default values that libass uses for styles. It can be found here https://github.com/libass/libass/blob/c0a158e00bd6c4386e943876d7eee5d767e9a498/libass/ass.c#L211. ScaleX and ScaleY are divided by 100 here https://github.com/libass/libass/blob/c0a158e00bd6c4386e943876d7eee5d767e9a498/libass/ass.c#L711, so their default values need to be 100.

Can ass-compiler set the same default values if a field is empty? I would prefer that this happens during parse. Even though this might be an incorrect representation of the original file, this will only happen when the field is omitted in the source file. The library would effectively just set default values if the original file does not have those fields, otherwise it would respect the source file. Since compile emits all fields regardless of how many there are in the input, this makes everything consistent and logical.

It would be a bit weird to parse an ass file without a FontSize, for example, and having to think about that when you're writing code on top of the ass-compile results. But when you compile it, it does get a default value. That would be really weird from an end-user perspective.

weizhenye commented 1 year ago

Sorry, I misunderstand your previous comment.

In parse result, parsedJSON.styles.style[].ScaleX should not exist, and in stringify result, the [V4+ Styles] section should be

Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,1,1,0,2,10,10,10,0,0

In compile result, compiledJSON.styles.Default.style.ScaleX should be 100, and in decompile result, the [V4+ Styles] section should be

Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,16,&H00ffffff,&H00ffffff,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,1,0,2,10,10,10,0
Deathspike commented 1 year ago

Yes, that should work, then. If I run this snippet using 0.1.6:

const sub = ass.parse(`
[Script Info]
ScriptType: v4.00+
PlayResX: 384
PlayResY: 288

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,1,1,0,2,10,10,10,0,0
`);
const res = ass.stringify(sub);
console.log(res);

Then the result is:

[Script Info]
ScriptType: v4.00+
PlayResX: 384
PlayResY: 288

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,,,,,,1,1,0,2,10,10,10,0

It does contain an empty ScaleX and ScaleY. This will cause the issue I'm having where subtitles won't appear anymore. Either the Format should not contain those fields, or a default value of 100 should be set. I hope this clears up the confusion of my previous comment

weizhenye commented 1 year ago

fixed in v0.1.7