DrKain / subclean

A cross-platform CLI tool and node module to remove advertising from subtitles. Supports Bazarr and bulk cleaning!
MIT License

Error: expected timestamp at row X #7

Closed Obscurax closed 2 years ago

Obscurax commented 2 years ago

Hi,

When I run subclean I experience the following error. Any hints how to solve this?

c:\ProgramData\subclean>subclean.exe subtitle.srt -o subtitle.test.srt
D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:286
      throw this.getError('timestamp', this.state.row, line);
      ^

"rror: expected timestamp at row 2, but received: "00:00:40,908 --> 00:00:45,908
    at Parser.getError (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:222:12)
    at Parser.parseTimestamp (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:286:18)
    at Parser.parseLine (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:242:11)
    at D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:91:19
    at Array.forEach (<anonymous>)
    at Object.parseSync (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:90:44)
    at Subclean.clean (D:\snapshot\subclean\lib\index.js:184:32)
    at Subclean.init (D:\snapshot\subclean\lib\index.js:39:14)
    at Object.<anonymous> (D:\snapshot\subclean\lib\index.js:235:16)
    at Module._compile (pkg/prelude/bootstrap.js:1433:22)
DrKain commented 2 years ago

This looks to be an issue with the subtitle parser.
Can you please provide a copy of the subtitle file so I can test locally?

Obscurax commented 2 years ago

subtitle.zip

I've attached the srt, in a zip file. GitHub doesn't allow srt files.

Thanks for the ultra fast response.

DrKain commented 2 years ago

Thank you for the sample.
The issue was that the file you sent contained a bizarre number of \r characters, which broke the parser.
This is fixed in 1.2.7: all occurrences of \r are now stripped before parsing.

https://github.com/DrKain/subclean/releases/tag/v1.2.7
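For illustration, the kind of pre-processing described above can be sketched as follows. This is a minimal sketch, not subclean's actual code: the function name `stripCarriageReturns` is hypothetical, and it assumes the file has already been read into a string.

```javascript
// Hypothetical helper (not taken from subclean's source):
// remove every carriage return so that lines ending in "\r\n",
// "\r\r\n", or containing stray "\r" all end in a plain "\n".
function stripCarriageReturns(text) {
  return text.replace(/\r/g, '');
}

// Example input resembling a subtitle block with doubled carriage returns.
const raw = '1\r\r\n00:00:40,908 --> 00:00:45,908\r\r\nHello\r\n';
console.log(JSON.stringify(stripCarriageReturns(raw)));
// prints "1\n00:00:40,908 --> 00:00:45,908\nHello\n"
```

Note that stripping every \r would merge lines in a file that uses lone \r as its only line ending, so this sketch assumes \n is always present.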


Obscurax commented 2 years ago

I unleashed your script on my library after the fix and all the movies went fine (around 1500) but I caught another timestamp error on a TV show subtitle. I've attached the srt.

Awesome program/script btw! Love it.

subtitle.zip

D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:286
      throw this.getError('timestamp', this.state.row, line);
      ^

Error: expected timestamp at row 1, but received: "��1 "
    at Parser.getError (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:222:12)
    at Parser.parseTimestamp (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:286:18)
    at Parser.parseId (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:280:12)
    at Parser.parseHeader (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:259:14)
    at Parser.parseLine (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:242:11)
    at D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:91:19
    at Array.forEach (<anonymous>)
    at Object.parseSync (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:90:44)
    at Subclean.clean (D:\snapshot\subclean\lib\index.js:188:32)
    at Subclean.init (D:\snapshot\subclean\lib\index.js:39:14)
(node:5180) UnhandledPromiseRejectionWarning: Error: Command failed: subclean "S:\tvshow\Season x\subtitle.srt" -w
D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:286
      throw this.getError('timestamp', this.state.row, line);
      ^

Error: expected timestamp at row 1, but received: "��1 "
    at Parser.getError (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:222:12)
    at Parser.parseTimestamp (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:286:18)
    at Parser.parseId (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:280:12)
    at Parser.parseHeader (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:259:14)
    at Parser.parseLine (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:242:11)
    at D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:91:19
    at Array.forEach (<anonymous>)
    at Object.parseSync (D:\snapshot\subclean\node_modules\subtitle\dist\subtitle.cjs.development.js:90:44)
    at Subclean.clean (D:\snapshot\subclean\lib\index.js:188:32)
    at Subclean.init (D:\snapshot\subclean\lib\index.js:39:14)

    at checkExecSyncError (child_process.js:790:11)
    at execSync (child_process.js:863:15)
    at c:\ProgramData\subclean\subclean.js:23:21
(Use `node --trace-warnings ...` to show where the warning was created)
(node:5180) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:5180) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
DrKain commented 2 years ago

This is a similar issue, though a bit more extreme. I'll take a look and see if I can do something about it.
The Unicode characters at the start of the subtitle file might make this tricky. There's a limit to how much I can modify the file before I risk breaking legit subtitles.

DrKain commented 2 years ago

After a frustrating amount of research and testing, I've decided the character encoding in your last sample will not be supported at this time.

A simple solution would be to cut out all instances of \x00, but this would break Unicode characters (like the music notes) and any non-English characters. The easiest solution would be to download another subtitle release.

Sorry for the inconvenience, but adding character encoding detection and conversion without conflicts would take more effort than I'm willing to commit to this project right now.
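To illustrate why cutting out \x00 is lossy, here is a small sketch. It assumes the sample is UTF-16LE encoded, which the `��` prefix and the \x00 bytes suggest but the thread does not confirm. The helper name `stripNulBytes` is hypothetical.

```javascript
// Hypothetical helper: drop every 0x00 byte from a buffer.
function stripNulBytes(buf) {
  return Buffer.from(Array.from(buf).filter((byte) => byte !== 0));
}

// "1♪" in UTF-16LE is the bytes 31 00 6a 26.
// The ASCII digit carries a 0x00 padding byte, the music note does not.
const utf16 = Buffer.from('1♪', 'utf16le');

// After stripping, the bytes are 31 6a 26, which read as UTF-8 give "1j&":
// the digit survives, but the music note turns into two unrelated characters.
console.log(stripNulBytes(utf16).toString('utf8'));
// prints 1j&
```

Plain ASCII text comes through intact with this approach, which is why it looks tempting, but any character outside that range is silently corrupted.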

DrKain commented 2 years ago

As a final note, you can convert the files manually using Notepad++ if you wish.
Simply open the file, click "Encoding > Convert to UTF-8", then save. Subclean should work with the converted file.
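For a large library, the same conversion could be scripted instead of done by hand. The following is a minimal sketch under stated assumptions, not a feature of subclean: it assumes the affected files start with a byte order mark (BOM), and the name `decodeWithBom` is hypothetical. Files without a BOM are treated as UTF-8, which may be wrong for other legacy encodings.

```javascript
// Hypothetical helper: decode a raw file buffer to a string using its BOM.
function decodeWithBom(buf) {
  // UTF-8 BOM: EF BB BF
  if (buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf) {
    return buf.subarray(3).toString('utf8');
  }
  // UTF-16 little-endian BOM: FF FE
  if (buf.length >= 2 && buf[0] === 0xff && buf[1] === 0xfe) {
    return buf.subarray(2).toString('utf16le');
  }
  // UTF-16 big-endian BOM: FE FF
  if (buf.length >= 2 && buf[0] === 0xfe && buf[1] === 0xff) {
    // Buffer has no big-endian UTF-16 decoder, so copy the body,
    // drop a trailing odd byte if present, and swap each byte pair.
    const body = Buffer.from(buf.subarray(2));
    const evenLength = body.length - (body.length % 2);
    return body.subarray(0, evenLength).swap16().toString('utf16le');
  }
  // No BOM: assume UTF-8.
  return buf.toString('utf8');
}

// Usage (file names are placeholders):
// const fs = require('fs');
// const text = decodeWithBom(fs.readFileSync('subtitle.srt'));
// fs.writeFileSync('subtitle.utf8.srt', text, 'utf8');
```

Writing to a new file rather than overwriting the original keeps the source intact in case the encoding guess is wrong.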
