kaegi / alass

"Automatic Language-Agnostic Subtitle Synchronization"
GNU General Public License v3.0

Error while decoding subtitle #31

Closed Ruke805 closed 3 years ago

Ruke805 commented 3 years ago

First of all, good job on this tool.

I tried it with a subtitle file and got this error:

error: parsing subtitle file 'F:\Ani\subtitle.srt' failed
caused by: error while decoding subtitle from bytes to string (wrong charset encoding?)

note: run with environment variable 'RUST_BACKTRACE=1' for detailed stack traces

What's going wrong?

kaegi commented 3 years ago

Text files (and any special characters in them) are usually encoded as UTF-8, but some files use other encodings that the default UTF-8 decoder cannot decode. The latest commit in this project, https://github.com/kaegi/alass/commit/874f02d9577182752a0f969b6d6b98fd65bdf1fc, implements auto-detection of the character encoding.

The most commonly used incompatible encoding is iso-8859-1. You can specify the file encoding for the "incorrect file" with --encoding-inc=iso-8859-1 or for the "reference file" with --encoding-ref=iso-8859-1.
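To illustrate why the default decoder fails and why naming the encoding explicitly fixes it, here is a minimal Python sketch. It is not alass's actual code (alass is written in Rust); the sample text and the `decode_subtitle` helper are hypothetical and only mimic the behaviour with Python's built-in codecs.

```python
# Illustrative only: reproduces the decoding failure with Python's built-in
# codecs. This is NOT how alass itself is implemented.

# Hypothetical subtitle content containing non-ASCII characters.
sample = "1\n00:00:01,000 --> 00:00:02,000\nCafé déjà vu\n"

# Pretend the file was saved by an editor that uses ISO-8859-1.
raw = sample.encode("iso-8859-1")


def decode_subtitle(data: bytes, encoding: str = "utf-8") -> str:
    """Decode raw subtitle bytes; raises UnicodeDecodeError on a mismatch."""
    return data.decode(encoding)


try:
    decode_subtitle(raw)  # default: UTF-8
    utf8_ok = True
except UnicodeDecodeError:
    # 0xE9 ('é' in ISO-8859-1) starts a multi-byte UTF-8 sequence,
    # but no valid continuation byte follows it.
    utf8_ok = False

# Naming the real encoding (the analogue of --encoding-inc) succeeds.
text = decode_subtitle(raw, encoding="iso-8859-1")
```

Here `utf8_ok` ends up `False`, while `text` is identical to the original `sample`.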

kaegi commented 3 years ago

On Linux/Unix machines, the tool chardetect can guess the encoding of a file (the latest commit does this guessing automatically).
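As a rough idea of what "guessing" can mean, the sketch below simply tries a list of candidate encodings in order and keeps the first one that decodes without errors. This is a deliberately naive stand-in: it is neither the statistical detection chardetect performs nor the method used in the alass commit, and the `guess_and_decode` helper and its candidate list are assumptions made for illustration.

```python
def guess_and_decode(data: bytes, candidates=("utf-8", "iso-8859-1")):
    """Return (encoding, text) for the first candidate that decodes cleanly.

    Naive try-in-order fallback, not real charset detection.
    """
    for encoding in candidates:
        try:
            return encoding, data.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
    raise ValueError("none of the candidate encodings matched")


# The same hypothetical text, stored in two different encodings.
utf8_bytes = "Größe".encode("utf-8")
latin1_bytes = "Größe".encode("iso-8859-1")

enc_a, text_a = guess_and_decode(utf8_bytes)    # UTF-8 is tried first
enc_b, text_b = guess_and_decode(latin1_bytes)  # UTF-8 fails, falls back
```

Because every byte value is valid in ISO-8859-1, that candidate never fails and has to come last; it acts as a catch-all and may produce wrong characters for files that actually use yet another encoding.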

Related to https://github.com/kaegi/alass/issues/25.

Ruke805 commented 3 years ago

Nice, it worked well after adding this flag.