VirusTotal / yara-x

A rewrite of YARA in Rust.
https://virustotal.github.io/yara-x/
BSD 3-Clause "New" or "Revised" License

Add a command-line option for automatic encoding conversion #109

Open plusvic opened 1 month ago

plusvic commented 1 month ago

When a source file contains bytes that are not valid UTF-8, YARA-X fails with an error like this:

error: invalid UTF-8
 --> test.yar:3:19
  |
3 |     author = "John Smith � "
  |                          ^ invalid UTF-8 character
  |
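
For illustration, the position reported in such an error can be derived from the standard library's UTF-8 validation alone. The following is a minimal sketch, not YARA-X's actual implementation; the function name `first_invalid_utf8` is hypothetical.

```rust
/// Hypothetical helper: returns the 1-based (line, column) of the first
/// byte that is not part of a valid UTF-8 sequence, or `None` if the
/// whole input is valid UTF-8.
fn first_invalid_utf8(src: &[u8]) -> Option<(usize, usize)> {
    // `from_utf8` fails at the first invalid sequence and reports how many
    // leading bytes were valid.
    let err = std::str::from_utf8(src).err()?;
    let offset = err.valid_up_to();

    // Everything before `offset` is valid UTF-8, so this cannot fail.
    let valid = std::str::from_utf8(&src[..offset]).unwrap();

    // Line: number of newlines seen so far, plus one.
    let line = valid.matches('\n').count() + 1;
    // Column: characters since the last newline, plus one.
    let column = valid.rsplit('\n').next().unwrap_or("").chars().count() + 1;

    Some((line, column))
}
```

Calling `first_invalid_utf8` on the raw bytes of a rule file yields the location a diagnostic could point at. Columns are counted in characters here; a real implementation might count bytes or display width instead.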

Using the chardetng and encoding_rs crates, YARA-X could automatically detect the encoding of the source file and convert its contents to UTF-8 before passing the source code to the parser.

This automatic encoding conversion would be performed only when the --force-utf-8 option is passed to the CLI.
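
The opt-in behavior could look roughly like the sketch below. It uses only the standard library so that it stays self-contained: `decode_latin1` is a hypothetical stand-in that always assumes ISO-8859-1, where the proposal would instead detect the encoding with chardetng and decode with encoding_rs. The names `load_source` and `force_utf8` are also assumptions, as is the choice to convert only when the input is not already valid UTF-8.

```rust
use std::borrow::Cow;
use std::str::Utf8Error;

/// Hypothetical stand-in for the detect-and-decode step. It treats every
/// byte as ISO-8859-1 (Latin-1), whose byte values map one-to-one onto
/// U+0000..=U+00FF. The proposal would use chardetng and encoding_rs here.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Returns the source as UTF-8 text.
///
/// Valid UTF-8 is borrowed unchanged. Invalid input is converted only when
/// `force_utf8` is set (mirroring `--force-utf-8`); otherwise the original
/// validation error is returned, as happens today.
fn load_source(bytes: &[u8], force_utf8: bool) -> Result<Cow<'_, str>, Utf8Error> {
    match std::str::from_utf8(bytes) {
        Ok(text) => Ok(Cow::Borrowed(text)),
        Err(_) if force_utf8 => Ok(Cow::Owned(decode_latin1(bytes))),
        Err(err) => Err(err),
    }
}
```

Without the flag, `load_source` keeps the current failure mode, so existing users see no change in behavior. With the flag, the parser always receives valid UTF-8.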