Qovery / Replibyte

Seed your development database with real data ⚡️
https://www.replibyte.com
GNU General Public License v3.0
4.17k stars, 129 forks

thread 'main' panicked at 'byte index 3 is not a char boundary; it is inside '\u{a0}' (bytes 2..4) of `*****`', replibyte/src/transformer/redacted.rs:87:30 #275

Closed: ohmer closed this issue 1 year ago

ohmer commented 1 year ago

Hi there,

I encountered the issue while trying to dump a MySQL database. Is this a known bug? Am I doing something wrong?

$ RUST_BACKTRACE=1 replibyte --no-telemetry --config config.yml dump create --name 20230709
thread 'main' panicked at 'byte index 3 is not a char boundary; it is inside '\u{a0}' (bytes 2..4) of `<REDACTED>`', replibyte/src/transformer/redacted.rs:87:30
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:143:14
   2: core::str::slice_error_fail
   3: <replibyte::transformer::redacted::RedactedTransformer as replibyte::transformer::Transformer>::transform
   4: replibyte::source::mysql::transform_columns
   5: replibyte::source::mysql::read_and_transform::{{closure}}
   6: dump_parser::utils::list_sql_queries_from_dump_reader
   7: <replibyte::source::mysql::Mysql as replibyte::source::Source>::read
   8: <replibyte::tasks::full_dump::FullDumpTask<S> as replibyte::tasks::Task>::run
   9: replibyte::commands::dump::run
  10: replibyte::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Content of config.yml:

datastore:
  aws:
    bucket: <REDACTED>
    region: <REDACTED>
source:
  connection_uri: mysql://<REDACTED>:<REDACTED>@localhost/<REDACTED>
  only_tables:
    - database: <REDACTED>
      table: account
    - database: <REDACTED>
      table: bank_account
  transformers:
    - database: <REDACTED>
      table: account
      columns:
        - name: email
          transformer_name: email
        - name: password
          transformer_name: redacted
        - name: password_token
          transformer_name: redacted
    - database: <REDACTED>
      table: bank_account
      columns:
        - name: bank_account_no
          transformer_name: random
        - name: bank_account_name
          transformer_name: redacted

Other system information:

$ mysqldump --version
mysqldump  Ver 10.13 Distrib 5.6.51-91.0, for debian-linux-gnu (x86_64)
$ mysqld --version
mysqld  Ver 5.6.51-91.0 for debian-linux-gnu on x86_64 (Percona Server (GPL), Release 91.0, Revision b59139e)

Any help appreciated ;-)

pepoviola commented 1 year ago

Hi @ohmer, this looks like an issue with handling multi-byte chars. @evoxmusic, this panics because we are indexing into the value by byte offset. If you agree, I can fix this and handle multi-byte characters correctly.
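
To illustrate the failure mode, here is a minimal sketch, not the actual code in `redacted.rs`. The helper names `redact_by_chars` and `prefix_by_bytes` and the sample value are hypothetical. It assumes the transformer keeps a short prefix and masks the rest. Slicing a `&str` at byte index 3 panics when that index falls inside a two-byte character such as `\u{a0}`, while iterating over `chars()` always stays on character boundaries.

```rust
/// Hypothetical helper (not Replibyte's implementation): keep the first
/// `keep` characters of `value` and replace every later character with `mask`.
fn redact_by_chars(value: &str, keep: usize, mask: char) -> String {
    let mut out = String::with_capacity(value.len());
    for (i, c) in value.chars().enumerate() {
        // `i` counts characters, not bytes, so a multi-byte char is never split.
        out.push(if i < keep { c } else { mask });
    }
    out
}

/// Byte-index prefix that returns `None` instead of panicking when
/// `keep` is not on a char boundary (or is past the end of the string).
fn prefix_by_bytes(value: &str, keep: usize) -> Option<&str> {
    value.get(..keep)
}

fn main() {
    // "ab", then U+00A0 (no-break space, two bytes in UTF-8 at bytes 2..4), then "cd".
    let value = "ab\u{a0}cd";

    // Byte index 3 is inside U+00A0, so `&value[..3]` would panic with
    // "byte index 3 is not a char boundary".
    assert!(!value.is_char_boundary(3));
    assert_eq!(prefix_by_bytes(value, 3), None);

    // Character-based redaction handles the same input without panicking.
    println!("{}", redact_by_chars(value, 3, '*'));
}
```

Counting characters rather than bytes is one possible approach; checking `is_char_boundary` before slicing would be another.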

Thx!

evoxmusic commented 1 year ago

Hi @pepoviola, it's good to see you. That would be awesome. Then I can review and validate.

pepoviola commented 1 year ago

Hi @evoxmusic, nice to see you too! I just created this PR fixing the issue. Thx!