Qovery / Replibyte

Seed your development database with real data ⚡️
https://www.replibyte.com
GNU General Public License v3.0
4.17k stars 129 forks source link

Segfault running tests on macos #256

Open lukeasrodgers opened 1 year ago

lukeasrodgers commented 1 year ago

running cargo test --all on the current commit (d6b35a7) i get the following output

warning: unused variable: `e`
  --> dump-parser/src/utils.rs:76:25
   |
76 |                     Err(e) => continue
   |                         ^ help: if this is intentional, prefix it with an underscore: `_e`
   |
   = note: `#[warn(unused_variables)]` on by default

warning: variable does not need to be mutable
  --> dump-parser/src/utils.rs:66:13
   |
66 |         let mut query_res = ListQueryResult::Continue;
   |             ----^^^^^^^^^
   |             |
   |             help: remove this `mut`
   |
   = note: `#[warn(unused_mut)]` on by default

warning: function `parse_quoted_ident` is never used
   --> dump-parser/src/postgres/mod.rs:658:4
    |
658 | fn parse_quoted_ident(chars: &mut Peekable<Chars<'_>>, quote_end: char) -> (String, Option<char>) {
    |    ^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(dead_code)]` on by default

warning: constant `COMMENT_CHARS` is never used
 --> dump-parser/src/utils.rs:7:7
  |
7 | const COMMENT_CHARS: &str = "--";
  |       ^^^^^^^^^^^^^

warning: fields `start_index` and `end_index` are never read
   --> dump-parser/src/utils.rs:145:5
    |
144 | struct CommentStatement<'a> {
    |        ---------------- fields in this struct
145 |     start_index: usize,
    |     ^^^^^^^^^^^
146 |     end_index: usize,
    |     ^^^^^^^^^

warning: fields `start_index` and `end_index` are never read
   --> dump-parser/src/utils.rs:152:5
    |
150 | struct QueryStatement<'a> {
    |        -------------- fields in this struct
151 |     valid: bool,
152 |     start_index: usize,
    |     ^^^^^^^^^^^
153 |     end_index: usize,
    |     ^^^^^^^^^

warning: `dump-parser` (lib) generated 6 warnings
warning: variable does not need to be mutable
   --> subset/src/postgres.rs:188:9
    |
188 |         mut progress: P,
    |         ----^^^^^^^^
    |         |
    |         help: remove this `mut`
    |
    = note: `#[warn(unused_mut)]` on by default

warning: `subset` (lib) generated 1 warning
warning: unused variable: `tokens`
   --> dump-parser/src/postgres/mod.rs:922:13
    |
922 |         let tokens = tokens_result.unwrap();
    |             ^^^^^^ help: if this is intentional, prefix it with an underscore: `_tokens`
    |
    = note: `#[warn(unused_variables)]` on by default

warning: unused variable: `expected`
   --> dump-parser/src/postgres/mod.rs:924:13
    |
924 |         let expected: Vec<Token> = vec![];
    |             ^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_expected`

warning: unused variable: `tokens`
   --> dump-parser/src/postgres/mod.rs:947:13
    |
947 |         let tokens = tokens_result.unwrap();
    |             ^^^^^^ help: if this is intentional, prefix it with an underscore: `_tokens`

warning: unused variable: `expected`
   --> dump-parser/src/postgres/mod.rs:949:13
    |
949 |         let expected: Vec<Token> = vec![];
    |             ^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_expected`

warning: unused variable: `tokens`
   --> dump-parser/src/postgres/mod.rs:965:13
    |
965 |         let tokens = tokens_result.unwrap();
    |             ^^^^^^ help: if this is intentional, prefix it with an underscore: `_tokens`

warning: unused variable: `expected`
   --> dump-parser/src/postgres/mod.rs:967:13
    |
967 |         let expected: Vec<Token> = vec![];
    |             ^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_expected`

warning: unused variable: `e`
  --> dump-parser/src/utils.rs:76:25
   |
76 |                     Err(e) => continue
   |                         ^ help: if this is intentional, prefix it with an underscore: `_e`

warning: unused `Result` that must be used
   --> dump-parser/src/utils.rs:323:9
    |
323 | /         list_sql_queries_from_dump_reader(reader, |query| {
324 | |             queries.push(query.to_string());
325 | |             ListQueryResult::Continue
326 | |         });
    | |___________^
    |
    = note: this `Result` may be an `Err` variant, which should be handled
    = note: `#[warn(unused_must_use)]` on by default

warning: `subset` (lib test) generated 1 warning (1 duplicate)
warning: `dump-parser` (lib test) generated 13 warnings (5 duplicates)
warning: unused import: `std::path::Path`
 --> replibyte/src/datastore/local_disk.rs:3:5
  |
3 | use std::path::Path;
  |     ^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unused import: `Read`
 --> replibyte/src/datastore/local_disk.rs:2:33
  |
2 | use std::io::{BufReader, Error, Read, Write};
  |                                 ^^^^

warning: unused import: `Write`
 --> replibyte/src/datastore/local_disk.rs:2:39
  |
2 | use std::io::{BufReader, Error, Read, Write};
  |                                       ^^^^^

warning: unused variable: `database_name`
   --> replibyte/src/source/postgres.rs:313:17
    |
313 |                 database_name,
    |                 ^^^^^^^^^^^^^ help: try ignoring the field: `database_name: _`
    |
    = note: `#[warn(unused_variables)]` on by default

warning: unused variable: `original_query`
   --> replibyte/src/source/postgres.rs:586:41
    |
586 |         assert!(p.read(source_options, |original_query, query| {}).is_ok());
    |                                         ^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_original_query`

warning: unused variable: `query`
   --> replibyte/src/source/postgres.rs:586:57
    |
586 |         assert!(p.read(source_options, |original_query, query| {}).is_ok());
    |                                                         ^^^^^ help: if this is intentional, prefix it with an underscore: `_query`

warning: unused variable: `original_query`
   --> replibyte/src/source/postgres.rs:598:41
    |
598 |         assert!(p.read(source_options, |original_query, query| {}).is_err());
    |                                         ^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_original_query`

warning: unused variable: `query`
   --> replibyte/src/source/postgres.rs:598:57
    |
598 |         assert!(p.read(source_options, |original_query, query| {}).is_err());
    |                                                         ^^^^^ help: if this is intentional, prefix it with an underscore: `_query`

warning: unused variable: `execution_time_in_millis`
  --> replibyte/src/telemetry.rs:72:9
   |
72 |         execution_time_in_millis: Option<u128>,
   |         ^^^^^^^^^^^^^^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_execution_time_in_millis`

warning: variable does not need to be mutable
   --> replibyte/src/datastore/s3.rs:227:9
    |
227 |         mut data_callback: &mut dyn FnMut(Bytes),
    |         ----^^^^^^^^^^^^^
    |         |
    |         help: remove this `mut`
    |
    = note: `#[warn(unused_mut)]` on by default

warning: variable does not need to be mutable
  --> replibyte/src/tasks/full_restore.rs:42:9
   |
42 |         mut self,
   |         ----^^^^
   |         |
   |         help: remove this `mut`

warning: enum `ConnectorConfig` is never used
  --> replibyte/src/config.rs:27:10
   |
27 | pub enum ConnectorConfig<'a> {
   |          ^^^^^^^^^^^^^^^
   |
   = note: `#[warn(dead_code)]` on by default

warning: associated function `connector` is never used
  --> replibyte/src/config.rs:33:12
   |
33 |     pub fn connector(&self) -> Result<ConnectorConfig, Error> {
   |            ^^^^^^^^^

warning: associated function `capture_batch` is never used
  --> replibyte/src/telemetry.rs:60:12
   |
60 |     pub fn capture_batch(&self, events: Vec<Event>) -> Result<(), Error> {
   |            ^^^^^^^^^^^^^

warning: associated function `name` is never used
  --> replibyte/src/types.rs:43:12
   |
43 |     pub fn name(&self) -> &str {
   |            ^^^^

warning: associated function `char_value` is never used
  --> replibyte/src/types.rs:75:12
   |
75 |     pub fn char_value(&self) -> Option<&char> {
   |            ^^^^^^^^^^

warning: associated function `boolean_value` is never used
  --> replibyte/src/types.rs:82:12
   |
82 |     pub fn boolean_value(&self) -> Option<&bool> {
   |            ^^^^^^^^^^^^^

warning: `replibyte` (bin "replibyte" test) generated 17 warnings
    Finished test [unoptimized + debuginfo] target(s) in 0.42s
     Running unittests src/lib.rs (target/debug/deps/dump_parser-c80d3e3fc0351a58)

running 26 tests
test mysql::tests::test_tokenize_single_quoted_string ... ok
test mysql::tests::test_get_column_names_from_insert_into_query ... ok
test mysql::tests::test_insert_into_with_boolean_column_type ... ok
test mysql::tests::test_insert_into_with_numbers ... ok
test mysql::tests::test_match_keyword_at_position ... ok
test mysql::tests::test_get_column_values_from_insert_into_query ... ok
test mysql::tests::tokenizer_for_insert_into_query ... ok
test postgres::tests::test_get_column_names_from_insert_into_query ... ok
test mysql::tests::tokenizer_for_create_table_query ... ok
test mysql::tests::tokenize_insert_into_with_special_chars ... ok
test postgres::tests::test_get_column_values_from_insert_into_query ... ok
test postgres::tests::test_insert_into_with_numbers ... ok
test postgres::tests::test_insert_into_with_boolean_column_type ... ok
test postgres::tests::tokenizer_for_create_table_query ... ok
test mongodb::tests::mongo_archive_parsing ... ok
test postgres::tests::tokenizer_for_insert_query ... ok
test postgres::tests::tokenizer_for_copy_from_stdin_query ... ok
test utils::tests::check_create_or_replace_function ... ok
test postgres::tests::tokenizer_for_create_table_2 ... ok
test mongodb::tests::mongo_archive_to_bytes ... ok
test mysql::tests::test_get_single_quoted_string_value_at_position ... ok
test utils::tests::check_list_sql_statements_with_multiple_lines ... ok
test utils::tests::check_list_sql_queries_from_dump_reader ... ok
test utils::tests::check_list_sql_statements ... ok
test utils::tests::check_multiple_sql_statements ... ok
test utils::tests::check_query_line_with_comment_at_the_end ... ok

test result: ok. 26 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running unittests src/main.rs (target/debug/deps/replibyte-4b62a92fada97cd5)

running 76 tests
test config::tests::substitute_env_variables ... ok
test config::tests::parse_postgres_connection_uri_with_db ... ok
test config::tests::parse_mongodb_connection_uri_with_db ... ok
test config::tests::parse_postgres_connection_uri_with_username_with_special_chars_db ... ok
test config::tests::parse_mysql_connection_uri_with_db ... ok
test config::tests::parse_mongodb_connection_uri ... ok
test config::tests::parse_mysql_connection_uri ... ok
test config::tests::parse_postgres_connection_uri ... ok
test datastore::local_disk::tests::test_dump_name ... ok
test datastore::local_disk::tests::init_local_disk ... ok
test datastore::tests::test_compression ... ok
test datastore::local_disk::tests::test_index_file ... ok
test datastore::tests::test_encryption_1 ... ok
test datastore::tests::test_encryption_2 ... ok
error: test failed, to rerun pass `-p replibyte --bin replibyte`

Caused by:
  process didn't exit successfully: `/Users/luke/workspace/Replibyte/target/debug/deps/replibyte-4b62a92fada97cd5` (signal: 11, SIGSEGV: invalid memory reference)

Using git bisect, looks like the first "bad" commit is 7714dd0a119d9246d4eae0f13130f6d4ce1fd997. I don't know rust well enough to be able to debug this any further. It seems as though the ability to run tests isn't actually impacted and the error is just noise, but the segfault is still somewhat concerning.

-- update

Actually it does seem that this segfault impacts the ability to run all tests. If I introduce artificial failures by e.g. changing assert_ne to assert_eq in some transformer tests, the test output is the same and no error is reported for the transformer tests. As you can see they are not mentioned in the test output at all.

lukeasrodgers commented 1 year ago

I think I have narrowed this down to the use of mongodb-schema-parser.

If I

If I only modify read_and_parse_schema so that it doesn't use SchemaParser I still get the segfault. This is surprising since I expected that use in rust would be side-effect free (or at very least, free of side effects causing memory corruption!). I do see that the mongo schema parser appears to initialize a static allocator object here https://github.com/mongodb-rust/mongodb-schema-parser/blob/master/src/lib.rs#L74 but am not familiar with the wee library or the related wasm code.