shuttle-hq / synth

The Declarative Data Generator
https://www.getsynth.com/
Apache License 2.0
1.37k stars 108 forks source link

CSV Import panics if encountering the `Nan` string #432

Closed moore-ryan closed 1 year ago

moore-ryan commented 1 year ago

Describe the bug When importing from CSV, if any of the CSV cells have the string Nan present, then the CSV import will panic due to attempting to unwrap an invalid conversion in csv_str_to_value

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', synth/src/cli/csv/mod.rs:246:35
stack backtrace:
   0:        0x107ad76e2 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h439ad59891a6d893
   1:        0x107af9d2a - core::fmt::write::hc6febfea48992551
   2:        0x107ad081c - std::io::Write::write_fmt::h25befaf232726d5d
   3:        0x107ad74aa - std::sys_common::backtrace::print::he983e0dd711e6589
   4:        0x107ad9146 - std::panicking::default_hook::{{closure}}::hbd746f32dacb7b9e
   5:        0x107ad8e97 - std::panicking::default_hook::h0bd2fa66d45f4333
   6:        0x107ad9879 - std::panicking::rust_panic_with_hook::hf33328f60da8c5e5
   7:        0x107ad9602 - std::panicking::begin_panic_handler::{{closure}}::h34879240bb3c6cce
   8:        0x107ad7b78 - std::sys_common::backtrace::__rust_end_short_backtrace::h8a6940163c51aad6
   9:        0x107ad930d - _rust_begin_unwind
  10:        0x107b37a13 - core::panicking::panic_fmt::h3f49d3b1c0e9c569
  11:        0x107b37ae7 - core::panicking::panic::h68788c20ea79f43e
  12:        0x10719fe6f - synth::cli::csv::csv_str_to_value::heab7b815179200b5
  13:        0x10719f78d - synth::cli::csv::csv_record_to_value::h95aa1b53e0046e88
  14:        0x10719d7c1 - <synth::cli::csv::CsvFileImportStrategy as synth::cli::import::ImportStrategy>::import::h1950a1626ed2ba43
  15:        0x1071a5095 - synth::cli::Cli::import::hc5a488055fa40cc0
  16:        0x106fb5bfc - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::ha5e8477f1af474b7
  17:        0x106fb674f - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::heb1123417f7d8d6e
  18:        0x106fa414b - std::thread::local::LocalKey<T>::with::h294382750e1b30a6
  19:        0x106fb604f - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::he8e2d90d472094a2
  20:        0x106f9fa1a - async_io::driver::block_on::h1f2c467f27d08cd8
  21:        0x106fb1e6f - async_global_executor::executor::block_on::h66e36e0cab8e5a91
  22:        0x106fa435b - std::thread::local::LocalKey<T>::with::h3f15fa06cbc383e2
  23:        0x106fa424a - std::thread::local::LocalKey<T>::with::h38a7efab1ea8ba33
  24:        0x106fababd - async_std::task::builder::Builder::blocking::h67d992f18624c08e
  25:        0x106fa0457 - synth::main::hf938425805b84cb6
  26:        0x106fa3cb6 - std::sys_common::backtrace::__rust_begin_short_backtrace::h5bad8a310369e7d2
  27:        0x106fa3ce1 - std::rt::lang_start::{{closure}}::ha7007b1bb84b3b5a
  28:        0x107aca244 - std::rt::lang_start_internal::haa1351c7436f4ee9
  29:        0x106fa048c - _main
  30:     0x7ff81a39141f - <unknown>

To Reproduce Steps to reproduce the behavior:

  1. Example file: test/test.csv

    FirstName,LastName
    Nan,Jones
    Bob,Smith

    Run command

    synth import --from csv:/<path_to_working_dir>/test test_generated
  2. See error

    RUST_BACKTRACE=full synth import --from csv:/<path_to_working_dir>/test test_generated
    thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', synth/src/cli/csv/mod.rs:246:35
    stack backtrace:

Expected behavior CSV files which have the string Nan present in them should not panic. Depending on the context, Nan is a valid value for CSV columns (e.g. for a person's name). If the first non-header row in test.csv is changed to be Nand,Jones, then a panic no longer appears.

Environment (please complete the following information):

moore-ryan commented 1 year ago

If I have some time this weekend, I'll look at implementing a fix - should be pretty straightforward, just need to fall-through to the serde_json::Value::String case if unwrapping the float value fails.

iamwacko commented 1 year ago

Thanks for putting the time in to report an error and find the issue. If you find you don't have time I can work on this, but otherwise I'll be waiting for your PR.