BurntSushi / fst

Represent large sets and maps compactly with finite state transducers.
The Unlicense

`Levenshtein` doesn't work #154

Closed amab8901 closed 1 year ago

amab8901 commented 1 year ago

Inspired by the code example on this page, I wrote the following code:

use fst::{Set};
use fst::automaton::Levenshtein;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let keys = vec!["hej"];
    let set = Set::from_iter(keys)?;

    let lev = Levenshtein::new("foo", 1)?;

    println!("{:#?}", set);
    println!("{:#?}", lev);
    Ok(())
}

This code crashes when attempting to run this line:

let lev = Levenshtein::new("foo", 1)?;

Please fix the Levenshtein (and/or new) to make it stop crashing when it's being used.

EDIT: Here is the Cargo.toml:

[package]
name = "playground"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
fst = { version = "0.4.7", features = ["levenshtein"] }

How to reproduce:

  1. cargo new playground
  2. enter project's root directory
  3. cargo add fst --features=levenshtein
  4. nvim . (note: I have NerdTree extension in my Neovim)
  5. Navigate into main.rs file via NerdTree
  6. Replace the main.rs content with:

    use fst::{Set};
    use fst::automaton::Levenshtein;

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let keys = vec!["hej"];
        let set = Set::from_iter(keys)?;

        let lev = Levenshtein::new("foo", 1)?;

        println!("{:#?}", set);
        println!("{:#?}", lev);
        Ok(())
    }

  7. `:w` in neovim
  8. `:! cargo build` in neovim (note that bash is in project's root folder while neovim is simultaneously in `main.rs` at this point)
  9. Open another terminal in project's root folder
  10. In the new terminal, run `gdb target/debug/playground` (alternatively: `rust-gdb target/debug/playground`)
  11. In gdb, run the following commands, one line at a time (note that I have GDB Dashboard (https://github.com/cyrus-and/gdb-dashboard) implemented into my GDB):

    list
    b 8
    b 10
    list
    b 11
    b 12
    r
    c

It crashes as soon as you enter `c`. The following specifies what "crash" means and shows the output that I see:

Breakpoint 2, playground::main () at src/main.rs:10
10          println!("{:#?}", set);
../../gdb/gdbtypes.h:1064: internal-error: field: Assertion `idx >= 0 && idx < num_fields ()' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x5654d0133b6b ???
0x5654d0471d44 ???
0x5654d0534ce3 ???
0x5654d00b3ea1 ???
0x5654d0483e23 ???
0x5654d03b74bc ???
0x5654d047f0db ???
0x5654d0484935 ???
0x5654d0485b6b ???
0x5654d03b70ab ???
0x5654d047f0db ???
0x5654d03b6cd0 ???
0x5654d03b70ef ???
0x5654d047f0db ???
0x5654d036ed08 ???
0x7f96c5548418 ???
0x7f96c54f9c9f ???
0x7f96c54f4b83 ???
0x7f96c5529420 ???
0x7f96c54e4b33 ???
0x5654d03632a6 ???
0x5654d022a526 ???
0x5654d047f164 ???
0x5654d03b6cd0 ???
0x5654d03b70ef ???
0x5654d047f0db ???
0x5654d03b6cd0 ???
0x5654d03b70ef ???
0x5654d047f0db ???
0x5654d036ed08 ???
0x7f96c5548418 ???
0x7f96c54f9c9f ???
0x7f96c54f4b83 ???
0x7f96c5500bb8 ???
0x7f96c54efd75 ???
0x7f96c5500bb8 ???
0x7f96c54efd75 ???
0x7f96c5500bb8 ???
0x7f96c54f48de ???
0x7f96c5500bb8 ???
0x7f96c54f48de ???
0x7f96c5500bb8 ???
0x7f96c54f01a9 ???
0x7f96c550be45 ???
0x7f96c54f0eb5 ???
0x7f96c54ee9bf ???
0x7f96c550bf6c ???
0x7f96c5500357 ???
0x7f96c559dcc0 ???
0x5654d05b11ba ???
0x5654d03617f2 ???
0x5654d02bc432 ???
0x5654d02b3f52 ???
0x5654d0535955 ???
0x5654d0535d06 ???
0x5654d02f3d24 ???
0x5654d009fc04 ???
0x7f96c4c3a28f ???
0x7f96c4c3a349 ???
0x5654d00a61e4 ???
0xffffffffffffffff ???

../../gdb/gdbtypes.h:1064: internal-error: field: Assertion `idx >= 0 && idx < num_fields ()' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

BurntSushi commented 1 year ago

This bug report is incomplete. It doesn't include your Cargo.toml and it doesn't include the actual output you see or even specify what "crash" means. Indeed, I cannot reproduce this:

$ cat Cargo.toml
[package]
name = "i154"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
fst = { version = "0.4.7", features = ["levenshtein"] }

$ cat src/main.rs
use fst::{Set};
use fst::automaton::Levenshtein;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let keys = vec!["hej"];
    let set = Set::from_iter(keys)?;

    let lev = Levenshtein::new("foo", 1)?;

    println!("{:#?}", set);
    println!("{:#?}", lev);
    Ok(())
}

$ cargo run --release
   Compiling fst v0.4.7
   Compiling utf8-ranges v1.0.5
   Compiling i154 v0.1.0 (/home/andrew/tmp/issues/fst/i154)
    Finished release [optimized] target(s) in 1.60s
     Running `target/release/i154`
Set([hej])
Levenshtein(query: "foo", distance: 1)

amab8901 commented 1 year ago

I added answers to your questions in my post

BurntSushi commented 1 year ago

Wow... okay. So the crash is only happening when running under gdb. That's... kind of an important detail to have left out of the original bug report!

But anyway, gdb works just fine for me:

$ gdb ./target/debug/i154
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./target/debug/i154...
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/andrew/tmp/issues/fst/i154/target/debug/i154.
Use `info auto-load python-scripts [REGEXP]' to list them.
(gdb) list
1       use fst::{Set};
2       use fst::automaton::Levenshtein;
3
4       fn main() -> Result<(), Box<dyn std::error::Error>> {
5           let keys = vec!["hej"];
6           let set = Set::from_iter(keys)?;
7
8           let lev = Levenshtein::new("foo", 1)?;
9
10          println!("{:#?}", set);
(gdb) b 8
Breakpoint 1 at 0xfcc5: file src/main.rs, line 8.
(gdb) b 10
Breakpoint 2 at 0xfe7b: file src/main.rs, line 10.
(gdb) list
11          println!("{:#?}", lev);
12          Ok(())
13      }
(gdb) b 11
Breakpoint 3 at 0xff8c: file src/main.rs, line 11.
(gdb) b 12
Breakpoint 4 at 0x10004: file src/main.rs, line 12.
(gdb) r
Starting program: /home/andrew/tmp/issues/fst/i154/target/debug/i154 

This GDB supports auto-downloading debuginfo from the following URLs:
https://debuginfod.archlinux.org 
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading 1.07 MB separate debug info for /lib64/ld-linux-x86-64.so.2
Downloading 9.93 MB separate debug info for /usr/lib/libc.so.6
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Breakpoint 1, i154::main () at src/main.rs:8
8           let lev = Levenshtein::new("foo", 1)?;
(gdb) c
Continuing.

Breakpoint 2, i154::main () at src/main.rs:10
10          println!("{:#?}", set);
(gdb) c
Continuing.
Set([hej])

Breakpoint 3, i154::main () at src/main.rs:11
11          println!("{:#?}", lev);
(gdb) c
Continuing.
Levenshtein(query: "foo", distance: 1)

Breakpoint 4, i154::main () at src/main.rs:12
12          Ok(())
(gdb) c
Continuing.
[Inferior 1 (process 232651) exited normally]
(gdb) quit

Note also the actual error message you're getting:

../../gdb/gdbtypes.h:1064: internal-error: field: Assertion `idx >= 0 && idx < num_fields ()' failed.
A problem internal to GDB has been detected,

So this looks like a problem with gdb.
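
As a side note on what the program in this issue would do once the automaton is actually used: `Levenshtein::new("foo", 1)` describes every string within edit distance 1 of "foo", where edits are insertions, deletions, and substitutions of Unicode codepoints. The sketch below illustrates that notion with a plain dynamic-programming edit distance using only std. It is not how fst builds or runs its automaton, and `edit_distance` is a hypothetical helper name introduced here for illustration.

```rust
/// Hypothetical helper (not part of fst): classic Levenshtein distance
/// counted over Unicode scalar values, using two rolling rows.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] holds the distance between the first i-1 chars of `a`
    // and the first j chars of `b`; row 0 is just 0..=b.len().
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for i in 1..=a.len() {
        // cur[0] = i: deleting all i chars of `a` yields the empty string.
        let mut cur = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = (prev[j] + 1) // deletion
                .min(cur[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
        }
        prev = cur;
    }
    prev[b.len()]
}

fn main() {
    let query = "foo";
    // "hej" is the only key in the set from this issue; the other keys
    // are made up here for comparison.
    for key in ["hej", "fo", "food", "foul"] {
        let d = edit_distance(query, key);
        println!("{key}: distance {d}, within 1: {}", d <= 1);
    }
}
```

Under this definition "hej" is three substitutions away from "foo", so a search over the one-key set with distance 1 would be expected to return nothing. That is an empty result rather than a crash, and it is unrelated to the gdb assertion failure above.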