drahnr / cargo-spellcheck

Checks all your documentation for spelling and grammar mistakes with hunspell and an nlprule-based checker for grammar
Apache License 2.0

Segfault on Linux #336

Open virtualritz opened 2 months ago

virtualritz commented 2 months ago

Running cargo spellcheck on this repo of mine causes it to segfault. No other output.

I'm a bit short on time atm; otherwise I'd have built c-s from source and attached a backtrace. Sorry about that.

To Reproduce

❯ cargo install cargo-spellcheck
❯ git clone https://github.com/virtualritz/uniform-cubic-splines.git
❯ cd uniform-cubic-splines
❯ cargo spellcheck
[1]    133893 segmentation fault (core dumped)  cargo spellcheck

drahnr commented 2 months ago

Could you provide a commit hash, so that I am not barking up the wrong tree?

virtualritz commented 2 months ago

Commit hash for what? My repo or cargo-spellcheck? 😁 Neither should be needed. My repo will not be touched in the foreseeable future, and c-s was installed via cargo install the day I filed this ticket (see above).

drahnr commented 2 months ago

0x0000555556f5c083 in mkallsmall (s="traitorous", csconv=0x0) at vendor/src/hunspell/csutil.cxx:536
536     *aI = clower(csconv, static_cast<unsigned char>(*aI));
(gdb) bt
#0  0x0000555556f5c083 in mkallsmall (s="traitorous", csconv=0x0) at vendor/src/hunspell/csutil.cxx:536
#1  0x0000555556f8041e in SuggestMgr::ngsuggest (this=0x55555892b820, wlst=std::vector of length 0, capacity 0, 
    w=<optimized out>, rHMgr=std::vector of length 2, capacity 2 = {...}, captype=captype@entry=3)
    at vendor/src/hunspell/suggestmgr.cxx:1206
#2  0x0000555556f72882 in HunspellImpl::suggest_internal (this=this@entry=0x5555589482e0, word="Catmull-Rom", 
    capwords=@0x7ffff7a1ae73: true, abbv=@0x7ffff7a1ae78: 0, captype=@0x7ffff7a1ae74: 4)
    at vendor/src/hunspell/hunspell.cxx:1198
#3  0x0000555556f74822 in HunspellImpl::suggest (this=this@entry=0x5555589482e0, word="Catmull-Rom")
    at vendor/src/hunspell/hunspell.cxx:891
#4  0x0000555556f74ffb in HunspellImpl::suggest (this=0x5555589482e0, slst=0x7ffff7a1afb8, word=0x7ffff039be40 "Catmull-Rom")
    at /usr/include/c++/13/bits/basic_string.tcc:242
#5  0x0000555556f3dae8 in hunspell_rs::Hunspell::suggest (self=0x555558ad4cc0, word=...) at src/lib.rs:53
#6  0x0000555556d2013f in cargo_spellcheck::checker::hunspell::obtain_suggestions (plain=0x7ffff7a1b740, 
    chunk=0x5555573dd440, hunspell=0x555558ad4cc0, origin=0x7ffff7a1baf0, word=..., range=..., 
    allow_concatenated=<optimized out>, allow_dashed=<optimized out>, allow_emojis=<optimized out>, acc=0x7ffff7a1b408)
    at src/checker/hunspell.rs:399
#7  0x0000555556d1dd70 in cargo_spellcheck::checker::hunspell::{impl#7}::check (self=<optimized out>, 
    origin=<optimized out>, chunks=...) at src/checker/hunspell.rs:330
#8  0x0000555556cba729 in cargo_spellcheck::checker::{impl#1}::check (self=0x7fffffffb680, origin=0x7ffff7a1baf0, chunks=...)
    at src/checker/mod.rs:112
#9  0x0000555556d0f34f in cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure#0} ()
    at src/action/mod.rs:411
#10 core::ops::function::impls::{impl#1}::call_mut<((doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>)), cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#0}> (
    args=..., self=<optimized out>) at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/ops/function.rs:272
#11 core::ops::function::impls::{impl#4}::call_once<((doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>)), &cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#0}> (
    self=<optimized out>, args=...) at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/ops/function.rs:305
#12 0x0000555556ce34ca in core::option::Option<(doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>)>::map<(doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>), core::result::Result<usize, eyre::Report>, &mut &cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#0}> (self=<error reading variable: Cannot access memory at address 0x38>, f=<optimized out>)
    at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/option.rs:1075
#13 core::iter::adapters::map::{impl#2}::next<core::result::Result<usize, eyre::Report>, core::iter::adapters::map::Map<rayon::vec::SliceDrain<indexmap::Bucket<doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>>>, &fn(indexmap::Bucket<doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>>) -> (doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>)>, &cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#0}> (self=0x7ffff7a1bcc0)
    at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/iter/adapters/map.rs:108
#14 rayon::iter::plumbing::Folder::consume_iter<rayon::iter::try_fold::TryFoldFolder<rayon::iter::try_reduce::TryReduceFolder<cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#3}, core::result::Result<usize, eyre::Report>>, core::result::Result<usize, eyre::Report>, cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#1}>, core::result::Result<usize, eyre::Report>, core::iter::adapters::map::Map<core::iter::adapters::map::Map<rayon::vec::SliceDrain<indexmap::Bucket<doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>>>, &fn(indexmap::Bucket<doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>>) -> (doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>)>, &cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#0}>> (self=..., iter=...)
    at /home/bernhard/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rayon-1.10.0/src/iter/plumbing/mod.rs:177
#15 0x0000555556d1371d in rayon::iter::map::{impl#8}::consume_iter<(doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>), core::result::Result<usize, eyre::Report>, rayon::iter::try_fold::TryFoldFolder<rayon::iter::try_reduce::TryReduceFolder<cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#3}, core::result::Result<usize, eyre::Report>>, core::result::Result<usize, eyre::Report>, cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#1}>, cargo_spellcheck::action::{impl#3}::run_check::{async_fn#0}::{closure_env#0}, core::iter::adapters::map::Map<rayon::vec::SliceDrain<indexmap::Bucket<doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>>>, &fn(indexmap::Bucket<doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>>) -> (doc_chunks::chunk::ContentOrigin, alloc::vec::Vec<doc_chunks::chunk::CheckableChunk, alloc::alloc::Global>)>> (self=..., iter=...)
..

tl;dr yet another bug in hunspell (the C library)

drahnr commented 2 months ago

So it turns out this is a multi-threading issue when calling Hunspell_suggest; with -j1 it's not reproducible. Currently there is no Mutex<_> guarding the pointer to the heap-allocated structure. Technically it should be re-entrant, but in practice it appears not to be. I'll probably deal with this in one go with https://github.com/drahnr/cargo-spellcheck/issues/319, using zspell as a first-level check and then serializing the creation of suggestions with hunspell as part of the printout stream.

This will take a few days. I am not sure yet why it happens at that particular line of code.
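
For readers unfamiliar with the Mutex<_> guard mentioned above, here is a minimal sketch of the idea, using only the Rust standard library. It is not the cargo-spellcheck or hunspell_rs code: `Speller`, its fields, and `suggest_all` are hypothetical stand-ins for a handle that keeps internal scratch state and so must not be used from two threads at once. Wrapping the handle in `Arc<Mutex<_>>` lets only one thread call `suggest` at a time.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a spell checker handle that is not re-entrant.
/// This is NOT the hunspell_rs API; it only mimics internal mutable state.
struct Speller {
    scratch: String,
    calls: usize,
}

impl Speller {
    fn new() -> Self {
        Speller { scratch: String::new(), calls: 0 }
    }

    /// Takes `&mut self`, so the compiler requires exclusive access.
    /// The "suggestion" is just the lowercased word, for illustration.
    fn suggest(&mut self, word: &str) -> Vec<String> {
        self.scratch.clear();
        self.scratch.push_str(&word.to_lowercase());
        self.calls += 1;
        vec![self.scratch.clone()]
    }
}

/// Spawns one thread per word; every call to `suggest` is serialized
/// through the mutex, so the shared handle is never used concurrently.
fn suggest_all(words: &[&str]) -> (Vec<Vec<String>>, usize) {
    let speller = Arc::new(Mutex::new(Speller::new()));

    let handles: Vec<_> = words
        .iter()
        .map(|w| {
            let speller = Arc::clone(&speller);
            let word = w.to_string();
            thread::spawn(move || {
                // The guard is held for the whole call and released on drop.
                let mut guard = speller.lock().expect("speller mutex poisoned");
                guard.suggest(&word)
            })
        })
        .collect();

    // Joining in spawn order keeps results aligned with the input words.
    let results: Vec<Vec<String>> = handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect();

    let calls = speller.lock().expect("speller mutex poisoned").calls;
    (results, calls)
}
```

Calling `suggest_all(&["Catmull-Rom", "Traitorous"])` runs both lookups on separate threads without overlapping access to the handle. The trade-off is that suggestion generation becomes sequential, which is the same effect as running with -j1 for that step only.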