mongodb / mongo-rust-driver

The official MongoDB Rust Driver
https://www.mongodb.com/docs/drivers/rust/current/
Apache License 2.0

stack overflow when working with big structs #1048

Closed: Akronae closed this issue 8 months ago

Akronae commented 8 months ago

Versions/Environment

  1. What version of Rust are you using? 1.75.0
  2. What operating system are you using? Ubuntu 22.04.4 LTS
  3. What versions of the driver and its dependencies are you using? (Run cargo pkgid mongodb & cargo pkgid bson) https://github.com/rust-lang/crates.io-index#mongodb@2.8.1 https://github.com/rust-lang/crates.io-index#bson@2.9.0
  4. What version of MongoDB are you using? (Check with the MongoDB shell using db.version()) 7.2.2
  5. What is your MongoDB topology (standalone, replica set, sharded cluster, serverless)? serverless

Describe the bug

The driver fails when working with large structs: the program aborts with a stack overflow.

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

To Reproduce

main.rs

use mongodb::{bson::doc, Client};
use serde::{Deserialize, Serialize};

type SafeError = Box<dyn std::error::Error + Send + Sync + 'static>;

#[tokio::main]
async fn main() -> Result<(), SafeError> {
    let db =
        Client::with_uri_str("mongodb+srv://......../")
            .await?;
    let db = db.database("mydbbb");

    // ✅ working fine
    // println!("doing an insert that should work..");
    // let working = db.collection::<ManyFields<ManyFields<Option<String>>>>("working");
    // working.insert_one(ManyFields::default(), None).await?;
    // println!("insert ok");

    // ⛔ error
    println!("doing an insert that should fail");
    let error = db.collection::<ManyFields<ManyFields<ManyFields<Option<String>>>>>("error");
    error.insert_one(ManyFields::default(), None).await?;
    println!("insert ok");

    Ok(())
}

#[derive(Default, Serialize, Deserialize)]
pub struct ManyFields<T> {
    pub a: T,
    pub b: T,
    pub c: T,
    pub d: T,
    pub e: T,
    pub f: T,
    pub g: T,
    pub h: T,
    pub i: T,
    pub j: T,
    pub k: T,
    pub l: T,
    pub m: T,
    pub n: T,
    pub o: T,
    pub p: T,
    pub q: T,
    pub r: T,
    pub s: T,
    pub t: T,
    pub u: T,
    pub v: T,
    pub x: T,
    pub y: T,
    pub z: T,
}

Cargo.toml

[package]
name = "mongobug"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
tokio = { version = "1.28.1", features = ["macros", "rt-multi-thread"] }
serde = "1.0.164"
mongodb = "2.8.1"

For context, I'm working with Ancient Greek inflection tables, and they can get pretty big ([present, future, ..., aorist] x [active, indicative, ..., imperative] x [passive, middle, active] x [singular, plural] x ....). This example is exaggerated on purpose, but keep in mind that it's really easy to reach what appears to be a field limit in real-life applications.

I'd be happy to help fix this (if you can confirm it is a bug). Is there any onboarding I should go through?

univerz commented 8 months ago

By the way, do you need to query or frequently update those fields? Maybe you could use a faster/smaller format and store it as bytes inside BSON?

Akronae commented 8 months ago

@univerz Yes, I need to query these documents quite frequently, so I need to be able to use Mongo filters. For instance, my application shows texts in which you can click on words. If you click on "said", the database is queried for a verb whose simple past first-person singular form is "said", and it returns the whole inflection table with all the tenses, moods, etc.

Akronae commented 8 months ago

After some tests, this bug does not happen with the sync client. It is also not related to the insert_one function itself but rather to the async wrapper (which I don't know much about): when I comment out the whole function body, or call any other empty function inside insert_one, the same error still happens. Any idea why this might happen?
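One possible explanation for the observation above, offered as a sketch rather than a confirmed diagnosis of the driver: an async fn moves its by-value arguments into the future it returns, so the future is at least as large as those arguments even when the body is empty. The names `BigDoc` and `insert_stub` below are hypothetical stand-ins and not part of the mongodb API; the future is only created, never polled, so no runtime is needed.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical stand-in for a large document type (16 KiB stored inline).
type BigDoc = [u8; 16 * 1024];

/// Hypothetical stand-in for an async method that takes its argument by value.
/// The body is empty, yet the argument is still moved into the returned future.
async fn insert_stub(_doc: BigDoc) {}

/// Returns the size in bytes of the future created by `insert_stub`.
/// The future is created but never polled, so nothing is executed.
fn future_size() -> usize {
    let fut = insert_stub([0u8; 16 * 1024]);
    size_of_val(&fut)
}

fn main() {
    println!("argument: {} bytes", size_of::<BigDoc>());
    println!("future:   {} bytes", future_size());
}
```

If such a future is then moved or wrapped by further async layers, each layer may hold another copy on the stack, which would be consistent with the overflow appearing only on the async path.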

Akronae commented 8 months ago

Found out the issue was caused by the tokio runtime thread running out of stack memory, because all the Option values are stored inline in the struct and therefore end up on the stack. The solution was to box them, like so:

// ✅ working now
let error = db.collection::<ManyFields<ManyFields<ManyFields<Box<Option<String>>>>>>("error");
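To see why boxing helps, the inline size of both types can be compared with `std::mem::size_of`. This is a minimal sketch: the `ManyFields` below is a stand-in that stores its 25 values in an array (an assumption made only to keep the example short), and the `Inline` and `Boxed` aliases are illustrative names.

```rust
use std::mem::size_of;

/// Stand-in for the issue's `ManyFields<T>`: 25 values of `T` stored inline.
/// An array replaces the 25 named fields only to keep the sketch short.
#[allow(dead_code)]
struct ManyFields<T> {
    fields: [T; 25],
}

/// The type that overflowed the stack in the report.
type Inline = ManyFields<ManyFields<ManyFields<Option<String>>>>;
/// The boxed variant from the fix.
type Boxed = ManyFields<ManyFields<ManyFields<Box<Option<String>>>>>;

/// Bytes occupied by one `Inline` value wherever it is stored by value.
fn inline_size() -> usize {
    size_of::<Inline>()
}

/// Bytes occupied by one `Boxed` value; each leaf is only a pointer.
fn boxed_size() -> usize {
    size_of::<Boxed>()
}

fn main() {
    println!("leaves per value: {}", 25 * 25 * 25);
    println!("inline: {} bytes", inline_size());
    println!("boxed:  {} bytes", boxed_size());
}
```

On a typical 64-bit target, `Option<String>` takes 24 bytes and a `Box` takes 8, so each value shrinks from roughly 375,000 bytes to roughly 125,000 bytes. The boxed type is still large, and every leaf now costs a separate heap allocation, so boxing at a coarser level (for example one `Box` per inner `ManyFields`) may be worth measuring as well.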