ronniec95 / xladd-derive

Macros to help write Excel User defined functions easily in Rust
MIT License
18 stars 1 forks source link

Async function crashing #6

Open vron opened 2 years ago

vron commented 2 years ago

Hi,

Re-opening as a new issue. I am trying to debug this as well but have a limited understanding of the codebase so though I should report it clearly:

Running:

    use xladd::registrator::Reg;
    use xladd::variant::Variant;
    use xladd::xlcall::LPXLOPER12;
    use xladd_derive::xl_func;

    #[xl_func(category = "Test", async = "true")]
    fn ra_send(json: String) -> Result<Variant, Box<dyn std::error::Error>> {
        let client = reqwest::blocking::Client::new();
        let resp = client
            .post("http://localhost:8080/post")
            .header(reqwest::header::CONTENT_TYPE, "application/json")
            .body(json)
            .send();

        Ok(Variant::from(11.0))
    }

    #[no_mangle]
    pub extern "stdcall" fn xlAutoOpen() -> i32 {
        let r = Reg::new();
        register_ra_send(&r);
        1
    }

And then puttting =ra_send("") WITHOUT any server running on localhost:8080 will randomly crash excel - somethimes the cell takes the value 11

vron commented 2 years ago

(easiest way to trigger a crash is not copy the sell to seveal cells and force a re-calculate)

ronniec95 commented 2 years ago

Have you tried without async to narrow it down. Note that Excel calls your code in multiple threads without the protection of Rust. That could be causing the issue?

vron commented 2 years ago

Trying to debug now - I can't see that being the problem - the code above is literally the only code i run - it should be completely thread safe..

vron commented 2 years ago

Yes - so so far I have not managed to get it to crash if I write exactly the same code but without the async="true" (though it becomes very slow to re-calc since i blocks the calc threads

vron commented 2 years ago

And as a further point this (without any http request) also crashes:

    #[xl_func(category = "Test", async = "true")]
    fn ra_send(json: String) -> Result<Variant, Box<dyn std::error::Error>> {
        let client = reqwest::blocking::Client::new();
        std::thread::sleep_ms(500);
        Ok(Variant::from(11.0))
    }
vron commented 2 years ago

Right, I'm struggling to understand what's going on here.

I have one possible theory - but not one that I belive a lot in:

Expanding the code generated by the proc-macro ( ) I get the following "core" of the function:

let raw_ptr = xladd::variant::XLOPERPtr(return_handle);
std::thread::spawn(move || match ra_send(json) {
        Ok(v) => {
            {
                let lvl = ::log::Level::Trace;
                if lvl <= ::log::STATIC_MAX_LEVEL && lvl <= ::log::max_level() {
                    ::log::__private_api_log(
                        ::core::fmt::Arguments::new_v1(
                            &["Results [", "]"],
                            &match (&v,) {
                                (arg0,) => [::core::fmt::ArgumentV1::new(
                                    arg0,
                                    ::core::fmt::Debug::fmt,
                                )],
                            },
                        ),
                        lvl,
                        &(
                            "raxll::exports",
                            "raxll::exports",
                            "bin\\raxll\\src\\exports\\mod.rs",
                            16u32,
                        ),
                    );
                }
            };
            xladd::entrypoint::excel12(
                xladd::xlcall::xlAsyncReturn,
                &mut [Variant::from(raw_ptr), v],
            );
        }

Showing that we retain the XLOPER struct we got as a handle as an input (raw_ptr) after the return of the outer function called by xll (i.e we move it to our closure).

MS documentation seems to be unclear on this - but generally says that we cannot retain arguments after returning (or is the handle special?) - or could the problem be that we need to deep copy the handler and that currenty exccel randomly does free it before we returh from our async code?

ronniec95 commented 2 years ago

I agree that we generally should not hold on to arguments after the call but the handle is special. The order of work is:

Call a no-op function that holds a pointer (R) in which to populate the results. Excel keeps a reference to R Start the work in a separate thread (T), and pass in R When T completes, populate the results in R

It makes sense and is probably how I would implement it.

vron commented 2 years ago

I agree completely - just looking for possible reasons for that simple eaxmple to crash

Starting to suspect some of the logging or error-handling instead - I have reduced the generated async code to bare minimum as below - this one does NOT seem to crash excal (what I have seen thus far at least - could also be it becomes more rare ofcourse):

    use xladd::registrator::Reg;
    use xladd::variant::Variant;
    use xladd::xlcall::LPXLOPER12;
    use xladd_derive::xl_func;

    fn _error_hndlr_ra_send(
        json: Variant,
        return_handle: LPXLOPER12,
    ) -> Result<Variant, Box<dyn std::error::Error>> {
        let json = std::convert::TryInto::<String>::try_into(&json)?;
        let raw_ptr = xladd::variant::XLOPERPtr(return_handle);

        std::thread::spawn(move || match ra_send(json) {
            Ok(v) => {
                xladd::entrypoint::excel12(
                    xladd::xlcall::xlAsyncReturn,
                    &mut [Variant::from(raw_ptr), v],
                );
            }
            Err(e) => {
                xladd::entrypoint::excel12(
                    xladd::xlcall::xlAsyncReturn,
                    &mut [Variant::from(raw_ptr), Variant::from(e.to_string())],
                );
            }
        });
        Ok(Variant::default())
    }
    #[no_mangle]
    pub extern "stdcall" fn xl_ra_send(json: LPXLOPER12, return_handle: LPXLOPER12) {
        let json = Variant::from(json);
        match _error_hndlr_ra_send(json, return_handle) {
            Ok(_) => (),
            Err(e) => {
                let raw_ptr = xladd::variant::XLOPERPtr(return_handle);
                xladd::entrypoint::excel12(
                    xladd::xlcall::xlAsyncReturn,
                    &mut [Variant::from(raw_ptr), Variant::from(e.to_string())],
                );
            }
        }
    }
    pub(crate) fn register_ra_send(reg: &xladd::registrator::Reg) {
        reg.add("xl_ra_send", ">QX$", "json", "Test", " and ", &[]);
    }
    fn ra_send(json: String) -> Result<Variant, Box<dyn std::error::Error>> {
        let client = reqwest::blocking::Client::new();
        std::thread::sleep_ms(500);
        Ok(Variant::from(11.0))
    }
    #[no_mangle]
    pub extern "stdcall" fn xlAutoOpen() -> i32 {
        let r = Reg::new();
        register_ra_send(&r);
        1
    }
ronniec95 commented 2 years ago

That's really intriguing if it's related to logging. The logging interface is thread safe within Rust; but you may be using a specific logger? Can you tell me which one? Maybe that's why I can't reproduce the behaviour on my side?

I have seen that simple_logger::log_to_stdout() does crash (in other code, not excel related) when used in multiple threads

vron commented 2 years ago

Well I have stderrlog = "0.5" in one Cargo.tompl file in another crate i import - I can try to remove that import and see if that solves the problem

vron commented 2 years ago

(to be clear - im not use:ing it directly in thes crate or configuring it or anyhting

vron commented 2 years ago

So I've managed to get it to reliably crash without including any logging package.

The attached stand-alone crate reliably crashes for me by:

  1. cargo build
  2. rename the generated dll to .xll
  3. Add the addin to excel
  4. in 100 cells (10*10) add the equation =xl_ra_send(A1)
  5. changng the string in A1 a few times

crashing.zip

(no server running on 8080)

vron commented 2 years ago

As a side note - if I create a async function like this:

    #[xl_func(category = "Test", async = "true")]
    fn ra_send(json: String) -> Result<Variant, Box<dyn std::error::Error>> {
        panic!("asdf");
    }

That will not crash Excel - the cells will (logically) be stich in #CALC! but it will not crash excel - so it must somehow be the return value that is bad in the other case? Or is something creating undefined behaviour that's crates a result worse than a panic...

ronniec95 commented 2 years ago

OK I'll do an investigation. Thanks for highlighting this bug. I will keep you posted

vron commented 2 years ago

:-) Let me know if you do not manage to re-create it

vron commented 2 years ago

One more updat from me before I'll shut up and stop making so much noice - I now have a even smaller example that crashes without any import of the request module:

lib.rs:

    use xladd::registrator::Reg;
    use xladd::variant::Variant;
    use xladd::xlcall::LPXLOPER12;

    fn _error_hndlr_ra_send(
        json: Variant,
        return_handle: LPXLOPER12,
    ) -> Result<Variant, Box<dyn std::error::Error>> {
        let json = std::convert::TryInto::<String>::try_into(&json)?;
        let raw_ptr = xladd::variant::XLOPERPtr(return_handle);

        std::thread::spawn(move || match ra_send(json) {
            Ok(v) => {
                xladd::entrypoint::excel12(
                    xladd::xlcall::xlAsyncReturn,
                    &mut [Variant::from(raw_ptr), v],
                );
            }
            Err(e) => {
                xladd::entrypoint::excel12(
                    xladd::xlcall::xlAsyncReturn,
                    &mut [Variant::from(raw_ptr), Variant::from(e.to_string())],
                );
            }
        });
        Ok(Variant::default())
    }
    #[no_mangle]
    pub extern "stdcall" fn xl_ra_send(json: LPXLOPER12, return_handle: LPXLOPER12) {
        let json = Variant::from(json);
        match _error_hndlr_ra_send(json, return_handle) {
            Ok(_) => (),
            Err(e) => {
                let raw_ptr = xladd::variant::XLOPERPtr(return_handle);
                xladd::entrypoint::excel12(
                    xladd::xlcall::xlAsyncReturn,
                    &mut [Variant::from(raw_ptr), Variant::from(e.to_string())],
                );
            }
        }
    }
    pub(crate) fn register_ra_send(reg: &xladd::registrator::Reg) {
        reg.add("xl_ra_send", ">QX$", "json", "Test", " and ", &[]);
    }
    fn ra_send(json: String) -> Result<Variant, Box<dyn std::error::Error>> {
        Ok(Variant::from(format!("{:?}", resp)))*/
        std::thread::sleep_ms(250);
        Ok(Variant::from(44.4))
    }
    #[no_mangle]
    pub extern "stdcall" fn xlAutoOpen() -> i32 {
        let r = Reg::new();
        register_ra_send(&r);
        1
    }

Cargo.toml:

    [package]
    name = "raxll"
    version = "0.1.0"
    authors = ["V"]
    edition = "2018"

    [dependencies]
    xladd-derive= "^0.6"
    xladd = {git="https://github.com/ronniec95/xladd" , features=["use_ndarray"] } # Needed to patch the old abandoned crate
    log = "0"

    [lib]
    name = "raxll"
    crate-type = ["cdylib"]

And then as before the 10x10 grid in excel and changing the input a few times will create a crash (seems to occure faster if changing the input before the cells are finished calculating)

ronniec95 commented 2 years ago

I've been thinking about this quite a bit and have a possible theory.

Excel assigns a result handle for each calculation. The code then does it's work and writes to that handle. This works successfully.

In the 10x10 grid you change a value and excel triggers another calculation but with the new return handle pointer invalidating the old one. When the previous execution completes it tries to now write to a invalid address.

The question is: is there a way to check if a handle is still valid on completion of a calculation before writing to it?

vron commented 2 years ago

(did you manage to re-createit?)

I agree it's a possible theory - but i don't think so - i have been watching the number of threads reported for exccel running, and if that theory was correct the number should go up when i force a recalc before the last one finsihes which it doesnt. Could test by marking the async function as non-thread-safe?

It sounds a bit strange though if excel does it that way rather than creating a new handle for each calculation - it's like as if they were aksing for this problem.

I havnt found any way for checking the validity of the handle - the asyncreturn is documented as the only function safe to call from a non udf thread in the docs at least.

(Also Im 99.9% sure i have maanged to recreate this problem without recaling before finsihed)

ronniec95 commented 2 years ago

Hi

I don't know if you are still looking at this issue, but I think it might be due to async functions being marked as threadsafe and I think they should NOT be. I've updated the package to 0.7 which marks async functions as single threaded to Excel. This may fix the issue.

vron commented 2 years ago

Great!

Thanks - i Will try it out during this week and report back

Sent from my phone

On 17 Jan 2022, at 11:05, ronniec95 @.***> wrote:

 Hi

I don't know if you are still looking at this issue, but I think it might be due to async functions being marked as threadsafe and I think they should NOT be. I've updated the package to 0.7 which marks async functions as single threaded to Excel. This may fix the issue.

— Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. You are receiving this because you authored the thread.