rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
95.41k stars 12.29k forks source link

Random `lifetime may not live long enough` from other part of the code #96645

Open DzenanJupic opened 2 years ago

DzenanJupic commented 2 years ago

I just stumbled across a weird and inconsistent behaviour, that seems unreasonable to me.

The project I'm working on is a big company project (closed source) with quite some files. While fixing a bug and editing one line in one module (src/dispatcher/compiler/module_range.rs) the compiler suddenly decided to reject the code of a function from a completely different module (src/services/module_state_service.rs) that has absolutely nothing to do with the module I'm working on.

The change that I made:

fn get_next_module_start(&self, module: &FunctionalModule<'a>, starter: &'a Fmid) -> Result<usize> {
        // -- snip --
        WorkflowIterator::exe_path_iter(
            self.modules,
            starter,
            |m| m.fmd.next_module.as_ref().into_option(),
-           |m| super::sort_exe_paths(&m.fmd.exe_paths),
+           |m| (m != module).then(|| super::sort_exe_paths(&m.fmd.exe_paths)).into_iter().flatten(),
            |m| m.md.kind,
        )
        // -- snip --
}

The compiler error:

error: lifetime may not live long enough
  --> src/services/module_state_service.rs:35:44
   |
32 |           &'_ self,
   |            -- let's call the lifetime of this reference `'1`
33 |           workflow_id: &WorkflowGroupId,
34 |           fmid: &Fmid,
   |                 - let's call the lifetime of this reference `'2`
35 |       ) -> Option<RwLockReadGuard<'_, [u8]>> {
   |  ____________________________________________^
36 | |         let map_guard = self.module_state.read().await;
37 | |
38 | |         let contains_state = map_guard
...  |
49 | |         }))
50 | |     }
   | |_____^ associated function was supposed to return data with lifetime `'1` but it is returning data with lifetime `'2`

error: lifetime may not live long enough
  --> src/services/module_state_service.rs:47:9
   |
32 |           &'_ self,
   |            -- let's call the lifetime of this reference `'1`
33 |           workflow_id: &WorkflowGroupId,
34 |           fmid: &Fmid,
   |                 - let's call the lifetime of this reference `'2`
...
47 | /         Some(RwLockReadGuard::map(map_guard, |map| {
48 | |             &**map.get(workflow_id).unwrap().get(fmid).unwrap()
49 | |         }))
   | |___________^ associated function was supposed to return data with lifetime `'2` but it is returning data with lifetime `'1`

The function the compiler complains about:

pub async fn get_module_state(
    &'_ self,
    workflow_id: &WorkflowGroupId,
    fmid: &Fmid,
) -> Option<RwLockReadGuard<'_, [u8]>> {
    let map_guard = self.module_state.read().await;

    let contains_state = map_guard
        .get(workflow_id)
        .map(|map| map.contains_key(fmid))
        .unwrap_or_default();

    if !contains_state {
        return None;
    }

    Some(RwLockReadGuard::map(map_guard, |map| {
        &**map.get(workflow_id).unwrap().get(fmid).unwrap()
    }))
}

When undoing the one-line change, the code compiled fine again, and after redoing it, it failed to compile again. Now, after indenting a few pieces of code, to make pasting them into GitHub easier, the compiler again accepts the code. I'm also no longer able to recreate the error.

I know, that's essentially nothing to work with, and I'm not even able to recreate the issue myself anymore, but it seems worth reporting to me.

DzenanJupic commented 2 years ago

Could be closed immediately if not of value.

Edgeworth commented 2 years ago

This has also been happening to me on recent nightlies (for the past week or two). Unfortunately, it does sound like a hard to repro bug. I have found this problem happens much more in rust-analyzer than in command line compiles, not sure if that means anything though.

As a workaround, I am doing a clean rebuild.

Edgeworth commented 2 years ago

This is now happening constantly for me (almost every build). Unfortunately, I don't have a repro. It does seem to be somehow related to rust-analyzer though (very low confidence for this).

DzenanJupic commented 2 years ago

I don't use rust-analyzer, so it's probably not the origin. Though I didn't have any more problems so far.

Demindiro commented 2 years ago

I also had this happen though I haven't been able to reproduce it either. As far as I can tell it's related to async functions since I haven't seen it occur in any other case. Specifying lifetime bounds manually seems to help though.

audunska commented 2 years ago

Seeing something similar in our closed-source module. The beta compiler (1.62.0-beta.4) suddenly started rejecting some sqlx code which looks like this (details omitted):

#[derive(Debug, sqlx::FromRow)]
struct Grid {
  id: i64,
  row: f32,
};

struct Target {
  id: i64,
  grid_id: Option<i64>,
  ...
}

/// Just for namespacing
struct GridSql;

impl GridSql {
    pub async fn add_empty<'t>(
        target: &Target,
        mut tx: sqlx::Transaction<'t, Postgres>,
    ) -> sqlx::Result<sqlx::Transaction<'t, Postgres>> {
        let grid_id = sqlx::query_as!(
            Grid,
            r#"
            INSERT INTO grid (
               row
            ) VALUES($1)
            RETURNING
                id, ...
            "#,
            0.0,
        )
        .fetch_one(&mut tx)
        .await?
        .id;
        sqlx::query!(
            "UPDATE target SET grid_id = $1 WHERE id = $2",
            grid_id,
            target.id,
        )
        .execute(&mut tx)
        .await?;
        Ok(tx)
    }
}

The error message is:

error: lifetime may not live long enough
  --> src/lib/grid.rs:70:56
   |
67 |       pub async fn add_empty<'t>(
   |                              -- lifetime `'t` defined here
68 |           target: &Target,
   |                         - let's call the lifetime of this reference `'1`
69 |           mut tx: sqlx::Transaction<'t, Postgres>,
70 |       ) -> sqlx::Result<sqlx::Transaction<'t, Postgres>> {
   |  ________________________________________________________^
71 | |         let grid_id = sqlx::query_as!(
72 | |             Grid,
73 | |             r#"
...  |
97 | |         Ok(tx)
98 | |     }
   | |_____^ associated function was supposed to return data with lifetime `'t` but it is returning data with lifetime `'1`

error: lifetime may not live long enough
  --> src/lib/grid.rs:97:9
   |
67 |     pub async fn add_empty<'t>(
   |                            -- lifetime `'t` defined here
68 |         target: &Target,
   |                       - let's call the lifetime of this reference `'1`
...
97 |         Ok(tx)
   |         ^^^^^^ associated function was supposed to return data with lifetime `'1` but it is returning data with lifetime `'t`

This does not happen on stable.

Unfortunately, pulling the code out of the codebase verbatim did not give an error, so creating a reproducible example seems very hard again.

DzenanJupic commented 2 years ago

Hopefully, it has nothing to do with switching to the MIR borrow checker 😬. As far as I know, the plan is to stabilise it in August. Since @Edgeworth and I first noticed the bug on nightly, and it now propagated to beta this might be the case.

I have no idea when the NLL borrow checking was merged though, so it might be something different.

DzenanJupic commented 2 years ago

@rustbot label +A-NLL +NLL-complete

Edgeworth commented 2 years ago

By the way, I just found a breaking case for this even in the explicitly specified lifetimes case.

Originally, I had this, which sometimes produced the error message in the original post. pub async fn rs(&mut self, key: &RecKey) -> Result<&RecSeries>

The frequency of this error decreased after I changed it to this signature:

pub async fn rs<'a, 'b>(&'a mut self, key: &'b RecKey) -> Result<&'a RecSeries>

However, I just got this error recently:

lifetime may not live long enough
consider adding the following bound: `'b: 'a`

Note that there is no reason for 'b to have to outlive 'a in my actual code (and indeed, it almost always compiles without issue).

DzenanJupic commented 2 years ago

Looks like one of the new error messages of the NLL borrow checker: https://jackh726.github.io/rust/2022/06/10/nll-stabilization.html

audunska commented 2 years ago

For what it's worth, I haven't seen this since upgrading to the 1.63 beta.

Jasperav commented 2 years ago

I had this problem as well. I updated to 1.64 and the problem didn't occur when building clean.

Jasperav commented 2 years ago

Now I got the problem on 1.64 after a few builds, clean builds help but they take forever

estebank commented 1 year ago

Nominating for @rust-lang/compiler discussion because it seems like a latent bug or regression in incr-comp with nll borrowck.

DzenanJupic commented 1 year ago

Might be relevant: https://www.reddit.com/r/rust/comments/xercaw/odd_compile_error_which_resolves_with_new_line

apiraino commented 1 year ago

Discussed during T-compiler meeting on Zulip (notes).

@rustbot label -I-compiler-nominated

DzenanJupic commented 1 year ago

I found a way to reliably reproduce the error locally (win 10). It, unfortunately, does not work in WSL or docker 😐. Just published a repository, with the code I used to reproduce it: https://github.com/DzenanJupic/96645-reproduction

It's more or less a fuzzer, that generates random code from all the examples that were posted in this thread. It also sets up the exact same toolchain situation as on my local machine at the time I encountered the error (default: nightly-2022-07-12, override: stable-2022-06-30).

To run it, just execute cargo run. !!! It will install new toolchains, and change your default toolchain !!! It takes about max one or two minutes for me (when I run it in win 10)

Since I cannot reproduce it in WSL or docker, I included 5 target directories and Cargo.locks of times when it failed: https://github.com/DzenanJupic/96645-reproduction/tree/master/reproductions

Hopefully, that helps :) If I should run any other command locally, or I should share other files, I can do so.

DzenanJupic commented 1 year ago

Cannot reproduce it with the latest stable/nightly on my machine

lqd commented 1 year ago

I feel there are multiple issues at play here, due to the multiple lifetime resolution on AST & incremental compilation issues, whose introduction and fixes spanned a release.

Some of these examples look similar to #98890, which matches that timeline:

Some also look coincidentally fixed by https://github.com/rust-lang/rust/pull/97415 a few weeks beforehand (e.g. if you simply compile module B and then concat D, that reproduces the issue but the PR fixes that)

Some commenters mention having the issue on 1.64.0 nightlies but we don't have reproductions for those yet.

apiraino commented 1 year ago

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium