holochain / lair

secret lair private keystore
Apache License 2.0
42 stars 11 forks source link

`InProcKeystore` fails on restart #120

Open guillemcordoba opened 10 months ago

guillemcordoba commented 10 months ago

This issue has diverged into android load timing. I've split out the load timing into a separate issue: https://github.com/holochain/lair/issues/139


Original Issue text retained:


I've tried to use directly an InProcKeystore with holochain, to reduce the startup time of my Android app. It works fine when first started, but when the conductor restarts, signing zome calls fails with Query returned no rows. This is my code:

    let config = get_config(&fs.keystore_config_path(), passphrase.clone())
        .await
        .map_err(|err| crate::Error::LairError(err))?;

    let store_factory = lair_keystore::create_sql_pool_factory(&fs.keystore_store_path());

    // create an in-process keystore with an in-memory store
    let keystore = InProcKeystore::new(config.clone(), store_factory, passphrase.clone())
        .await
        .map_err(|err| crate::Error::LairError(err))?;

    let client = keystore
         .new_client()
         .await
         .map_err(|err| crate::Error::LairError(err))?;

     let lair_client = MetaLairClient::new_with_client(client.clone())
         .await
         .map_err(|err| crate::Error::LairError(err))?;

    let connection_url = config.connection_url.clone();

    let lc = lair_client.clone();

    log::info!("Lair keystore spawned");

    tauri::async_runtime::spawn(async move {
        let config = crate::config::conductor_config(
            &fs,
            admin_port,
            connection_url.into(),
            override_gossip_arc_clamping(),
        );

        let conductor = Conductor::builder()
            .config(config)
            .passphrase(Some(passphrase))
            .with_keystore(lc)
            .build()
            .await
            .expect("Can't build the conductor");

        let p: either::Either<u16, AppInterfaceId> = either::Either::Left(app_port);
        conductor
            .clone()
            .add_app_interface(p)
            .await
            .expect("Can't add app interface");
    });

To do that, I've needed to fork holochain and add this method.

I've tried to reproduce the issue by implementing a test in this repository, which is mostly a copy of this test, here, but that fails with another error:

     Running tests/in_proc_sql.rs (target/debug/deps/in_proc_sql-05e6d2769c280074)

running 1 test
test in_proc_sql ... FAILED

failures:

---- in_proc_sql stdout ----
connectionUrl: unix:///socket?k=CZHNtNI5GCBYmxuMdWYYVSS3-7R6wOZkx_4iOI5zfys

# The pid file for managing a running lair-keystore process
pidFile: /pid_file

# The sqlcipher store file for persisting secrets
storeFile: /store_file

# Configuration for managing sign_by_pub_key fallback
# in case the pub key does not exist in the lair store.
# - `signatureFallback: none`
# - ```
#   signatureFallback: !command
#     # 'program' will resolve to a path, specifying 'echo'
#     # will try to run './echo', probably not what you want.
#     program: "./my-executable"
#     # args are optional
#     args:
#       - test-arg1
#       - test-arg2
#   ```
signatureFallback: none

# -- cryptographic secrets --
# If you modify the data below, you risk losing access to your keys.
runtimeSecretsSalt: ooG0RUxi1JEdD3X29eX-wQ
runtimeSecretsMemLimit: 67108864
runtimeSecretsOpsLimit: 2
runtimeSecretsContextKey:
- PZf3TQcscJKOfY3NM2T5oYjuV7SNLBr1
- Zl7vhBHMFgEhudDk-6dWhvAbYJJYS1Pznh8OcXBRs_Ac3cQPcaXaPXnYfvOqjCICSQ
runtimeSecretsIdSeed:
- se1u7x7-l5B8Y0WnqnQ8zQOomgFJoxo0
- 6RIMEN_B-YJEUFDKkMi9yDgAQyCvpDQg3uKPDgJ7CNZLHCSLE5El_4SLam16B97aRQ

[
    Seed {
        tag: "test-tag",
        seed_info: SeedInfo {
            ed25519_pub_key: BinDataSized<32>(oVB-1RAn5jaa7age6crq_0gkZ1bg0aNS_II7Aj2ycr4),
            x25519_pub_key: BinDataSized<32>(w1vx2mOy3RXxBok6VktHnWT-4L-rfn2-2n9qWqcl8h8),
            exportable: false,
        },
    },
    DeepLockedSeed {
        tag: "test-tag-deep",
        seed_info: SeedInfo {
            ed25519_pub_key: BinDataSized<32>(heKKE7_gIqWnthzSYR1DwCKvjGWQ8-rc9jtJcUl2iPI),
            x25519_pub_key: BinDataSized<32>(Y-2JERtk89mS2E3pMP4e6TowiKUm4ZIy8Cgo8NaC9Hg),
            exportable: false,
        },
    },
]
thread 'in_proc_sql' panicked at 'called `Result::unwrap()` on an `Err` value: {"error":"InternalSodium"}', crates/lair_keystore/tests/in_proc_sql.rs:124:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    in_proc_sql

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 4.28s

error: test failed, to rerun pass `-p lair_keystore --test in_proc_sql`

Is InProcKeystore supposed to work? Am I doing something wrong? This would be a big improvement on the UX experience of mobile apps.

ThetaSinner commented 10 months ago

Yes it's supposed to work, it's the default in Holochain if you don't configure an external Lair to connect to.

@neonphog any ideas about these errors?

guillemcordoba commented 10 months ago

It actually doesn't @ThetaSinner . The default is to run an lair-keystore server which uses StandaloneServer which uses the IpcKeystoreServer, which is not the same as the InProcKeystore. Running the server is what I've found to be really slow (5 seconds) on mobile, and I was hoping that doing away with that would makes things much faster.

neonphog commented 10 months ago

@guillemcordoba - The main difference between the ipc keystore and the in proc keystore is whether they use an ipc channel. The "Standalone" is a bit misleading, as it actually will also be run in process.

I suspect the startup time has more to do with the argon password hashing algorithm, as it is designed specifically to use a lot of memory and CPU, and is using the "Moderate" settings which largely target desktops instead of mobile devices.

Can you run a test where you set the mem limit and cpu/ops limit to mem min and ops min respectively?

For testing, you could potentially set them here. If that makes a difference, we can figure out the best way forward for actually configuring that...

neonphog commented 10 months ago

Not sure it's more readable, but looks like LairServerConfigInner::new respects PwHashLimits::with_exec if you want to set the limits that way.

guillemcordoba commented 10 months ago

Thanks @neonphog that helps. I'll do some testing and see if that reduces the start time.

abe-njama commented 10 months ago

@guillemcordoba were you able to successfully test withe the configs that @neonphog shared? What's the experience like?

guillemcordoba commented 10 months ago

I just did, I tried to do it with this code:


pub async fn write_config(
    config_path: &std::path::Path,
    passphrase: BufRead,
) -> LairResult<LairServerConfig> {
    let lair_root = config_path
        .parent()
        .ok_or_else(|| one_err::OneErr::from("InvalidLairConfigDir"))?;

    tokio::fs::DirBuilder::new()
        .recursive(true)
        .create(&lair_root)
        .await?;

    let mut config = LairServerConfigInner::new(lair_root, passphrase).await?;

    config.runtime_secrets_mem_limit = sodoken::hash::argon2id::MEMLIMIT_MIN;
    config.runtime_secrets_ops_limit = sodoken::hash::argon2id::OPSLIMIT_MIN;

    let mut config_f = tokio::fs::OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(config_path)
        .await?;

    config_f.write_all(config.to_string().as_bytes()).await?;
    config_f.shutdown().await?;
    drop(config_f);

    Ok(Arc::new(config))
}

Using this code, lair doesn't even start. As a matter of fact, I lose all logs and I don't know at which point it gets stuck, which is really weird. But I'm certain that the whole holochain start up process doesn't complete at all. I can't offer much more on my end, I think this needs thorough investigation on how we can make both lair-keystore and holochain run as fast as possible in mobile.

abe-njama commented 9 months ago

@neonphog from Guillem's update, it looks like we might want to prioritize debugging lair startup on mobile or optimizing the process. We can discuss this internally in light of the other competing priorities.

neonphog commented 9 months ago

@guillemcordoba - can you give this a try: https://github.com/holochain/holochain/pull/3391 Thanks!

guillemcordoba commented 9 months ago

Will do! Thanks @neonphog

guillemcordoba commented 8 months ago

Hi @neonphog , sorry for taking long with this.

I just tried it, and it hasn't made any difference as far as I could tell. I timed the startup times with the setting set to Interactive and the setting set to None, and both startup times took around 7s with no noticeable difference in them.

Any ideas on things that we could try next?

guillemcordoba commented 8 months ago

Also ping @abe-njama about this.

neonphog commented 1 month ago

@guillemcordoba - I've split out the timing into a separate issue here: https://github.com/holochain/lair/issues/139 -- This issue will be closed with a test that shows InProcKeystore is able to load after restart.