https://github.com/ellie/atuin/pull/933

juntao commented 1 year ago

Hello, I am a code review bot on flows.network. Here are my reviews of changed source code files in this PR.

Cargo.lock

Since the provided source code is a Cargo.lock file, it primarily lists dependency versions and their source locations. It does not contain functional source code to review for potential problems.

However, one can suggest the following considerations while working with this Cargo.lock file:

Dependencies should be up-to-date. You may want to use cargo outdated command to check for outdated dependencies.
Confirm that the sources, such as registry URLs, are correct and can be accessed.
Make sure that the correct versions of dependencies are being used. If you encounter any issues with a specific version, consider downgrading or upgrading the version in Cargo.toml.
Examining dependencies' checksums to ensure that they have not been tampered with can be helpful.

Note that these points are not issues found in the code; they are general suggestions to consider while working with dependency management in Rust projects.

The patch introduces the following key changes:

A new dependency, "typed-builder", is added to the list of dependencies.
The version of the "typed-builder" package is set to "0.14.0".
The source for the "typed-builder" package is specified as the official Rust crates.io registry.
The checksum for the "typed-builder" package is provided to ensure the integrity of the package source.
The new package "typed-builder" has three dependencies of its own: "proc-macro2", "quote", and "syn 1.0.99".

Overall, the Cargo.toml file looks well-structured and includes appropriate dependencies and features for the project. However, there are a few suggestions to improve the file organization and maintainability:

Group the dependencies alphabetically: This makes it easier for other developers to locate specific dependencies quickly.
Use version specifiers consistently: You can decide between using workspace versions or specific version numbers for your dependencies but try to apply chosen method consistently. Mixing both types could lead to potential issues with versioning later on.
Make sure all dependencies have version numbers: Specifying version numbers for all dependencies can help avoid potential conflicts and breaking changes when dependencies are updated.
Keep optional dependencies at the bottom: This makes it easier to distinguish between required and optional dependencies.
Consider adding comments for dependencies: Adding comments to briefly explain the purpose of each dependency can be helpful for maintaining the project and also for other developers who work with the code.

Here's an updated version of the Cargo.toml file, which addresses these suggestions:

[package]
name = "atuin-client"
edition = "2018"
description = "client library for atuin"

version = { workspace = true }
authors = { workspace = true }
license = { workspace = true }
homepage = { workspace = true }
repository = { workspace = true }

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[features]
default = ["sync"]
sync = [
  "urlencoding",
  "reqwest",
  "sha2",
  "hex",
  "rmp-serde",
  "base64",
  "generic-array",
  "xsalsa20poly1305",
]

[dependencies]
async-trait = { workspace = true }
atuin-common = { path = "../atuin-common", version = "14.0.1" }
chrono = { workspace = true }
clap = { workspace = true }
config = { workspace = true }
directories = { workspace = true }
eyre = { workspace = true }
fs-err = { workspace = true }
interim = { workspace = true }
itertools = { workspace = true }
lazy_static = "1"
log = { workspace = true }
memchr = "2.5"
minspan = "0.1.1"
parse_duration = "2.1.1"
regex = "1.5.4"
serde = { workspace = true }
serde_json = { workspace = true }
serde_regex = "1.1.0"
shellexpand = "2"
sql-builder = "3"
sqlx = { workspace = true, features = ["sqlite"] }
typed-builder = "0.14.0"
uuid = { workspace = true }
whoami = { workspace = true }

# Optional dependencies
base64 = { workspace = true, optional = true }
generic-array = { version = "0.14", optional = true, features = ["serde"] }
hex = { version = "0.4", optional = true }
rmp-serde = { version = "1.1.1", optional = true }
reqwest = { workspace = true, optional = true }
sha2 = { version = "0.10", optional = true }
sodiumoxide = { workspace = true, optional = true }
urlencoding = { version = "2.1.0", optional = true }
xsalsa20poly1305 = { version = "0.9.0", optional = true }

[dev-dependencies]
tokio = { version = "1", features = ["full"] }

Please, apply these suggestions and update any version numbers that may have changed. It will help to maintain the project and will improve the readability of the Cargo.toml file.

A new dependency "typed-builder" with version "0.14.0" has been added to the Cargo.toml file. This library helps with creating complex structures in a type-safe manner using the builder pattern.

atuin-client/src/database.rs

Overall, the code is organized, readable, and follows Rust best practices. However, I have noticed a few points which could be improved:

In the current_context function, it's better to handle error messages and exit status with a proper Result type instead of using eprintln! and std::process::exit directly. You can change the function signature to return a Result<Context> and handle the error message properly in the caller. This would also improve the testability of this function.

pub fn current_context() -> Result<Context> {
    let session = env::var("ATUIN_SESSION").map_err(|_| {
        "ERROR: Failed to find $ATUIN_SESSION in the environment. Check that you have correctly set up your shell."
    })?;

    let hostname = format!("{}:{}", whoami::hostname(), whoami::username());
    let cwd = utils::get_current_dir();

    Ok(Context {
        session,
        hostname,
        cwd,
    })
}

The search function has many arguments and uses a mutable SqlBuilder for constructing the query. To make this more readable and maintainable, you can encapsulate these arguments and settings in a SearchParams struct with a builder pattern, like you mentioned in the comment.
It is not recommended to use wildcard imports like use super::*;. it's better to be explicit about what you are importing, which makes it easier for readers to understand where a particular item comes from.
The save_bulk function does not use any batching, which may lead to issues with performance when saving a large number of records. You can use the extend method provided by sqlx to properly use batch processing for inserts.
To improve the readability of the long queries, it is a good practice to use raw strings with triple quotes instead of normal strings.
In the test module, instead of using unwrap() on results, using ? operator is recommended since it returns Error type to the caller, and it's easier for you to understand the cause of the error if any problem occurs.

This patch includes the following key changes:

In the query_history function, instead of directly initializing a History instance, a History::from_db() method is used with a builder pattern, which allows setting individual properties in a more readable and maintainable way.
Similarly, in the new_history_item function, a History::capture() method with a builder pattern is being used to create a History instance, which makes it easier to understand and manage the properties being set.

These changes make the code more modular, maintainable, and easier to understand.

atuin-client/src/encryption.rs

Overall, the code seems well-structured and seems to serve its purpose of encrypting and decrypting histories. However, there are a few points that should be noted:

In the decrypt function, the error message says "could not encrypt" when it should say "could not decrypt".
```
.map_err(|_| eyre!("could not encrypt"))?;
```
should be changed to
```
.map_err(|_| eyre!("could not decrypt"))?;
```
Error handling can be improved by using custom error types. Instead of just returning a string error message like "could not encrypt", you can create a custom error variant for the different possible errors in this module.
When creating a new key or loading an encoded key, you are writing the key to a file. It's crucial to ensure that the file permissions are restricted only to the owner to secure the keys, especially on a shared machine. You can use the set_permissions method available in the std::fs module to make sure the key file is only accessible by the owner.
It's not mandatory, but you can consider using an external crate for handling paths (e.g., dunce or path_abs). The use of PathBuf::from(path).exists() in the load_key and load_encoded_key functions can be replaced by path.exists() with the mentioned crates, leading to cleaner code.
When using the OsRng random number generator, it's better to import it from the rand_core crate directly instead of from xsalsa20poly1305::aead::OsRng. This will make the code more maintainable if you ever decide to use a different encryption algorithm in the future.

Other than these points, the code seems fine and should work as intended. The test case provided adequately verifies the functionality.

The patch includes the following key changes in the code:

Removed the import of HistoryWithoutDelete from use crate::{history::History, settings::Settings};. HistoryWithoutDelete is no longer needed in the code.
Modified the decrypt function by removing the separate handling of History and HistoryWithoutDelete. This change simplifies the function:
- Removed the else block that handled HistoryWithoutDelete.
- Kept and updated the line let history = rmp_serde::from_slice(&plaintext)?;.
Updated the test case in the mod test section:
- Replaced the History::new() constructor with the builder pattern, which is more readable and easier to modify in the future.
- Changed .id("1".into()) and so on, for all the other fields, to make the code cleaner and more descriptive.

atuin-client/src/history.rs

The code seems well-written overall with proper documentation, error handling, and builders. However, I have some suggestions that may improve the code.

Consider using Option instead of the HistorySeal type.

The use of HistorySeal to prevent direct construction of History structs may be an unnecessary complication. You can achieve the same result by changing sensitive fields to use the Option<T> type with pub(crate) access level, preventing external code from setting these fields.

Unit tests may not cover all the edge cases.

The tests provided seem to focus on deserialization, which is important, but it would be good to have tests for other functionality as well, such as the construction of history entries via builders.

Consider implementing Default on the History struct.

Instead of having a separate new function, you can implement the Default trait for the History struct. This way, you can use History::default() and set other fields using the builders.

Note that you should still be cautious when implementing the Default trait, especially when dealing with types that should only be constructed using specific builder patterns.

Inspect hostname username creation.

The new function derives the hostname using whoami::hostname() and whoami::username() to create a string in the format hostname:username. Ensure that this format is correct for your use case and compatible with other parts of your application.

Overall, the code appears to be of high quality, and these suggestions should help you make further improvements.

This patch introduces several key changes to the code:

A HistorySeal marker type has been added, preventing the direct construction of the History struct. This ensures that only builders from the crate::history module, such as History::import(), History::from_db(), and History::capture(), can construct the History struct. The _seal field has been added to the History struct to make use of this.
A new builder module has been added containing builders for creating history entries in different situations, such as importing from shell history, capturing via a hook, or creating from a database entry.
Three new public builder constructors have been introduced: History::import() for importing an entry from the shell history file, History::capture() for capturing an entry via hook, and History::from_db() for creating an instance from the database entry.
The documentation has been expanded to provide better explanations and examples for using the new builders.
Unit tests have been added to test the new functionality, specifically focusing on deserialization and compatibility between old and new client versions.

Overall, these changes make the code more robust and maintainable by enforcing the use of builder patterns and providing clearer documentation for creating History structs.

atuin-client/src/history/builder.rs

Overall, the code seems to be well structured, using typed_builder to build different implementations of a history entry. Here are a few suggestions to improve the code:

Consistency in default values: In the HistoryImported struct, default values for both exit and duration are set to -1. Consider replacing these magic numbers with a named constant to improve code readability and maintainability:

const UNKNOWN_VALUE: i64 = -1;

#[builder(default = UNKNOWN_VALUE)]
exit: i64,
#[builder(default = UNKNOWN_VALUE)]
duration: i64,

Optional fields: In the HistoryImported struct, since the exit, duration, session, and hostname fields have default values, consider making them Option<T> and set the default value to None. This makes the code more expressive and aligns better with the concept of an optional field.

#[builder(default)]
exit: Option<i64>,
#[builder(default)]
duration: Option<i64>,
#[builder(default, setter(strip_option, into))]
session: Option<String>,
#[builder(default, setter(strip_option, into))]
hostname: Option<String>,

Error handling: Consider adding error handling for any possible issues during conversion using try_from instead of from. This would require changing the From trait implementation to TryFrom and returning a Result type.

use std::convert::TryFrom;

impl TryFrom<HistoryImported> for History {
    type Error = Error; // Define the Error type here

    fn try_from(imported: HistoryImported) -> Result<Self, Self::Error> {
        // ...
    }
}

These suggestions should help improve the code's readability, expressiveness, and error handling.

This patch defines three structs with TypedBuilder derivations, namely HistoryImported, HistoryCaptured, and HistoryFromDb. These structs are used to build history entries representing different sources:

HistoryImported: A history entry imported from shell history. It has default values for cwd, exit, duration, session, and hostname.
HistoryCaptured: A history entry captured via hook. Fields specific to command completion such as exit and duration are not present. Default-initialized fields are omitted in this struct.
HistoryFromDb: A history entry loaded from the database. All fields are required as they are all present in the database.

In addition, the patch contains From trait implementations for the conversion from each of the structs into the History struct.

atuin-client/src/import/bash.rs

The code itself looks quite clean and well-structured. I have identified a few points for improvement or discussion:

Error handling in LineType and try_parse_line_as_timestamp

The LineType::NotUtf8 variant signifies an invalid UTF-8 sequence in input data. Currently, any non-UTF-8 lines are silently skipped in the load method. However, it might be better to either provide a warning or an error message when non-UTF-8 input is encountered so that the user is aware of it.

Duration in nanoseconds

The usage of nanoseconds as the time unit for timestamps might be too fine-grained. It's generally safe to use milliseconds as the smallest unit for practical applications. It could improve performance and simplify the codebase if you use milliseconds instead.

Usage of impl in generic bounds

In the method signature for load, you have used impl Loader. This syntax is more idiomatic when used as return types for functions, and it might be better to use a generic type parameter such as load<L: Loader>(self, h: &mut L) -> Result<()>.

Test naming convention

In the test module, test function names use the snake_case naming convention. It would be more idiomatic to use a more descriptive name for the tests, using underscores to separate distinct parts. For example, parse_no_timestamps could be renamed to parse_history_without_timestamps.

The comment about LineType::Empty is not clear.

The comment says, "// Can happen when using the "here document" syntax.". The "here document" syntax refers to a syntactical construction in the shell where an arbitrary delimiter (usually EOF) can be used to tell where the input (or the contents of the here document) ends. Let's say there are multiple lines of commands and there might be empty lines inside; the comment saying it happens when using a "here document" syntax is not clear.

You may consider rephrasing this comment to explain more clearly why empty lines may be encountered in the .bash_history file.

Overall, the codebase is well written, and the identified improvements are generally about idiomatic practices and clearer error handling.

The patch simplifies the construction and insertion of a new history entry within the LineType::Command(c) match arm. Key changes are:

Instead of using History::new() to create a new entry and then pushing it to the history, the patch utilizes a builder, History::import().
The builder helps in chaining the construction of the History object with .timestamp(next_timestamp) and .command(c) method calls. This removes the need for multiple parameters in the History::new() call.
Finally, imported.build().into() is directly used in the h.push() call, removing the necessity for an intermediate entry variable. This makes the code cleaner and easier to follow.

atuin-client/src/import/fish.rs

Overall, the Fish-Importer looks good. However, there are a few suggestions I have to improve the code:

Move the contents of default_histpath() to a separate function to check XDG_DATA_HOME's presence, and update default_histpath() to use it. This separation can make the code more modular and easier to read:

fn get_xdg_data_home() -> Result<PathBuf> {
    let base = BaseDirs::new().ok_or_else(|| eyre!("could not determine data directory"))?;
    Ok(std::env::var("XDG_DATA_HOME").map_or_else(
        |_| base.home_dir().join(".local").join("share"),
        PathBuf::from,
    ))
}

fn default_histpath() -> Result<PathBuf> {
    let data = get_xdg_data_home()?;

    // ...
}

In load(), consider using if-let instead of a match for std::str::from_utf8(b) to make the code more readable:

for b in unix_byte_lines(&self.bytes) {
    if let Ok(s) = std::str::from_utf8(b) {
        // ...
    }
}

In load(), when processing command entries, consider using a separate function to process the escape characters, so that the code is more readable:

fn process_escape_chars(cmd: &str) -> String {
    let c = cmd.replace(r"\\", r"\");
    let c = c.replace(r"\n", "\n");
    // TODO: any other escape characters?
    c
}

// ...

if let Some(c) = s.strip_prefix("- cmd: ") {
    // ...

    let c = process_escape_chars(c);

    cmd = Some(c);
}

In the test module, you can use include_bytes!() macro to move the test's content to a separate file. This can help to declutter and organize test files:

const TEST_DATA: &[u8] = include_bytes!("test_data/parse_complex_fish_history.input");

#[tokio::test]
async fn parse_complex() {
    let bytes = TEST_DATA.to_vec();

    // ...
}

The patch includes key changes in the load() method of the Importer implementation for Fish when pushing a new history entry to the loader.

Instead of creating a History struct directly and passing its parameters, a new entry is created using the builder pattern by calling History::import().timestamp(time).command(cmd) which returns a builder object.
The builder is then used with entry.build().into() to create the final History struct and it is pushed to the loader with loader.push(entry.build().into()).await?;.

These changes make the code cleaner and easier to read while maintaining the same functionality.

atuin-client/src/import/nu.rs

Below are the potential issues and suggestions for improvement in the given code:

Error Handling: In the new() and load() methods of the NorcImporter trait implementation for Nu, we could improve error handling by using more descriptive error messages:

// Inside `new()`
f.read_to_end(&mut bytes).map_err(|e| eyre!("Failed to read history file: {}", e))?;

// Inside `load()`
Err(_) => {
    eprintln!("Skipping invalid UTF-8 entry in history"); // Log the reason for skipping a history entry
    continue;
}

File I/O: To improve performance, consider using async read operations (async_std::fs::File or tokio::fs::File) for file reads within an async function like the new() method:

use async_std::fs::File;
use async_std::io::Read;

// Inside `new()`
let mut f = File::open(path).await?;
f.read_to_end(&mut bytes).await?;

Code Documentation: Add comments explaining what each function does and explaining the purpose of each struct. This can help other developers understand the code better.

Other than the mentioned areas, the code seems well-structured and follows common conventions.

The key changes in the patch are:

The old code used the History::new() function with default parameters for some fields like source and duration. This was not the best way to create a History entry and was messy.
The patch introduces the use of the History::import() function which creates a History entry with a more pleasant, builder-style syntax. The timestamp() function is called to set the timestamp and the command() function is called to set the command for the entry.
Instead of directly pushing the History entry to the Loader, the patch builds the entry using the build() function, and then converts it to the desired format using the into() function. Finally, the entry is pushed to the loader using await.

atuin-client/src/import/nu_histdb.rs

Overall, the code is well-structured and easy to understand. However, there are a few potential issues and improvements that could be done in the code:

Error handling in From<HistDbEntry> for History: When converting from HistDbEntry to History, unwrap() is used to handle the conversion of Vec<u8> to String. This may result in a panic if the conversion fails. To avoid this, consider using unwrap_or or unwrap_or_else with a fallback value or propagate the error using the proper error handling method in your application.
Hardcoded strings for DB connection and table names: The Sqlite connection string and query string include hardcoded values such as .sqlite3 extension, table, and column names. It would be better to define these as constants or use a configuration file, so it is easier to maintain and change in the future.
Commented out code: The comment // Not sure how this is used indicates that the purpose of the constant NAME is unclear. It would be helpful to either provide more context or remove it if it's not being used.
Clarity for histpath method in NuHistDb: The method histpath() returns a Result<PathBuf> and also checks for the existence of the file. However, it could be better if we separate the responsibilities to improve code readability. Create one method to get the path and another method to check for file existence.
Use of PathBuf in the method argument of hist_from_db might not be necessary: It may be sufficient to use &Path instead of PathBuf, as we are not modifying the path inside the function.
The comment at the beginning of the code mentions 'import old shell history', but it's not clear what kind of shell history is being imported. Providing more context or documentation will help other developers understand the purpose of the code better.
In the load method, consider handling the Result returned by h.push(i.into()).await?; instead of using the question mark operator. This would allow you to perform any necessary cleanup or return a more specific error, if needed.

With these improvements, the code would be easier to maintain, extend and understand for future developers. Overall, the code is well-written and follows good software development practices.

The key changes in this patch are:

Modification of the from method in the impl From<HistDbEntry> for History block: Previously, the method used the History::new() function to create a new History object. The patch changes this to use the History::import() method followed by a chain of additional methods to set the object's properties.
Introduced the imported.build().into() call at the end of the chain, which builds the History object and converts it into the appropriate type.

These changes affect the way a History object is created from a HistDbEntry object and provide a more structured manner to set the properties of the History object during conversion.

atuin-client/src/import/resh.rs

Overall, the code looks good and well-structured. However, there are a few potential issues and improvements that can be made.

Error handling: In a few places, the code simply skips over invalid data (invalid UTF-8 or JSON) using continue. This might be intended, but it would be helpful to add logging or reporting of the skipped data to help diagnose issues in the future. You can use a logger like log or tracing to report these errors without necessarily stopping the whole process:

Err(e) => {
    log::warn!("Invalid UTF-8 encountered: {:?}", e); // Or use tracing::warn!()
    continue;
}

In the load function, clippy's cast_possible_truncation and cast_sign_loss lints are being ignored with #[allow(clippy::cast_possible_truncation)] and #[allow(clippy::cast_sign_loss)]. It's better to handle these cases in a safe way, for example, by saturating or checking the casts:

let secs = entry.realtime_before.floor() as i64;
let nanosecs = (entry.realtime_before.fract() * 1_000_000_000_f64).round() as u32;

You can use .saturating_* methods and make sure not to lose any data during casting. For example:

let secs = entry.realtime_before.floor().min(i64::MAX as f64) as i64;
let nanosecs = (entry.realtime_before.fract() * 1_000_000_000_f64).round().min(u32::MAX as f64) as u32;

In the comment // CHECK: should we add uuid here? It's not set in the other importers, it depends on the use case and whether it's important for the Resh importer to have a UUID for each entry. If UUID is required, you can add the uuid field using .uuid().
For better maintainability, you could break up the load method into smaller functions. Each smaller function would handle individual parts of the process, such as parsing the timestamp and calculating the duration.
Consider using more descriptive variable names for variables like b and s. It may not be clear what these variables represent, especially for developers who are new to the codebase.
In the default_histpath function, the eyre! macro is used to handle the error case. Consistently using a single approach like eyre! helps maintain readability throughout the codebase. However, consider logging the error before returning it from the function. This will provide better visibility into what went wrong when an error occurs:

let user_dirs = UserDirs::new().ok_or_else(|| {
    let err = eyre!("could not find user directories");
    log::error!("{:?}", err);
    err
})?;

With these suggestions, the code should be more robust, maintainable, and provide better diagnostics when handling errors.

The key change in this patch is the replacement of the manual construction of a History struct with the History::import() builder pattern. This new implementation allows for improved readability and maintainability.

The previous implementation was directly creating a History struct and pushing it into the loader:

h.push(History {
    id: ...,
    timestamp: ...,
    // Other fields
})
.await?;

The new implementation uses the History::import() builder pattern to fill in the fields while retaining the logical structure:

let imported = History::import()
    .command(entry.cmd_line)
    .timestamp(timestamp)
    .duration(duration)
    .exit(entry.exit_code)
    .cwd(entry.pwd)
    .hostname(entry.host)
    .session(uuid_v7().as_simple().to_string());
h.push(imported.build().into()).await?;

This change makes the code more readable and easier to modify, as each field is now added using a separate method call rather than within a single constructor.

atuin-client/src/import/zsh.rs

The code looks mostly clean and well-structured, but there are a few points worth noting regarding best practices, error handling, and documentation:

It appears that the code is working with Zsh shell history files, but this is not explicitly documented. Adding a doc-comment at the beginning of the module would be helpful for future users or developers.
In the default_histpath() function, consider using fs::read_dir(home_dir) with filter_map to search for the history files instead of the loop. This would allow you to remove the loop and make the code more concise.
The error created in the default_histpath() function says "Try setting and exporting $HISTFILE". It would be useful to elaborate a bit on how to set this environment variable to guide the user further.
Error messages should be more descriptive for better understanding: a. In the async new() function, add context about the error when reading bytes from the file with f.read_to_end(&mut bytes).context("Failed to read history file")? b. In the parse_extended() function, instead of using unwraps, use .ok_or_else() with a meaningful context message.
parse_extended() could benefit from documentation, explaining what it does and the meaning of its parameters.
The test_parse_extended_simple() and test_parse_file() tests seem to be well-structured and sufficient, though adding more tests that cover edge cases can help improve the module's reliability.
Finally, remove any unnecessary comments and ensure that the remaining ones explain what the code does or why it is necessary.

Overall, the code appears to be logically sound and functional but could be refined with proper error handling and documentation for better maintainability and usability.

The provided patch makes the following key changes to the code:

In the Zsh struct implementation, the push() method for the Impl Loader has been updated. Instead of using History::new(), the patch now uses History::import() along with the .timestamp() and .command() methods to create an imported object.
In the parse_extended() function, the creation of a History object has also been replaced with the usage of the History::import() method. It now sets the timestamp, command, and duration using the .timestamp(), .command(), and .duration() methods respectively. This improves the readability and maintainability of the code.

Overall, the changes in the patch focus on updating and simplifying the creation of History objects in the mentioned functions.

atuin-client/src/import/zsh_histdb.rs

There aren't any major issues in the provided code, but a few small improvements can be made:

In the ZshHistDb::histpath_candidate() function, the unwrap() should be replaced with proper error handling:

pub fn histpath_candidate() -> Result<PathBuf> {
    if let Some(user_dirs) = UserDirs::new() {
        let home_dir = user_dirs.home_dir();
        Ok(std::env::var("HISTDB_FILE")
            .as_ref()
            .map(|x| Path::new(x).to_path_buf())
            .unwrap_or_else(|_err| home_dir.join(".histdb/zsh-history.db")))
    } else {
        Err(eyre!("Failed to find user directories"))
    }
}

In hist_from_db() function, use expect() instead of unwrap() to provide a meaningful error message:

async fn hist_from_db(dbpath: PathBuf) -> Result<Vec<HistDbEntry>> {
    let pool = SqlitePool::connect(dbpath.to_str().expect("Invalid dbpath")).await?;
    hist_from_db_conn(pool).await
}

In the conversion from HistDbEntry to History, instead of using unwrap_or_else, consider using unwrap_or_default:

impl From<HistDbEntry> for History {
    fn from(histdb_item: HistDbEntry) -> Self {
        let imported = History::import()
            // ...
            .command(
                String::from_utf8(histdb_item.argv)
                    .unwrap_or_default()
                    .trim_end()
                    .to_string(),
            )
            // ...
    }
}

The comment line println!("{i:?}"); in the test test_import should be println!("{:?}", i);.
To avoid surprises, you can use std::env::var_os() instead of std::env::var() in histpath_candidate(). This gives you an Option<OsString> that you can convert to PathBuf directly, without needing to check for invalid Unicode:

pub fn histpath_candidate() -> Result<PathBuf> {
    if let Some(user_dirs) = UserDirs::new() {
        let home_dir = user_dirs.home_dir();
        Ok(std::env::var_os("HISTDB_FILE")
            .map(|x| Path::new(&x).to_path_buf())
            .unwrap_or_else(|| home_dir.join(".histdb/zsh-history.db")))
    } else {
        Err(eyre!("Failed to find user directories"))
    }
}

Other than these minor improvements, the code looks good.

The key changes in the patch involve modifying the From<HistDbEntry> implementation for the History struct:

Instead of using the History::new() function, the code now starts with the History::import() function to create a new HistoryBuilder instance.
The various fields of the History struct are now set using the builder pattern, by chaining methods like timestamp, command, cwd, duration, and hostname.
No assumption on exit status is made, as at this point in the API chain it defaults properly to 0.
After constructing the History object using the builder pattern, the build() method is called to finalize the HistoryBuilder into a History instance.
The History instance is then converted into the desired type using the into() method.

atuin-client/src/message.rs

Below is the reviewed code with comments provided for potential issues:

pub struct Message {
    pub id: Uuid, // No issues here, assuming the Uuid library has been imported.
    pub type: String, // Issue here - 'type' is a reserved keyword in Rust and cannot be used as a variable name.
}

To fix the issue, change the name of the type field to something else. For example, you could change it to msg_type:

pub struct Message {
    pub id: Uuid,
    pub msg_type: String, // Renamed from 'type' to 'msg_type'
}

Also, make sure to add the necessary import for the Uuid type at the beginning of the file:

use uuid::Uuid;

The patch removes the entire Message struct definition, which consisted of two fields: id of type Uuid, and type of type String. The key change in this patch is essentially the deletion of the Message struct.

atuin-server/src/handlers/history.rs

Here are a few potential issues that I noticed in the provided code:

Unwrapping an Option unsafely. In the list function, there is a line where you unwrap a result of type Result<Vec<...>, _>. It is generally better to handle the error case explicitly, rather than using unwrap(), because it could lead to a panic at runtime.

// Consider replacing the following two lines
let history: Vec<String> = history
    .unwrap()
    .iter()
    .map(|i| i.data.to_string())
    .collect();

// It's better to use something like this
let history = match history {
    Ok(hist) => hist.into_iter().map(|i| i.data.to_string()).collect(),
    Err(_) => {
        return Err(ErrorResponse::reply("failed to load history")
            .with_status(StatusCode::INTERNAL_SERVER_ERROR));
    }
};

Inconsistent error handling In the add function, you return an ErrorResponseStatus with a message and a StatusCode::INTERNAL_SERVER_ERROR. However, in the other functions, you use ErrorResponse::reply() with a with_status() method to accomplish the same thing. You may want to make the error handling consistent throughout the code by using the same method.
Error messages in the code are written in lowercase. It is a good practice to start error messages with uppercase letters. You may want to change "failed to query history count" to "Failed to query history count".
Function arguments are not consistently named. In the calendar function, the first argument is named focus, and in the body, you create another variable named focus. It would be better to give these two different names to avoid confusion, for example, you can use focus_name for the function argument and keep focus as the variable.
Safeguard against bad input In the calendar function, you are getting the user input for year and month from a HashMap. This method could lead to bad input. It is better to have some validation on the input parameters to make sure the values you get from the HashMap are correct or at least within a valid range.

Overall, the code looks quite clean and easy to read. These suggestions should help improve the error handling and consistency even further.

The key change in this patch is the removal of the explicit type annotation Vec<NewHistory> when defining the history variable. Instead, the type is now inferred using the wildcard type _. This change allows the Rust compiler to determine the appropriate type for the variable, making the code more concise and potentially easier to read.

atuin-server/src/models.rs

Overall, the given code is clear and easy to read. However, there are a few potential issues and suggestions for improvement:

It would be better to add comments or docstrings to the top of the structs to explain their purposes. This will help other developers understand the code more easily.
For the History and NewHistory structs, it may be worth considering changing the data field type to Vec<u8> instead of String, as this field is encrypted and might not necessarily have valid UTF-8 bytes. It may function correctly as a String, but upon decryption, the parsed String may not represent the correct encoded data.
The structures User and NewUser contain sensitive information, such as passwords. You should consider storing the password using a secure hash function like bcrypt or Argon2. This would also require changing the password field's type to better represent the stored hash.
The Session struct likely represents a user's session, which could contain sensitive information. Consider adding a timestamp for when the session was created and an expiration date to ensure that sessions are not used indefinitely. This would help in cases where a user's session token is leaked or compromised.

Here's the modified code considering the mentioned suggestions:

use chrono::prelude::*;

/// Represents a history record in the database
#[derive(sqlx::FromRow)]
pub struct History {
    pub id: i64,
    pub client_id: String,
    pub user_id: i64,
    pub hostname: String,
    pub timestamp: NaiveDateTime,
    /// Encrypted data about the command
    pub data: Vec<u8>,
    pub created_at: NaiveDateTime,
}

/// Represents a new history record to be inserted into the database
pub struct NewHistory {
    pub client_id: String,
    pub user_id: i64,
    pub hostname: String,
    pub timestamp: chrono::NaiveDateTime,
    /// Encrypted data about the command
    pub data: Vec<u8>,
}

/// Represents a user in the database
#[derive(sqlx::FromRow)]
pub struct User {
    pub id: i64,
    pub username: String,
    pub email: String,
    pub password_hash: String,
}

/// Represents a user's session in the database
#[derive(sqlx::FromRow)]
pub struct Session {
    pub id: i64,
    pub user_id: i64,
    pub token: String,
    pub created_at: NaiveDateTime,
    pub expires_at: NaiveDateTime,
}

/// Represents a new user to be inserted into the database
pub struct NewUser {
    pub username: String,
    pub email: String,
    pub password_hash: String,
}

/// Represents a new session to be inserted into the database
pub struct NewSession {
    pub user_id: i64,
    pub token: String,
    pub created_at: NaiveDateTime,
    pub expires_at: NaiveDateTime,
}

Please note that the suggested changes above, mainly related to user authentication and session management, must be handled accordingly in other parts of your application logic.

The key change in this patch is the addition of comments describing the data field in both the History and NewHistory structs. The comments explain that the data field stores encrypted data of a command and that currently it is an encrypted msgpack object, but this may change in the future. These comments provide context for developers when working with these structs and highlight the possibility of changing the format in which the data is stored.

atuin/src/command/client/history.rs

The source code appears to be correctly formatted and adheres to Rust's coding style guidelines. However, there are some suggestions to improve the code.

In the handle_list function, the match block creates many strings that are only used once. You can refactor this code to avoid unnecessary string allocations.

let query = match (session, cwd) {
    (None, None) => None,
    (None, Some(cwd)) => Some(format!("select * from history where cwd = '{.cwd}';", cwd=cwd)),
    (Some(session), None) => Some(format!("select * from history where session = '{.session}';", session=session)),
    (Some(session), Some(cwd)) => Some(format!("select * from history where cwd = '{.cwd}' and session = '{.session}';", cwd=cwd, session=session)),
};

let history = match query {
    None => db.list(settings.filter_mode, &context, None, false).await?,
    Some(query) => db.query_history(&query).await?,
};

In the print_list function, when handling the error case in the writeln! statement, consider replacing the match statement with more idiomatic if let statements.

if let Err(e) = writeln!(w, "{}", parsed_fmt.with_args(&FmtHistory(h))) {
    // ignore broken pipe (issue #626)
    if e.kind() == io::ErrorKind::BrokenPipe {
        return;
    }
    eprintln!("ERROR: History output failed with the following error: {err}");
    std::process::exit(1);
}

Do this for the w.flush() handler as well.

Make sure to escape braces correctly in the error message in the parse_fmt function:

println!("If your formatting string contains curly braces (e.g.: \"{{var}}\") you need to escape them this way: \"{{{{var}}}}\".");

Overall, the code appears well-structured and well-documented. These suggestions should help improve the code quality and readability slightly.

This patch primarily refactors the run function in the Cmd implementation by breaking it up into smaller, more focused functions. The key changes are:

It replaces the large match block within the run function with calls to new functions, each handling a specific command: handle_start, handle_end, and handle_list. This improves readability and maintainability by separating the code for different subcommands into distinct functions.
The handle_start function is responsible for handling the Cmd::Start variant. It checks and filters command history according to user settings, constructs the History object, and saves it to the database.
The handle_end function handles the Cmd::End variant. It updates the selected history entry with the correct exit status and duration, and runs synchronization if necessary depending on user settings.
The handle_list function is responsible for handling the Cmd::List variant. It constructs the history query, retrieves the history data from the database, and calls the print_list function to display the history according to the specified format and flags.

Minor changes include slight alterations in how function arguments and variables are used or passed, but overall the code structure and functionality remain the same.

cc https://github.com/ellie/atuin/pull/933

juntao commented 1 year ago

flows summarize

juntao commented 1 year ago

Hello, I am a code review bot on flows.network. Here are my reviews of code commits in this PR.

Overall Summary:

The recent changes in the provided GitHub pull request summaries involve modifications to the History struct, implementation of builders for constructing History objects, improvement in code organization for better readability, and updates to the documentation.

Potential Issues and Errors:

Backwards compatibility issue: Old clients might not be able to deserialize the new History objects due to the new env field.
The data format for encrypted data might change in the future, breaking compatibility with the currently encrypted msgpack objects.
The introduction of builders may increase the complexity of the codebase, making it potentially more challenging for new contributors to understand.
It may be beneficial to add tests assessing whether the builders are working correctly and ensure there is 100% code-coverage for this functionality.
The added abstraction could impact the performance of the code. It would be beneficial to profile the application and identify any performance bottlenecks introduced by these changes.
Thorough testing should be conducted to ensure the new functions in the Cmd::run function split perform as expected.
Since the newly added functions are part of the Cmd implementation, their visibility is restricted to the module they belong to. If required to be called by other modules, they may need to be marked as pub.

Main findings:

Removed the HistoryWithoutDelete struct and associated code, added env field to History struct, and deleted message.rs.
The patch is focused on using builders to create History objects and ensuring that all required fields are set for the given construction scenario at the compile-time.
This patch splits the Cmd::run function into three separate functions for better code organization and readability.
The documentation updates inform users about three ways to create a new history entry.

Details

Commit d96a0bf15c260ccacfaa5b7dadbd65f9f9963de2

Summary of Key Changes:

Removed the HistoryWithoutDelete struct and associated code.
Added env field to History struct.
Deleted message.rs.
Updated history.rs and added a test module for History struct.
Updated atuin-server/src/handlers/history.rs and atuin-server/src/models.rs for handling the encrypted data field.

Potential Problems:

Backwards compatibility issue: Old clients might not be able to deserialize the new History objects due to the new env field.
The data format for encrypted data might change in the future, breaking compatibility with the currently encrypted msgpack objects.

Commit 3d909fdfb2bffd93f0ba639a95c411959445a595

This GitHub patch is focused on using builders to create History objects and ensuring that all required fields are set for the given construction scenario at the compile-time. Key changes in the patch include:

Adding a dependency: The typed-builder dependency is added to Cargo.lock and atuin-client/Cargo.toml.
Introducing HistorySeal: A new marker type called HistorySeal is added to prevent the History struct from being constructed directly.
Creating a new module builder: This module contains builders for HistoryImported, HistoryCaptured, and HistoryFromDb.
Changing constructors: The History constructors are modified to use the new builders.
Updating code: Existing code in the atuin-client and atuin modules has been updated to use the new builders.

Potential problems or areas for improvement:

Code readability: The introduction of builders may increase the complexity of the codebase, making it potentially more challenging for new contributors to understand.
Test coverage: It may be beneficial to add tests assessing whether the builders are working correctly and ensure there is 100% code-coverage for this functionality.
Performance impact: The added abstraction could impact the performance of the code. It would be beneficial to profile the application and identify any performance bottlenecks introduced by these changes.

Commit 49a839145105d4406732a01845cf1c5f58ad0a8e

This patch splits the Cmd::run function into three separate functions for better code organization and readability. The three new functions are handle_start, handle_end, and handle_list, which correspond to the different subcommands of the program.

Key changes:

Addition of handle_start function for handling Start command.
- Takes the db, settings, and command as parameters and performs the original Start command operations.
Addition of handle_end function for handling End command.
- Takes the db, settings, id, and exit as parameters and performs the original End command operations.
Addition of handle_list function for handling List command.
- Takes the db, settings, context, session, cwd, mode, and format as parameters and performs the original List command operations.
Modification of Cmd::run function to call the new handle_* functions based on the command.

Overall, the patch seems well-organized and maintains the original functionality. However, there are some points to consider:

Thorough testing should be conducted to ensure the new functions perform as expected.
Since the newly added functions are part of the Cmd implementation, their visibility is restricted to the module they belong to. If required to be called by other modules, they may need to be marked as pub.

Commit 2db3f93dc36c12cc5b9e084a0d9861167ee3aaaa

Summary of Key Changes:

This patch updates the History documentation in atuin-client/src/history.rs.
The documentation now informs users about three ways to create a new history entry: a. History::import(): import an entry from the shell history file b. History::capture(): capture an entry via hook c. History::from_db(): create an instance from the database entry

Potential Problems:

I do not see any potential problems with this patch. It is a documentation update, and the added information is useful for developers who want to understand how to create a new history entry.

cc https://github.com/ellie/atuin/pull/933

flows-network / review-any-pr-with-chatgpt