LlamaEdge / rag-api-server

A RAG API server written in Rust following OpenAI specs
https://llamaedge.com/docs/user-guide/server-side-rag/quick-start
Apache License 2.0
31 stars 8 forks source link

Feature: Search capability #22

Open suryyyansh opened 2 months ago

suryyyansh commented 2 months ago

Pertaining WasmEdge #3504

This PR aims to add search capabilities to the RAG API Server. This functionality will be enabled through the search feature. It will also leverage the llamaedge-query-server to deliver nuanced search queries and provide the RAG API Server something to fall back on in case the RAG fails due to no RAG query matches.

juntao commented 2 months ago

Hello, I am a code review agent on flows.network. Here are my reviews of changed source code files in this PR.


Cargo.toml

Potential issues

  1. Issue: Version locking for all dependencies is not recommended as it can lead to incompatibilities and missed out on security updates. Consider using a version range or the latest compatible version for most dependencies, except for those you have explicitly patched (e.g., "socket2", "reqwest", "hyper", "tokio").

  2. Issue: The "llama-core" dependency is pinned to a specific version ("=0.17.0") and has the "logging" feature enabled, which might not be necessary or desirable in all deployment scenarios. Consider using a newer compatible version of "llama-core" if available, and make the usage of the "logging" feature optional or configurable based on the deployment environment.

Summary of changes

  1. Addition of the "search" feature: The main change is the addition of a new optional feature named "search". This may potentially allow for improved search functionality, but without more context or information about its purpose and implementation, it's hard to provide a detailed explanation.

  2. Updated Git branch reference for 'tokio': The git URL and branch used for the tokio dependency has been updated from an unspecified version to the "v1.3" branch of the "second-state/wasi_tokio" repository.

  3. No other major changes were detected in the patch, such as removal or modification of existing features.

src/backend/ggml.rs

Potential issues

Based on the given Rust code, I have identified three major coding issues that need to be addressed:

  1. Error Handling in models_handler function: The error handling for serializing the model list result is not wrapped within a match statement, which could lead to unhandled panics if the serialization fails.

  2. Error Handling in embeddings_handler and rag_query_handler functions: These two functions have similar issues where error handling for creating HTTP response objects is not properly handled inside a match statement. This can cause unhandled panics if the response creation fails.

  3. Error Handling in files_handler function: The error handling for reading buffer from request body using to_bytes function and converting it into Multipart object is not wrapped within a match statement, which could lead to unhandled panics if any of these operations fails.

Summary of changes

Key Changes:

  1. Added Conditional Compilation (#cfg) for "search" feature to selectively include/exclude sections of code based on the feature flag during compilation. This is likely a new functionality that can perform web searches when enabled.

  2. Introduced web_search_allowed variable under the "search" feature which will be set to true when no points are retrieved, thus enabling web search.

  3. Added logic to send a request to an external LlamaEdge query server for web searches when web_search_allowed is true and the "search" feature flag is enabled. This includes parsing query, building requests, making HTTP requests, handling responses, error checking, and injecting search summary into conversation context if the decision from query server is successful.

src/main.rs

Potential issues

Based on a review of the provided source code for an LlamaEdge-RAG API Server, I've identified three major coding issues:

  1. Error Handling and Logging Issue (Line 286): The error message being logged when parsing the socket address may not be clear to developers or users. It would be beneficial to provide a more descriptive error message that suggests potential solutions, such as "Invalid socket address format. Please use the IP:PORT syntax."

  2. Validation Missing for URL (Line 154): There is no validation check in place for the Qdrant REST API URL provided by the user through the command line arguments. Adding a URL validity check could prevent unnecessary errors and improve the overall reliability of the server.

  3. Server Configuration Issue (Line 208-214): The application does not validate whether the prompt template for chat models supports system messages while the RAG policy is set to "system_message." This can lead to runtime errors, especially if developers or users use chat models that do not support system messages in their templates. Adding a validation check here would ensure consistency and prevent unexpected behavior at runtime.

Summary of changes

Key Changes:

  1. Added a new module utils and imported it. This suggests that the project might include additional utility functions or types.

  2. Introduced a new feature flag "search". Under this flag, added imports for SearchArguments from the utils module and a static variable SEARCH_ARGUMENTS. These changes likely indicate the addition of search functionality to the project.

  3. Modified the CLI arguments by adding two new options: api-key, used to supply an API key for the endpoint, and query-server-url, which is a required URL for the LlamaEdge query server. Also added a new option search-backend with a default value "tavily" that requires the query-server-url. These changes indicate the extension of functionality to include internet searches.

src/utils.rs

Potential issues

  1. In the is_valid_url function, there's no error handling for when parsing a URL fails. It would be beneficial to have some way of indicating that an invalid URL was provided, such as by returning a Result instead of a boolean.

  2. The conversion from LogLevel::Critical to log::LevelFilter::Error in the implementation of From<LogLevel> for log::LevelFilter may not be intentional. It's worth reviewing this mapping to ensure it meets the project's requirements correctly.

  3. The SearchArguments struct uses a string (search_backend) to represent an enumeration of possible search backends. Using an enumeration would make the code safer and more maintainable, as it would prevent invalid values from being used at compile-time.

Summary of changes

  1. Addition of a new struct, "SearchArguments", which is used to handle additional parameters for internet search in the context of a feature called "search".
  2. Introduction of three new fields within "SearchArguments": 'api_key', 'query_server_url', and 'search_backend'. This enables support for different search backends and query servers.
  3. The 'gen_chat_id' function is unaffected by this change as it remains unchanged, aside from potential usage of the new "SearchArguments" struct if it's enabled with the feature flag.
juntao commented 2 months ago

@suryyyansh Please fix the CI failure. Thank you!

alabulei1 commented 2 months ago

Hi @suryyyansh

Could you please write documentation on how to use the search API server? Thanks. https://github.com/LlamaEdge/docs

Is there a GitHub repo called search-api-server? If so, could you please send us your repo link and add README.md about the project? Thanks.

apepkuss commented 2 months ago

@suryyyansh Could you please fix the conflicts? Thanks a lot.

apepkuss commented 1 month ago

@suryyyansh Could you please rebase this PR onto the latest code. Thanks a lot!