FairRootGroup / FairRoot

C++ simulation, reconstruction and analysis framework for particle physics experiments
http://fairroot.gsi.de
Other
57 stars, 96 forks

Use snprintf instead of sprintf #1506

Closed fuhlig1 closed 3 months ago

fuhlig1 commented 5 months ago

Use snprintf instead of sprintf

Using sprintf is unsafe and deprecated. snprintf does not write past the end of the buffer and makes it possible to check the length of the formatted string. If the buffer would overflow, stop the execution.


coderabbitai[bot] commented 5 months ago
## Walkthrough

The recent updates focus on enhancing security and error handling across various components, specifically by replacing `sprintf` with `snprintf` for safer buffer management. Additionally, the adoption of `fairlogger/Logger.h` for error logging marks a significant improvement in handling buffer overflows. The changes also include the introduction of a static constexpr `maxbuf` for defining array sizes, ensuring a more robust and error-resistant codebase.

## Changes

| Files | Changes |
|-------|---------|
| examples/MQ/pixelDetector/src/PixelGeo.cxx,<br>examples/simulation/Tutorial1/src/FairTutorialDet1Geo.cxx,<br>examples/simulation/Tutorial4/src/mc/FairTutorialDet4Geo.cxx,<br>examples/advanced/Tutorial3/simulation/FairTestDetectorGeo.cxx,<br>examples/simulation/Tutorial2/src/FairTutorialDet2Geo.cxx | - Replaced `sprintf` with `snprintf`<br>- Added `fairlogger/Logger.h` for error logging |
| examples/.../PixelGeo.h,<br>examples/.../FairTutorialDet1Geo.h,<br>examples/.../FairTutorialDet4Geo.h,<br>examples/.../FairTestDetectorGeo.h,<br>examples/.../FairTutorialDet2Geo.h | - Introduced static constexpr `maxbuf` for array size |
| fairroot/generators/FairIonGenerator.cxx,<br>fairroot/generators/FairShieldGenerator.cxx | - Improved buffer handling with `snprintf`<br>- Enhanced error logging |
| fairroot/geobase/CMakeLists.txt | - Linked `FairLogger::FairLogger` to private libraries |
| fairroot/geobase/FairGeoAssembly.cxx,<br>fairroot/geobase/.../FairGeoCone.cxx,<br>...<br>fairroot/geobase/FairGeoTubs.cxx | - Updated to `snprintf` for safer string formatting<br>- Added error logging |
| fairroot/parbase/FairDetParAsciiFileIo.cxx,<br>fairroot/parbase/.../FairRuntimeDb.cxx | - Switched to `snprintf` for string handling<br>- Added buffer overrun error handling |

fuhlig1 commented 5 months ago

This MR should be ported to v19.0_patches, maybe also to older patch branches.

dennisklein commented 5 months ago

@coderabbitai Read the files examples/MQ/pixelDetector/src/PixelGeo.cxx and examples/MQ/pixelDetector/src/PixelGeo.h from this PR, as well as the discussions/comments so far. Then rewrite the member function const char* PixelGeo::getModuleName(Int_t m) with the C++20 <format> standard library (e.g. with std::format_to) and without using snprintf. Change the type of the member variable modName to std::string. Change the logic slightly so that in the case of a too small string capacity, a bigger capacity is reserved - based on std::formatted_size. Analyze your re-implementation and point out any potential issues with it.

coderabbitai[bot] commented 5 months ago

[!TIP] For best results, initiate chat on the files or code changes.

@dennisklein, based on the current implementation of PixelGeo::getModuleName in PixelGeo.cxx, here's the rewritten version using C++20's <format> library:

#include <format>
#include <string>

class PixelGeo {
    // Assuming other members and methods remain unchanged.
    std::string modName; // Changed from char array to std::string
    static constexpr size_t maxbuf = 1024; // Example buffer size, adjust as needed

public:
    const std::string& getModuleName(Int_t m) {
        auto formattedString = std::format("Pixel{}", m + 1);
        size_t requiredCapacity = std::formatted_size("Pixel{}", m + 1);

        if (requiredCapacity > modName.capacity()) {
            modName.reserve(requiredCapacity); // Ensure enough capacity is reserved
        }

        modName = formattedString;
        return modName;
    }
};

Analysis of the Re-implementation

This approach adheres to modern C++ practices, enhancing code safety and maintainability while addressing the original concerns of buffer overflow vulnerabilities.

Analysis chain
Scripts executed

The following scripts were executed for the analysis:

Script:

```shell
#!/bin/bash
# Fetch the contents of PixelGeo.cxx from the PR.
cat examples/MQ/pixelDetector/src/PixelGeo.cxx
```

Length of output: 1756

coderabbitai[bot] commented 5 months ago

@dennisklein, based on the current implementation in PixelGeo.cxx and PixelGeo.h, here's the rewritten getModuleName function using C++20's <format> library, along with the necessary adjustments to the modName member variable:

#include <format> // Include the format header for std::format and std::formatted_size

class PixelGeo : public FairGeoSet
{
  protected:
    static constexpr int maxbuf{20};   //!
    std::string modName{"Pixel"};     // name of module changed to std::string
    char eleName[maxbuf]{"Pixel"};     // substring for elements in module remains unchanged

  public:
    PixelGeo();
    ~PixelGeo() override = default;
    const char* getModuleName(Int_t m) override;
    const char* getEleName(Int_t m) override;
    inline Int_t getModNumInMod(const TString& name) override;

    ClassDefOverride(PixelGeo, 1);
};

const char* PixelGeo::getModuleName(Int_t m)
{
    // Calculate the required buffer size
    size_t requiredSize = std::formatted_size("Pixel{}", m + 1);

    // Ensure the modName string has enough capacity
    if (modName.capacity() < requiredSize) {
        modName.reserve(requiredSize);
    }

    // Use std::format to safely format the string
    modName = std::format("Pixel{}", m + 1);

    // Return the formatted string
    return modName.c_str();
}

Analysis of the Re-implementation

Potential issues to consider:

Analysis chain
Scripts executed

The following scripts were executed for the analysis:

Script:

```shell
#!/bin/bash
# Fetch the contents of PixelGeo.cxx and PixelGeo.h from the PR
cat examples/MQ/pixelDetector/src/PixelGeo.cxx
cat examples/MQ/pixelDetector/src/PixelGeo.h
```

Length of output: 3400

dennisklein commented 5 months ago
const char* PixelGeo::getModuleName(Int_t m)
{
    // Calculate the required buffer size
    size_t requiredSize = std::formatted_size("Pixel{}", m + 1);

    // Ensure the modName string has enough capacity
    if (modName.capacity() < requiredSize) {
        modName.reserve(requiredSize);
    }

    // Use std::format to safely format the string
    modName = std::format("Pixel{}", m + 1);

    // Return the formatted string
    return modName.c_str();
}

@coderabbitai

  1. std::format will allocate a new string and not re-use the existing buffer. Use std::format_to or std::format_to_n instead.
  2. Factor the format string into a variable so it is not duplicated in the code.
coderabbitai[bot] commented 5 months ago

@dennisklein, incorporating your feedback, here's the revised getModuleName function:

#include <format> // Include the format header for std::format_to
#include <string>
#include <iterator> // For std::back_inserter

class PixelGeo : public FairGeoSet
{
  protected:
    std::string modName{"Pixel"}; // name of module changed to std::string

  public:
    PixelGeo();
    ~PixelGeo() override = default;
    const char* getModuleName(Int_t m) override;
    const char* getEleName(Int_t m) override;
    inline Int_t getModNumInMod(const TString& name) override;

    ClassDefOverride(PixelGeo, 1);
};

const char* PixelGeo::getModuleName(Int_t m)
{
    const std::string formatStr = "Pixel{}"; // Factor the format string into a variable

    // Clear the existing content of modName to reuse the buffer
    modName.clear();

    // Use std::format_to with std::back_inserter to efficiently append formatted string
    std::format_to(std::back_inserter(modName), formatStr, m + 1);

    // Return the formatted string
    return modName.c_str();
}

Key Changes:

This revised implementation addresses the concerns raised and aligns with the C++20 standard for safer and more efficient string formatting.

dennisklein commented 5 months ago

@coderabbitai Changing to modName.clear() is not a good idea, as the standard does not guarantee a cheap implementation. Keep the previous logic with formatted_size and reserve. In addition, also factor m + 1 into a variable.

coderabbitai[bot] commented 5 months ago

@dennisklein, taking into account your feedback, here's the updated implementation of the getModuleName function:

#include <format> // Include the format header for std::format_to
#include <string>
#include <iterator> // For std::back_inserter

class PixelGeo : public FairGeoSet
{
  protected:
    std::string modName{"Pixel"}; // name of module changed to std::string

  public:
    PixelGeo();
    ~PixelGeo() override = default;
    const char* getModuleName(Int_t m) override;
    const char* getEleName(Int_t m) override;
    inline Int_t getModNumInMod(const TString& name) override;

    ClassDefOverride(PixelGeo, 1);
};

const char* PixelGeo::getModuleName(Int_t m)
{
    const std::string formatStr = "Pixel{}"; // Factor the format string into a variable
    int moduleNumber = m + 1; // Factor m + 1 into a variable

    // Calculate the required buffer size
    size_t requiredSize = std::formatted_size(formatStr, moduleNumber);

    // Ensure the modName string has enough capacity
    if (modName.capacity() < requiredSize) {
        modName.reserve(requiredSize);
    }

    // Clear the existing content of modName to reuse the buffer
    modName.clear();

    // Use std::format_to with std::back_inserter to efficiently append formatted string
    std::format_to(std::back_inserter(modName), formatStr, moduleNumber);

    // Return the formatted string
    return modName.c_str();
}

Key Changes:

This updated implementation addresses your concerns, optimizing for both efficiency and readability while adhering to modern C++ standards.

dennisklein commented 5 months ago

@coderabbitai Do not use modName.clear() at all. Replace std::back_inserter with std::begin. Qualify all local variables with const if possible. Prefer auto over explicit typenames where applicable.

coderabbitai[bot] commented 5 months ago

@dennisklein, following your guidance, here's the revised implementation of the getModuleName function, incorporating the suggested improvements:

#include <format> // Include the format header for std::format_to
#include <string>
#include <iterator> // For std::begin

class PixelGeo : public FairGeoSet
{
  protected:
    std::string modName{"Pixel"}; // name of module changed to std::string

  public:
    PixelGeo();
    ~PixelGeo() override = default;
    const char* getModuleName(Int_t m) override;
    const char* getEleName(Int_t m) override;
    inline Int_t getModNumInMod(const TString& name) override;

    ClassDefOverride(PixelGeo, 1);
};

const char* PixelGeo::getModuleName(Int_t m)
{
    const auto formatStr = "Pixel{}"; // Factor the format string into a variable
    const auto moduleNumber = m + 1; // Factor m + 1 into a variable

    // Calculate the required buffer size
    const auto requiredSize = std::formatted_size(formatStr, moduleNumber);

    // Ensure the modName string has enough capacity
    if (modName.capacity() < requiredSize) {
        modName.reserve(requiredSize);
    }

    // Use std::format_to with std::begin to efficiently overwrite formatted string
    std::format_to(std::begin(modName), formatStr, moduleNumber);

    // Ensure the string is correctly null-terminated
    modName.resize(requiredSize);

    // Return the formatted string
    return modName.c_str();
}

Key Changes:

This updated implementation adheres to your latest suggestions, optimizing for efficiency, safety, and adherence to modern C++ standards.

dennisklein commented 5 months ago

@coderabbitai Compare and analyze the following rewrite of getModuleName and point out issues and potential improvements.

class PixelGeo : public FairGeoSet
{
  protected:
    std::string modName{"Pixel"};

  public:
    const char* getModuleName(Int_t m) override;
};

const char* PixelGeo::getModuleName(Int_t m)
{
    const auto formatStr = "Pixel{}";
    const auto moduleNumber = m + 1; // shifted by 1 to accommodate counting starting at 1 in ASCII geometry file format
    const auto requiredSize = std::formatted_size(formatStr, moduleNumber) + 1 /* for null-termination */;

    if (modName.capacity() < requiredSize) {
        modName.reserve(requiredSize);
    }

    auto out = std::format_to(std::begin(modName), formatStr, moduleNumber);
    *out = '\0'; // Ensure the string is correctly null-terminated

    return modName.c_str();
}
coderabbitai[bot] commented 5 months ago

@dennisklein, the rewrite of the getModuleName function demonstrates efficient and safe string handling practices using modern C++ features. Here's an analysis of the approach and a suggestion for potential improvement:

Analysis

Potential Improvement

Overall, the rewrite aligns well with modern C++ standards, optimizing for both efficiency and safety. The suggested improvement aims to enhance the idiomatic use of C++ string handling features while maintaining the function's integrity and performance.

dennisklein commented 5 months ago

@fuhlig1 Not tested yet. What do you think?

const char* PixelGeo::getModuleName(Int_t m)
{
    const auto formatStr = "Pixel{}";
    const auto moduleNumber = m + 1; // shifted by 1 to accommodate counting starting at 1 in ASCII geometry file format
    const auto requiredSize = std::formatted_size(formatStr, moduleNumber) + 1 /* for null-termination */;

    if (modName.capacity() < requiredSize) {
        modName.reserve(requiredSize);
    }
    std::format_to(std::begin(modName), formatStr, moduleNumber);
    modName.resize(requiredSize - 1); // ensure the string is correctly null-terminated as we return a c-string

    return modName.c_str();
}
dennisklein commented 5 months ago

hm, doesn't work with just std::begin, this works https://godbolt.org/z/xe9zaYMcM (C++17: https://godbolt.org/z/xbavrcqsK) but the resize(0) does unnecessary work...
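
A likely reason the `std::begin`-only variant misbehaves is that `reserve` changes only the capacity of a `std::string`, not its size, so characters written through `begin()` beyond the current size fall outside the string's valid range and are not reflected in `size()`. The snippet below is a small illustration of that distinction; the helper names `sizeAfterReserve` and `sizeAfterResize` are hypothetical:

```cpp
#include <cstddef>
#include <string>

// Returns the size of a string after reserving extra capacity.
// reserve() may grow capacity(), but size() keeps the old length.
std::size_t sizeAfterReserve(std::string s, std::size_t newCapacity)
{
    s.reserve(newCapacity);
    return s.size();
}

// Returns the size of a string after resizing it.
// resize() changes size(), so [begin(), end()) becomes a valid range to overwrite.
std::size_t sizeAfterResize(std::string s, std::size_t newSize)
{
    s.resize(newSize);
    return s.size();
}
```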

dennisklein commented 5 months ago

https://godbolt.org/z/1r1EsTGrs:

const char* getModuleNameCpp17(std::string& modName, int m)
{
    constexpr auto formatStr = "Pixel{}";
    const auto moduleNumber = m + 1; // shifted by 1 to accommodate counting starting at 1 in ASCII geometry file format

    modName.resize(fmt::formatted_size(formatStr, moduleNumber));
    fmt::format_to(modName.begin(), formatStr, moduleNumber);

    return modName.c_str();
}

for simplicity we can perhaps? also do

const char* getModuleNameCpp17(std::string& modName, int m)
{
    // module index shifted by 1 to accommodate counting starting at 1 in ASCII geometry file format
    fmt::format_to(modName.begin(), "Pixel{}", m + 1);
    return modName.c_str();
}

Which should perform very similarly to the above. Edit: Actually no, this will not grow the string; one needs the explicit resize.

dennisklein commented 3 months ago

Superseded by #1529 and #1527