InsightSoftwareConsortium / ITK

Insight Toolkit (ITK) -- Official Repository. ITK builds on a proven, spatially-oriented architecture for processing, segmentation, and registration of scientific images in two, three, or more dimensions.
https://itk.org
Apache License 2.0
1.37k stars 660 forks source link

String manipulation security: Remove snprintf with user defined formatting options. #4621

Open hjmjohnson opened 2 months ago

hjmjohnson commented 2 months ago

Description

Allowing users to specify formatting strings at runtime is a well known exploitable code security vulnerability.

We currently suppress these warnings, but it would be better to re-write the codebase to avoid the security vulnerability all together.

Steps to Reproduce

    ITK_GCC_PRAGMA_PUSH
    ITK_GCC_SUPPRESS_Wformat_nonliteral
    snprintf(fileName, IOCommon::ITK_MAXPATHLEN + 1, m_SeriesFormat.c_str(), fileNumber);
    ITK_GCC_PRAGMA_POP

Expected behavior

No warning suppression and no security vulnerability.

Actual behavior

When ITK_GCC_SUPPRESS_Wformat_nonliteral supression are disabled, warnings are issued.

Reproducibility

New compilers, and requesting -Wformat-nonliteral

Versions

Since the earliest versions of ITK to at least 2024-04-29

Additional Information

https://github.com/InsightSoftwareConsortium/ITK/pull/4616#discussion_r1583057823

dzenanz commented 2 months ago

Removing this feature is likely to just push it into the application code, for the applications that already use it. And it is a non-issue for most applications. Leaving it in, makes it easy for applications to expose this security vulnerability to the end-user. But how much of an issue is this? I think this issue should be a low priority.

N-Dekker commented 2 months ago

This issue only applies to m_SeriesFormat, right? As far as I can see, that's the only case where the user is allowed to specify the formatting by a printf-like string.

The least we could do is add a warning/note to m_SeriesFormat and its setter.

In the future, we could maybe replace it with a C++20 std::format based implementation.

seanm commented 2 months ago

Removing this feature is likely to just push it into the application code, for the applications that already use it.

Depends, in user's application code, it might be a compile-time constant.

This issue only applies to m_SeriesFormat, right? As far as I can see, that's the only case where the user is allowed to specify the formatting by a printf-like string.

Yes, only one I see too. And the only warnings here too: https://open.cdash.org/viewBuildError.php?type=1&buildid=9579695

In the future, we could maybe replace it with a C++20 std::format based implementation.

How would that help?

N-Dekker commented 2 months ago

In the future, we could maybe replace it with a C++20 std::format based implementation.

How would that help?

Correction: I should say std::vformat, not std::format! std::vformat checks that the passed parameters match with the format string. It prevents hackers from accessing arbitrary memory through the format string.


Looking at https://github.com/InsightSoftwareConsortium/ITK/blob/487da6d086b341808147c71e8dd1eb1f57371ea6/Modules/IO/ImageBase/include/itkImageSeriesWriter.hxx#L143

snprintf(fileName, IOCommon::ITK_MAXPATHLEN + 1, m_SeriesFormat.c_str(), fileNumber);

Could then (when using C++20) be replaced with something like:

std::string fileName = std::vformat(m_SeriesFormatString, fileNumber);
seanm commented 2 months ago

Ah that looks promising. I guess it could even be done today, wrapped in #if __cplusplus > xyz and falling back to snprintf in the #else.

N-Dekker commented 2 months ago

Ah that looks promising. I guess it could even be done today, wrapped in #if __cplusplus > xyz and falling back to snprintf in the #else.

Well, ideally yes. But the format of std::format/std::vformat is different from the old printf/snprintf "%d" format. So it would be an API change 🤷 .

blowekamp commented 2 months ago

2 cents: For SimpleITK we did not expose these methods and pushed the generation or globing to scripting language. Other languages seem more expressive, and less error prone for these types of operations.

dzenanz commented 2 months ago

I guess that will be easier from C++ too, when C++20 is available.