VirusTotal / yara

The pattern matching swiss knife
https://virustotal.github.io/yara/
BSD 3-Clause "New" or "Revised" License
7.93k stars 1.42k forks source link

linux.c: don't use pread64 #2048

Closed orbea closed 3 months ago

orbea commented 4 months ago

Starting with musl-1.2.4 all LFS64 APIs have been removed since musl is always 64-bit and yara now fails with implicit function declarations for pread64.

However configure.ac already sets AC_SYS_LARGEFILE so changing them all to pread will work with both glibc and musl.

google-cla[bot] commented 4 months ago

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

plusvic commented 4 months ago

This was explicitely changed in #2005. @vthib, could you take a look a this PR?

vthib commented 4 months ago

Yes, as explained in #2005, this change would break the "glibc, x86, LFS not enabled" case.

There is a bit in musl explaining a bit the differences with glibc here: https://wiki.musl-libc.org/faq#Q:-Do-I-need-to-define-%3Ccode%3E_LARGEFILE64_SOURCE%3C/code%3E-to-get-64bit-%3Ccode%3Eoff_t%3C/code%3E?

From what I can gather, we have:

The pread64 function is in posix for support of large files, but musl seems to want to deprecate this. This leaves us in a bit of an weird situation:

As detailed in #2005, the issue isn't so much when compiling yara using autotools (since it should set this properly), but rather when using other tools that compile yara, such as yara-rust. If it breaks, it is completely silent as it will still compile, and it can be very annoying to debug.

I wonder if the best solution couldn't be something like this:

orbea commented 4 months ago

Maybe I'm missing something, but shouldn't AC_SYS_LARGEFILE already add the required defines to config.h?

orbea commented 4 months ago

Maybe I'm missing something

Yes, it seems yara doesn't use AC_CONFIG_HEADERS so there is no config.h, but unfortunately adding it is not as trivial as I hoped due to how AC_DEFINE is used makes AC_CONFIG_HEADERS unhappy.

orbea commented 4 months ago

I force pushed with two new commits, the first to add a description to AC_DEFINE as shown in the autoconf documentation which seems required with AC_CONFIG_HEADERS and another to add AC_CONFIG_HEADERSso that pread should work on 32-bit systems too since AC_SYS_LARGEFILE is already used.

This needs testing to be sure config.h is not missing anywhere important.

vthib commented 4 months ago

Adding a config.h does not solve the points i raised about avoiding the use of the 32-bits pread function when autotools is not used.

@plusvic any opinions on the changes i suggested?

orbea commented 4 months ago

Adding a config.h does not solve the points i raised about avoiding the use of the 32-bits pread function when autotools is not used.

Is it possible to test it? Check out the documentation for AC_SYS_LARGEFILE.

https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/System-Services.html

— Macro: AC_SYS_LARGEFILE

    Arrange for 64-bit file offsets, known as [large-file support](http://www.unix-systems.org/version2/whatsnew/lfs20mar.html). On some hosts, one must use special compiler options to build programs that can access large files. Append any such options to the output variable CC. Define _FILE_OFFSET_BITS and _LARGE_FILES if necessary.

    Large-file support can be disabled by configuring with the --disable-largefile option.

    If you use this macro, check that your program works even when off_t is wider than long int, since this is common when large-file support is enabled. For example, it is not correct to print an arbitrary off_t value X with printf ("%ld", (long int) X).

    The LFS introduced the fseeko and ftello functions to replace their C counterparts fseek and ftell that do not use off_t. Take care to use AC_FUNC_FSEEKO to make their prototypes available when using them and large-file support is enabled.

These defines get added to config.h which is now included at the start of the relevant file, I think this is accomplishing what you are suggesting?

vthib commented 4 months ago

The issue that I raised is that if autotools are not used, it is very easy to forget to define _FILE_OFFSET_BITS, and this can lead to invalid process scanning.

The config.h is generated by autotools, so this actually reinforces the coupling with autotools. That would force people that do not use autotools (like yara-rust) to create a config.h by hand to still be able to build, and there is still the possibility of forgetting to define _FILE_OFFSET_BITS

orbea commented 4 months ago

Thanks for elaborating, I understand your concern now and yes another build system would have to replicate the defines and autoconf tests. I am not certain if setting _FILE_OFFSET_BITS unconditionally is correct? See this meson issue. https://github.com/mesonbuild/meson/issues/3519

Unfortunately this is not something easily solved outside of using autoconf regardless...

vthib commented 4 months ago

Regarding your link, it also leads to this: https://github.com/jarun/nnn/commit/db8b61866b1cc6dbbf05f83a7ca623f42eb9fcb7

And defining it in proc/linux.c would mean this would also impact linux builds, and no other targets. This feels very safe and recommended, both by musl and glibc

plusvic commented 4 months ago

I prefer not having to deal with a config.h file, and go with @vthib's proposal.

orbea commented 4 months ago

I prefer not having to deal with a config.h file, and go with @vthib's proposal.

Its your call, I removed the config.h commit and went with @vthib's proposal. I left the AC_DEFINE commit since it might save some debugging in the future in case autoconf ever becomes more pedantic about that, but I can remove it too if you wish?

plusvic commented 3 months ago

Fixed in 833a580430abe0fbc9bc17a21fb95bf36dacf367

orbea commented 3 months ago

Thank you! And sorry for forgetting about this PR, I had other things distracting me and lost track of it...