rust-lang / miri

An interpreter for Rust's mid-level intermediate representation
Apache License 2.0

Shim wishlist #2057

Open saethlin opened 2 years ago

saethlin commented 2 years ago

I run Miri on the most-downloaded crates with the flags -Zmiri-panic-on-unsupported and --no-fail-fast, which basically ask Miri to run as many tests as it can. Searching all my build logs, I find these are the most frequently encountered "can't call foreign function" errors:

The original post covered only ~10,000 crates; here's a list based on ~30,000:

49787 counts
(  1)    16228 (32.59%,32.59%): epoll_create1
(  2)     5281 (10.61%,43.20%): pipe2
(  3)     2024 (4.07%,47.27%): socket
(  4)     1430 (2.87%,50.14%): localtime_r
(  5)     1340 (2.69%,52.83%): realpath
(  6)      726 (1.46%,54.29%): TLS_method
(  7)      693 (1.39%,55.68%): getaddrinfo
(  8)      584 (1.17%,56.85%): gnu_get_libc_version
(  9)      488 (0.98%,57.83%): ruby_init
( 10)      336 (0.67%,58.51%): mdb_env_create
( 11)      332 (0.67%,59.18%): mmap
( 12)      330 (0.66%,59.84%): GEOS_init_r
( 13)      320 (0.64%,60.48%): getrlimit
( 14)      317 (0.64%,61.12%): sodium_init
( 15)      311 (0.62%,61.74%): dlopen
( 16)      303 (0.61%,62.35%): flock
( 17)      302 (0.61%,62.96%): getpid
( 18)      292 (0.59%,63.54%): cuInit
( 19)      272 (0.55%,64.09%): gst_init_check
( 20)      271 (0.54%,64.64%): ts_parser_new
( 21)      268 (0.54%,65.17%): llvm.x86.subborrow.64
( 22)      259 (0.52%,65.69%): sqlite3_threadsafe
( 23)      245 (0.49%,66.19%): Py_IsInitialized
( 24)      239 (0.48%,66.67%): YGConfigNew
( 25)      235 (0.47%,67.14%): sigemptyset
( 26)      226 (0.45%,67.59%): duckdb_open_ext
( 27)      204 (0.41%,68.00%): ion_reader_open_buffer
( 28)      199 (0.40%,68.40%): mrb_open
( 29)      196 (0.39%,68.80%): lua_newstate
( 30)      194 (0.39%,69.18%): GFp_cpuid_setup
( 31)      176 (0.35%,69.54%): OPENSSL_init_ssl
( 32)      175 (0.35%,69.89%): sqlite3_open
( 33)      174 (0.35%,70.24%): fstat64
( 34)      174 (0.35%,70.59%): socketpair
( 35)      168 (0.34%,70.93%): eventfd
( 36)      167 (0.34%,71.26%): PQconnectdb
( 37)      167 (0.34%,71.60%): blst_scalar_from_uint64
( 38)      159 (0.32%,71.92%): JS_NewRuntime
( 39)      152 (0.31%,72.22%): blake3_compress_in_place_sse2
( 40)      150 (0.30%,72.52%): curl_global_init
( 41)      150 (0.30%,72.82%): memcpy
( 42)      149 (0.30%,73.12%): getuid
( 43)      145 (0.29%,73.41%): rustsecp256k1_v0_4_1_context_preallocated_size
( 44)      134 (0.27%,73.68%): ditto_auth_client_make_for_development
( 45)      131 (0.26%,73.95%): js_newstate
( 46)      130 (0.26%,74.21%): Rf_initialize_R
( 47)      124 (0.25%,74.46%): secp256k1_context_create
( 48)      109 (0.22%,74.68%): mkstemp
( 49)      104 (0.21%,74.89%): NK_login_enum
( 50)      103 (0.21%,75.09%): log1p
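For anyone reading the table: the columns appear to be rank, count, share of the total, and cumulative share. Here is a minimal sketch of how such rows could be derived from raw per-symbol counts. `Row` and `tally` are hypothetical names for illustration, not the script that produced the table above.

```rust
/// One row of a frequency table: rank, count, share of total, running share.
#[derive(Debug, PartialEq)]
pub struct Row {
    pub rank: usize,
    pub name: String,
    pub count: u64,
    pub percent: f64,
    pub cumulative: f64,
}

/// Hypothetical helper: sorts by count (descending, ties broken by name)
/// and computes each symbol's percentage and the running cumulative percentage.
/// An empty or all-zero input yields NaN percentages; a real tool would guard that.
pub fn tally(counts: &[(&str, u64)]) -> Vec<Row> {
    let total: u64 = counts.iter().map(|(_, c)| *c).sum();
    let mut sorted: Vec<(&str, u64)> = counts.to_vec();
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));

    let mut running = 0u64;
    sorted
        .into_iter()
        .enumerate()
        .map(|(i, (name, count))| {
            running += count;
            Row {
                rank: i + 1,
                name: name.to_string(),
                count,
                percent: 100.0 * count as f64 / total as f64,
                cumulative: 100.0 * running as f64 / total as f64,
            }
        })
        .collect()
}
```

Note that the total (49787 here) covers every symbol seen, not just the top 50 shown, which is why the cumulative column stops at about 75%.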

There are some pretty obvious biases in this data. For example, crates with a lot of tests get their favorite symbols counted more than crates with fewer tests.

pipe2 and gnu_get_libc_version are from tests that try to start a std::process::Command. I think pipe2 comes from .output() and gnu_get_libc_version from spawn().
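To illustrate the kind of test that runs into this, here is a minimal sketch assuming a Unix-like host with an `echo` binary on the PATH. `echo_via_child` is a hypothetical helper. The `cfg_attr(miri, ignore)` attribute is one way a crate can skip such a test under Miri while still running it under a normal `cargo test`.

```rust
use std::process::Command;

/// Runs a child process and returns its trimmed stdout.
/// `output()` sets up pipes for the child's stdio, which is presumably
/// where the `pipe2` call in the list above comes from on Linux.
pub fn echo_via_child(word: &str) -> std::io::Result<String> {
    let out = Command::new("echo").arg(word).output()?;
    Ok(String::from_utf8_lossy(&out.stdout).trim().to_string())
}

#[cfg(test)]
mod tests {
    use super::*;

    // Ignored when the test suite is run by Miri, since starting a child
    // process is not something Miri can do; runs normally otherwise.
    #[test]
    #[cfg_attr(miri, ignore)]
    fn child_output_round_trips() {
        assert_eq!(echo_via_child("hello").unwrap(), "hello");
    }
}
```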

I think __xpg_strerror_r is from crates that encounter a std::io::Error and try to format it. This is pretty common because a lot of crates attempt to access files in their tests that are not packaged (this is also a huge problem for crater).
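A minimal sketch of that pattern, with hypothetical helper names and a made-up path: formatting the error with `{e}` asks the platform for a message string, which is presumably the strerror_r-style call seen in the logs, while inspecting only the `ErrorKind` does not need that lookup.

```rust
use std::fs::File;
use std::io;

/// Tries to open a path and, on failure, formats the error for display.
/// Displaying an OS error looks up the platform's message for the error code.
pub fn describe_open_failure(path: &str) -> Option<String> {
    match File::open(path) {
        Ok(_) => None,
        Err(e) => Some(format!("could not open {path}: {e}")),
    }
}

/// A variant that reports only the error kind, without formatting a message.
pub fn open_failure_kind(path: &str) -> Option<io::ErrorKind> {
    File::open(path).err().map(|e| e.kind())
}
```

A test that asserts on `open_failure_kind(...)` instead of printing the error avoids the message lookup, though it still fails for the underlying reason (the file is not packaged).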

There are also a lot of things in here that I don't think we will ever support via shims. We're not going to have shims for OpenSSL. We're also not going to shim CUDA or JavaScript; besides, all of those errors come from 3 crates using CUDA and 1 crate doing some JavaScript thing.

But in any case, this is a starting point for prioritizing what shims should be added. Just in case there isn't anyone out there who already has a pet function they want added.

@DrMeepster asked me to run some crates with a Windows target. Here's what the first ~2k crates run into:

2911 counts
(  1)     1583 (54.4%, 54.4%):  GetFullPathNameW
(  2)      485 (16.7%, 71.0%):  CreateIoCompletionPort
(  3)      292 (10.0%, 81.1%):  GetFileInformationByHandleEx
(  4)      183 ( 6.3%, 87.4%):  GetModuleFileNameW
(  5)       46 ( 1.6%, 88.9%):  SleepConditionVariableSRW
(  6)       24 ( 0.8%, 89.8%):  GetTickCount64
(  7)       22 ( 0.8%, 90.5%):  CompareStringOrdinal
(  8)       18 ( 0.6%, 91.1%):  AcquireCredentialsHandleA
(  9)       16 ( 0.5%, 91.7%):  WSAStartup
( 10)       15 ( 0.5%, 92.2%):  CreateNamedPipeW
( 11)       14 ( 0.5%, 92.7%):  Sleep
( 12)       13 ( 0.4%, 93.1%):  CryptAcquireContextW
( 13)       13 ( 0.4%, 93.6%):  log1p
( 14)       11 ( 0.4%, 94.0%):  CreateSemaphoreA
( 15)       10 ( 0.3%, 94.3%):  CreatePipe
( 16)       10 ( 0.3%, 94.6%):  SHGetKnownFolderPath
( 17)       10 ( 0.3%, 95.0%):  WakeAllConditionVariable
( 18)       10 ( 0.3%, 95.3%):  zx_vmo_create
( 19)        8 ( 0.3%, 95.6%):  GetCurrentProcess

I don't know where GetFullPathNameW is coming from, but when crates hit it, they usually have tens to hundreds of tests that attempt to use it. Possibly it's from proptest? prost is one of the crates that hits it a bunch of times.

RalfJung commented 2 years ago

__xpg_strerror_r also comes up when just debugging Miri. I think even some of the tests debug-print the IO error when the test fails, leading to an annoying double failure: first the test fails with an assertion, then printing that assertion fails with an "unsupported" error.

RalfJung commented 2 years ago

I think __xpg_strerror_r is from crates that encounter a std::io::Error and try to format it. This is pretty common because a lot of crates attempt to access files in their tests that are not packaged (this is also a huge problem for crater).

Wish granted. ;) https://github.com/rust-lang/miri/pull/2067

Just N-1 to go...

mejrs commented 2 years ago

( 10) 205 ( 1.3%, 52.8%): Py_IsInitialized

This is the first CPython API call that Rust programs embedding the Python interpreter will make. I would really like to test pyo3 and these programs with Miri.

I know Python's C API is too big to shim, but I would love to see some solution for this.

CraftSpider commented 2 years ago

I have a similar wishlist for testing a PHP extension library. Currently it fails at pipe2, but after that it would start failing at whatever the first call to PHP is. It would be nice if there was some way to use miri to test the Rust parts of my project, and then just actually call out to the relevant PHP stuff as normal.

jjl commented 2 years ago

mmap() is the one I'm currently missing. For my purposes it could just use a boxed slice. That would be a little painful on memory, but enough to get Miri running.

bjorn3 commented 2 years ago

Read-only mmap where you don't write to the backing file shouldn't be too hard to implement. If writes through either the mapping or the backing file are allowed, it becomes a lot harder to implement Unix-compatible semantics.

jjl commented 2 years ago

Ah, I see what you mean. Fortunately, I'm only using anonymous mappings.
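For the anonymous-mapping case, the boxed-slice idea can be sketched on the user side roughly like this. `AnonRegion` is a hypothetical type, not a Miri shim or an existing crate API. It only shows the fallback side; a real crate would presumably keep its mmap-backed version behind `#[cfg(not(miri))]`.

```rust
/// Hypothetical stand-in for an anonymous, private memory mapping,
/// backed by an ordinary heap allocation.
pub struct AnonRegion {
    bytes: Box<[u8]>,
}

impl AnonRegion {
    /// Allocates `len` zeroed bytes, mirroring the zero fill of an
    /// anonymous mapping. Unlike mmap, this gives no page alignment
    /// guarantee and commits the memory up front.
    pub fn new(len: usize) -> Self {
        AnonRegion {
            bytes: vec![0u8; len].into_boxed_slice(),
        }
    }

    pub fn len(&self) -> usize {
        self.bytes.len()
    }

    pub fn is_empty(&self) -> bool {
        self.bytes.is_empty()
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.bytes
    }

    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.bytes
    }
}
```

This does not cover file-backed or shared mappings, which is where the harder semantics mentioned above come in.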

LegNeato commented 2 years ago

Thanks for the list! I picked the easiest one I could see near the top (realpath):

Is there any way we can get a couple of example crates for each missing shim in the list so implementers can check the before and after with real code?

saethlin commented 2 years ago

I've manually grabbed a list for you for realpath: realpath-crates.txt

I have a very simple website which is just Cloudfront + S3 at https://miri.saethlin.dev (most people care about https://miri.saethlin.dev/ub). I'm not opposed to making simple additions, like a way to download all the raw logs in a .tar.gz. You can comment here or open a new issue: https://github.com/saethlin/miri-tools/issues/8

saethlin commented 10 months ago

Since @eduardosm has been doing some amazing work on implementing LLVM x86 intrinsics, here's my list of those that are encountered currently in test suites:

(  1)     6867 (74.6%, 74.6%): llvm.x86.subborrow.64
(  2)      539 ( 5.9%, 80.5%): llvm.x86.sse2.psad.bw
(  3)      361 ( 3.9%, 84.4%): llvm.x86.sse.cmp.ps
(  4)      177 ( 1.9%, 86.3%): llvm.x86.sse2.cmp.pd
(  5)      137 ( 1.5%, 87.8%): llvm.x86.sse.add.ss
(  6)       93 ( 1.0%, 88.8%): llvm.x86.aesni.aesenc
(  7)       87 ( 0.9%, 89.8%): llvm.x86.sse.sqrt.ss
(  8)       86 ( 0.9%, 90.7%): llvm.x86.rdtsc
(  9)       74 ( 0.8%, 91.5%): llvm.prefetch
( 10)       64 ( 0.7%, 92.2%): llvm.x86.bmi.pdep.64
( 11)       60 ( 0.7%, 92.9%): llvm.x86.sse2.pmulu.dq
( 12)       57 ( 0.6%, 93.5%): llvm.x86.sse2.pslli.d
( 13)       54 ( 0.6%, 94.1%): llvm.x86.sse.sqrt.ps
( 14)       40 ( 0.4%, 94.5%): llvm.x86.sse.movmsk.ps
( 15)       36 ( 0.4%, 94.9%): llvm.x86.sse.min.ps
( 16)       33 ( 0.4%, 95.3%): llvm.x86.sse.max.ps
( 17)       26 ( 0.3%, 95.6%): llvm.x86.sse2.cvttps2dq
( 18)       24 ( 0.3%, 95.8%): llvm.x86.sse.cmp.ss
( 19)       24 ( 0.3%, 96.1%): llvm.x86.sse2.cmp.sd
( 20)       22 ( 0.2%, 96.3%): llvm.x86.sse2.psll.q
( 21)       19 ( 0.2%, 96.5%): llvm.x86.sse2.psll.d
( 22)       18 ( 0.2%, 96.7%): llvm.x86.sse2.cvtps2dq
( 23)       17 ( 0.2%, 96.9%): llvm.x86.pclmulqdq
( 24)       17 ( 0.2%, 97.1%): llvm.x86.sse2.psrl.d
( 25)       16 ( 0.2%, 97.3%): llvm.x86.sse2.psrl.q
( 26)       15 ( 0.2%, 97.4%): llvm.x86.sse2.storeu.dq
( 27)       10 ( 0.1%, 97.5%): llvm.x86.sse.stmxcsr
( 28)       10 ( 0.1%, 97.6%): llvm.x86.sse2.pmulh.w
( 29)       10 ( 0.1%, 97.8%): llvm.x86.sse2.psll.w
( 30)       10 ( 0.1%, 97.9%): llvm.x86.sse2.pslli.w
( 31)        9 ( 0.1%, 98.0%): llvm.x86.sse2.movmsk.pd
( 32)        8 ( 0.1%, 98.0%): llvm.x86.sse2.psra.d
( 33)        6 ( 0.1%, 98.1%): llvm.x86.sse.rcp.ps
( 34)        6 ( 0.1%, 98.2%): llvm.x86.sse.rsqrt.ps
( 35)        6 ( 0.1%, 98.2%): llvm.x86.sse2.cvtdq2ps
( 36)        6 ( 0.1%, 98.3%): llvm.x86.sse2.max.pd
( 37)        6 ( 0.1%, 98.4%): llvm.x86.sse2.min.pd
( 38)        6 ( 0.1%, 98.4%): llvm.x86.sse2.psra.w
( 39)        6 ( 0.1%, 98.5%): llvm.x86.sse2.psrli.d
( 40)        5 ( 0.1%, 98.6%): llvm.x86.sse2.packsswb.128
( 41)        5 ( 0.1%, 98.6%): llvm.x86.sse2.packuswb.128
( 42)        5 ( 0.1%, 98.7%): llvm.x86.ssse3.pshuf.b.128
( 43)        4 ( 0.0%, 98.7%): llvm.x86.sse.sub.ss
( 44)        4 ( 0.0%, 98.7%): llvm.x86.sse2.comige.sd
( 45)        4 ( 0.0%, 98.8%): llvm.x86.sse2.psrl.w
( 46)        4 ( 0.0%, 98.8%): llvm.x86.sse3.ldu.dq
( 47)        3 ( 0.0%, 98.9%): llvm.x86.sse2.packssdw.128
( 48)        3 ( 0.0%, 98.9%): llvm.x86.sse2.psrai.d
( 49)        3 ( 0.0%, 98.9%): llvm.x86.sse2.psrai.w
( 50)        3 ( 0.0%, 99.0%): llvm.x86.sse2.psrli.q
( 51)        2 ( 0.0%, 99.0%): llvm.ctpop.v16i8
( 52)        2 ( 0.0%, 99.0%): llvm.x86.avx2.pslli.d
( 53)        2 ( 0.0%, 99.0%): llvm.x86.sse.comieq.ss
( 54)        2 ( 0.0%, 99.1%): llvm.x86.sse.comige.ss
( 55)        2 ( 0.0%, 99.1%): llvm.x86.sse.comigt.ss
( 56)        2 ( 0.0%, 99.1%): llvm.x86.sse.comile.ss
( 57)        2 ( 0.0%, 99.1%): llvm.x86.sse.comilt.ss
( 58)        2 ( 0.0%, 99.1%): llvm.x86.sse.comineq.ss
( 59)        2 ( 0.0%, 99.2%): llvm.x86.sse.cvtsi2ss
( 60)        2 ( 0.0%, 99.2%): llvm.x86.sse.cvtss2si
( 61)        2 ( 0.0%, 99.2%): llvm.x86.sse.div.ss
( 62)        2 ( 0.0%, 99.2%): llvm.x86.sse.max.ss
( 63)        2 ( 0.0%, 99.2%): llvm.x86.sse.min.ss
( 64)        2 ( 0.0%, 99.3%): llvm.x86.sse.mul.ss
( 65)        2 ( 0.0%, 99.3%): llvm.x86.sse.rcp.ss
( 66)        2 ( 0.0%, 99.3%): llvm.x86.sse.rsqrt.ss
( 67)        2 ( 0.0%, 99.3%): llvm.x86.sse2.comieq.sd
( 68)        2 ( 0.0%, 99.4%): llvm.x86.sse2.comile.sd
( 69)        2 ( 0.0%, 99.4%): llvm.x86.sse2.comilt.sd
( 70)        2 ( 0.0%, 99.4%): llvm.x86.sse2.comineq.sd
( 71)        2 ( 0.0%, 99.4%): llvm.x86.sse2.cvtpd2dq
( 72)        2 ( 0.0%, 99.4%): llvm.x86.sse2.cvtpd2ps
( 73)        2 ( 0.0%, 99.5%): llvm.x86.sse2.cvtps2pd
( 74)        2 ( 0.0%, 99.5%): llvm.x86.sse2.cvtsd2si
( 75)        2 ( 0.0%, 99.5%): llvm.x86.sse2.cvtsd2si64
( 76)        2 ( 0.0%, 99.5%): llvm.x86.sse2.cvtsd2ss
( 77)        2 ( 0.0%, 99.6%): llvm.x86.sse2.cvtss2sd
( 78)        2 ( 0.0%, 99.6%): llvm.x86.sse2.cvttpd2dq
( 79)        2 ( 0.0%, 99.6%): llvm.x86.sse2.cvttsd2si
( 80)        2 ( 0.0%, 99.6%): llvm.x86.sse2.cvttsd2si64
( 81)        2 ( 0.0%, 99.6%): llvm.x86.sse2.max.sd
( 82)        2 ( 0.0%, 99.7%): llvm.x86.sse2.min.sd
( 83)        2 ( 0.0%, 99.7%): llvm.x86.sse2.pavg.b
( 84)        2 ( 0.0%, 99.7%): llvm.x86.sse2.pavg.w
( 85)        2 ( 0.0%, 99.7%): llvm.x86.sse2.pmadd.wd
( 86)        2 ( 0.0%, 99.7%): llvm.x86.sse2.pmulhu.w
( 87)        2 ( 0.0%, 99.8%): llvm.x86.sse2.pslli.q
( 88)        2 ( 0.0%, 99.8%): llvm.x86.sse2.psrli.w
( 89)        2 ( 0.0%, 99.8%): llvm.x86.sse2.sqrt.sd
( 90)        2 ( 0.0%, 99.8%): llvm.x86.sse2.storeu.pd
( 91)        2 ( 0.0%, 99.9%): llvm.x86.sse3.hadd.ps
( 92)        2 ( 0.0%, 99.9%): llvm.x86.sse41.dpps
( 93)        2 ( 0.0%, 99.9%): llvm.x86.sse41.ptestz
( 94)        1 ( 0.0%, 99.9%): llvm.ctpop.v2i64
( 95)        1 ( 0.0%, 99.9%): llvm.ctpop.v4i32
( 96)        1 ( 0.0%, 99.9%): llvm.ctpop.v8i16
( 97)        1 ( 0.0%, 99.9%): llvm.x86.avx2.permd
( 98)        1 ( 0.0%,100.0%): llvm.x86.avx2.pslli.q
( 99)        1 ( 0.0%,100.0%): llvm.x86.bmi.bzhi.64
(100)        1 ( 0.0%,100.0%): llvm.x86.fma.vfnmadd.pd.256
(101)        1 ( 0.0%,100.0%): llvm.x86.rdrand.64
(102)        1 ( 0.0%,100.0%): llvm.x86.rdseed.64
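As a crate-side workaround for the top entry, here is a minimal sketch of a subtract-with-borrow that uses the `_subborrow_u64` intrinsic natively and a portable path under Miri. The function names are hypothetical, and this assumes the intrinsic is the source of the `llvm.x86.subborrow.64` calls counted above.

```rust
/// Portable subtract-with-borrow: computes `a - b - borrow_in` and reports
/// whether the subtraction wrapped around.
pub fn sub_borrow_portable(a: u64, b: u64, borrow_in: bool) -> (u64, bool) {
    let (d1, b1) = a.overflowing_sub(b);
    let (d2, b2) = d1.overflowing_sub(borrow_in as u64);
    (d2, b1 || b2)
}

/// Native x86-64 build: use the intrinsic.
#[cfg(all(target_arch = "x86_64", not(miri)))]
#[allow(unused_unsafe)]
pub fn sub_borrow(a: u64, b: u64, borrow_in: bool) -> (u64, bool) {
    use std::arch::x86_64::_subborrow_u64;
    let mut out = 0u64;
    // Takes the incoming borrow as a u8 and returns the outgoing borrow.
    let borrow_out = unsafe { _subborrow_u64(borrow_in as u8, a, b, &mut out) };
    (out, borrow_out != 0)
}

/// Every other target, and any run under Miri: use the portable path.
#[cfg(not(all(target_arch = "x86_64", not(miri))))]
pub fn sub_borrow(a: u64, b: u64, borrow_in: bool) -> (u64, bool) {
    sub_borrow_portable(a, b, borrow_in)
}
```

The downside is that Miri then no longer exercises the code path that ships, so this is a stopgap rather than a substitute for the intrinsic being supported.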

saethlin commented 10 months ago

Here's an update:

(  1)      214 (40.9%, 40.9%): llvm.x86.sse41.pblendw
(  2)      128 (24.5%, 65.4%): llvm.x86.rdtsc
(  3)       37 ( 7.1%, 72.5%): llvm.x86.sse41.blendvps
(  4)       22 ( 4.2%, 76.7%): llvm.x86.sse41.ptestz
(  5)       19 ( 3.6%, 80.3%): llvm.x86.sse42.crc32.32.8
(  6)       17 ( 3.3%, 83.6%): llvm.x86.sse42.pcmpestri128
(  7)       12 ( 2.3%, 85.9%): llvm.x86.sse.stmxcsr
(  8)       10 ( 1.9%, 87.8%): llvm.x86.sse41.round.ps
(  9)        7 ( 1.3%, 89.1%): llvm.x86.sse41.packusdw
( 10)        6 ( 1.1%, 90.2%): llvm.x86.sse41.round.sd
( 11)        6 ( 1.1%, 91.4%): llvm.x86.sse41.round.ss
( 12)        5 ( 1.0%, 92.4%): llvm.x86.sse41.dpps
( 13)        4 ( 0.8%, 93.1%): llvm.x86.sse41.insertps
( 14)        3 ( 0.6%, 93.7%): llvm.x86.sse41.mpsadbw
( 15)        3 ( 0.6%, 94.3%): llvm.x86.sse41.phminposuw
( 16)        3 ( 0.6%, 94.8%): llvm.x86.sse41.pmuldq
( 17)        3 ( 0.6%, 95.4%): llvm.x86.sse41.ptestc
( 18)        3 ( 0.6%, 96.0%): llvm.x86.sse41.ptestnzc
( 19)        3 ( 0.6%, 96.6%): llvm.x86.sse41.round.pd
( 20)        3 ( 0.6%, 97.1%): llvm.x86.sse42.crc32.32.16
( 21)        3 ( 0.6%, 97.7%): llvm.x86.sse42.crc32.32.32
( 22)        3 ( 0.6%, 98.3%): llvm.x86.sse42.crc32.64.64
( 23)        3 ( 0.6%, 98.9%): llvm.x86.sse42.pcmpestrm128
( 24)        3 ( 0.6%, 99.4%): llvm.x86.sse42.pcmpistri128
( 25)        2 ( 0.4%, 99.8%): llvm.experimental.vector.reduce.add.v4i128
( 26)        1 ( 0.2%,100.0%): llvm.x86.sse41.blendvpd

I do not know what happened to the aesni.aesenc calls. My best guess is that I changed to -Ctarget-cpu=x86_64-v2 and now they're hidden behind some SSE4 functions.

def- commented 9 months ago

Thanks for posting this. We use Miri in Materialize and seem to hit a different set of unsupported operations. This is basically our wishlist:

$ git grep "cfg_attr(miri"|sed -e "s#.*\/\/ *##" | sed -e "s/error: *//" | grep -v "slow"|sort|uniq -c|sort -n -r
     81 unsupported operation: can't call foreign function `TLS_client_method` on OS `linux`
     58 unsupported operation: returning ready events from epoll_wait is not yet implemented
     20 unsupported operation: can't call foreign function `decNumberFromInt32` on OS `linux`
     15 unsupported operation: can't call foreign function `decContextDefault` on OS `linux`
     12 unsupported operation: can't call foreign function `rust_psm_stack_pointer` on OS `linux`
     12 unsupported operation: cannot write to event
     11 unsupported operation: can't call foreign function `OPENSSL_init_ssl` on OS `linux`
      7 unsupported operation: integer-to-pointer casts and `ptr::from_exposed_addr` are not supported with `-Zmiri-strict-provenance`
      6 unsupported operation: inline assembly is not supported
      6 unsupported operation: can't call foreign function `TLS_method` on OS `linux`
      4 unsupported operation: can't call foreign function `socket` on OS `linux`
      4 unsupported operation: can't call foreign function `pipe2` on OS `linux`
      3 unsupported operation: can't call foreign function `rocksdb_create_default_env` on OS `linux`
      3 unsupported operation: can't call foreign function `epoll_wait` on OS `linux`
      2 unsupported operation: can't call foreign function `deflateInit2_` on OS `linux`
      1 unsupported operation: non-default mode 0o600 is not supported
      1 unsupported operation: integer-to-pointer casts and `ptr::from_exposed_addr`
      1 unsupported operation: can't call foreign function `pidfile_open` on OS `linux`

newpavlov commented 3 weeks ago

My list of desired shims: O_NOFOLLOW flag support for open, flock with all 3 flags, pwrite64, and pread64. It's not a full list, just functions on which our tests are stuck (we can cfg our way out for the former two, but not for the latter two).
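For context, one way positional reads and writes can arise from plain std code on Unix is the `FileExt` trait. As far as I know, these calls map to pread64/pwrite64 on Linux, but that is an assumption about std internals, and the tests in question may call libc directly instead. `positional_round_trip` is a hypothetical helper.

```rust
/// Writes a small payload at offset 0, then reads five bytes back from
/// offset 6, without moving the file cursor.
#[cfg(unix)]
pub fn positional_round_trip(path: &std::path::Path) -> std::io::Result<Vec<u8>> {
    use std::fs::OpenOptions;
    use std::os::unix::fs::FileExt;

    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    // Positional write: the offset is passed explicitly.
    file.write_all_at(b"hello world", 0)?;

    // Positional read of the bytes starting at offset 6.
    let mut buf = vec![0u8; 5];
    file.read_exact_at(&mut buf, 6)?;
    Ok(buf)
}
```

Code like this cannot easily be cfg'd out when positional I/O is the thing being tested, which matches the point about the latter two.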