memorysafety / rav1d

An AV1 decoder in Rust.
BSD 2-Clause "Simplified" License

aarch64-unknown-linux-gnu/release/dav1d incorrectly lists and parses cpu architectures for x86_64 #803

Status: Closed (negge closed this issue 7 months ago)

negge commented 7 months ago

Building for aarch64 with the command cargo build --release --target aarch64-unknown-linux-gnu as described in https://github.com/memorysafety/rav1d/issues/773#issuecomment-1971788036 correctly produces a 64-bit dav1d binary.

negge@arm1:~/git/rav1d# file target/aarch64-unknown-linux-gnu/release/dav1d
target/aarch64-unknown-linux-gnu/release/dav1d: ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=bc36ccb2760f4280851164dc8a5feddad1553535, for GNU/Linux 3.7.0, with debug_info, not stripped

However, the usage text it prints incorrectly lists the x86_64 instruction sets as the valid --cpumask values, even though this is an aarch64 binary.

negge@arm1:~/git/rav1d# target/aarch64-unknown-linux-gnu/release/dav1d --help
target/aarch64-unknown-linux-gnu/release/dav1d: unrecognized option '--help'
Usage: target/aarch64-unknown-linux-gnu/release/dav1d [options]

Supported options:
 --input/-i $file:     input file
 --output/-o $file:    output file (%n, %w or %h will be filled in for per-frame files)
 --demuxer $name:      force demuxer type ('ivf', 'section5' or 'annexb'; default: detect from content)
 --muxer $name:        force muxer type ('md5', 'yuv', 'yuv4mpeg2' or 'null'; default: detect from extension)
                       use 'frame' as prefix to write per-frame files; if filename contains %n, will default to writing per-frame files
 --quiet/-q:           disable status messages
 --frametimes $file:   dump frame times to file
 --limit/-l $num:      stop decoding after $num frames
 --skip/-s $num:       skip decoding of the first $num frames
 --realtime [$fract]:  limit framerate, optional argument to override input framerate
 --realtimecache $num: set the size of the cache in realtime mode (default: 0)
 --version/-v:         print version and exit
 --threads $num:       number of threads (default: 0)
 --framedelay $num:    maximum frame delay, capped at $threads (default: 0);
                       set to 1 for low-latency decoding
 --filmgrain $num:     enable film grain application (default: 1, except if muxer is md5 or xxh3)
 --oppoint $num:       select an operating point of a scalable AV1 bitstream (0 - 31)
 --alllayers $num:     output all spatial layers of a scalable AV1 bitstream (default: 1)
 --sizelimit $num:     stop decoding if the frame size exceeds the specified limit
 --strict $num:        whether to abort decoding on standard compliance violations
                       that don't affect bitstream decoding (default: 1)
 --verify $md5:        verify decoded md5. implies --muxer md5, no output
 --cpumask $mask:      restrict permitted CPU instruction sets (0, 'sse2', 'ssse3', 'sse41', 'avx2' or 'avx512icl'; default: -1)
 --negstride:          use negative picture strides
                       this is mostly meant as a developer option
 --outputinvisible $num: whether to output invisible (alt-ref) frames (default: 0)
 --inloopfilters $str: which in-loop filters to enable (none, (no)deblock, (no)cdef, (no)restoration or all; default: all)
 --decodeframetype $str: which frame types to decode (reference, intra, key or all; default: all)

Passing --cpumask avx512icl works fine (and apparently uses the neon optimizations), but --cpumask neon fails with this error:

negge@arm1:~/git/rav1d# ./target/aarch64-unknown-linux-gnu/release/dav1d --cpumask neon -i /root/Videos/Chimera/Chimera-AV1-8bit-1920x1080-6736kbps.ivf -o /dev/null
Invalid argument "neon" for option --cpumask; should be any of sse2, ssse3, sse41, avx2, avx512icl or none, a hexadecimal (starting with 0x), or an integer

kkysen commented 7 months ago

Hi, thanks for finding this! This is probably something we either forgot to transpile twice or merged incorrectly. We can fix it soon. We have been meaning to switch back to the C version for the binary tools (which is where this code lives), since the important part is the rav1d library, and using the C tools would also help check that we're keeping the ABI the same.
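
To illustrate the kind of behavior the report is asking for, here is a minimal sketch of architecture-aware --cpumask parsing. It is not the actual rav1d or dav1d code: the function names `cpu_mask_names` and `parse_cpu_mask` are hypothetical, the numeric mask values are illustrative assumptions, and the architecture is passed in as a string so the logic can be exercised on any host (a real tool would more likely select the table at compile time with `cfg(target_arch = "...")`). Negative integers such as the default -1 are deliberately not handled here.

```rust
/// Hypothetical name -> mask table for one architecture.
/// The mask values are illustrative assumptions, not taken from rav1d.
fn cpu_mask_names(arch: &str) -> &'static [(&'static str, u32)] {
    match arch {
        "x86" | "x86_64" => &[
            ("sse2", 0b00001),
            ("ssse3", 0b00011),
            ("sse41", 0b00111),
            ("avx2", 0b01111),
            ("avx512icl", 0b11111),
        ],
        "arm" | "aarch64" => &[("neon", 0b1)],
        _ => &[],
    }
}

/// Parse a --cpumask argument for the given architecture.
/// Accepts "none", a name valid for `arch`, a 0x-prefixed hexadecimal
/// number, or a non-negative decimal integer.
fn parse_cpu_mask(arch: &str, arg: &str) -> Result<u32, String> {
    if arg == "none" {
        return Ok(0);
    }
    let table = cpu_mask_names(arch);
    if let Some(&(_, mask)) = table.iter().find(|(name, _)| *name == arg) {
        return Ok(mask);
    }
    if let Some(hex) = arg.strip_prefix("0x") {
        if let Ok(mask) = u32::from_str_radix(hex, 16) {
            return Ok(mask);
        }
    } else if let Ok(mask) = arg.parse::<u32>() {
        return Ok(mask);
    }
    // Build the list of valid names from the same table used for parsing,
    // so the error text cannot drift out of sync with what is accepted.
    let names: Vec<&str> = table.iter().map(|(name, _)| *name).collect();
    let choices = if names.is_empty() {
        String::from("none")
    } else {
        format!("{} or none", names.join(", "))
    };
    Err(format!(
        "Invalid argument \"{}\" for option --cpumask; should be any of {}, \
         a hexadecimal (starting with 0x), or an integer",
        arg, choices
    ))
}
```

With this sketch, `parse_cpu_mask("aarch64", "neon")` succeeds while `parse_cpu_mask("aarch64", "avx512icl")` returns an error that names only `neon` and `none`. Passing `std::env::consts::ARCH` as the first argument would select the table for the platform the binary was built for. Deriving both the accepted names and the error text from one table keeps the two from disagreeing.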