RazrFalcon / tiny-skia

A tiny Skia subset ported to Rust
BSD 3-Clause "New" or "Revised" License
1.05k stars 67 forks source link

Implement Neon SIMD #39

Closed CryZe closed 1 year ago

CryZe commented 2 years ago

This adds support for the NEON SIMD instructions. They are still nightly only and some instructions are missing, but they are close to finishing implementing all of them, so stabilization shouldn't be too far off. There's a few semantic differences between the different targets which leaves some open questions in the meantime.

RazrFalcon commented 2 years ago

No way to test ARM code on CI. This is the real bummer.

jrmuizel commented 2 years ago

stdarch tests ARM code on CI: https://github.com/rust-lang/stdarch/blob/master/.github/workflows/main.yml

RazrFalcon commented 2 years ago

@jrmuizel They are using docker. Seems like overly complicated solution. Looks like Circle CI provides ARM VMs.

CryZe commented 2 years ago

The cross tool is trivial to set up in CI and all you need to do is cross test instead of cargo test. It supports aarch64 and many other architectures (internally also uses docker and qemu).

RazrFalcon commented 2 years ago

Interesting. Would you mind including this into this PR?

Also, I would love to reduce unsafe usage, but looks like safe_arch doesn't support arm. Right now, tiny-skia doesn't rely on any unsafe code blocks and I want to keep it that way.

CryZe commented 2 years ago

Alright, so NEON intrinsics are about to stabilize in Rust 1.59, so I've started working on this again.

Also, I would love to reduce unsafe usage, but looks like safe_arch doesn't support arm. Right now, tiny-skia doesn't rely on any unsafe code blocks and I want to keep it that way.

I've written a PR against bytemuck that allows removing the unsafe transmutes for the NEON vectors. It already got merged. I also ported all the NEON intrinsics into safe functions for merging into safe_arch. I'll open that PR soon.

A thing I've noticed though is that this will almost definitely require 1.59 as the minimum Rust version on at least AArch64, unless we introduce some kind of extra feature that gates AArch64 from using SIMD on earlier Rust versions.

Also since the const generics are unavoidable, I think we need at least 1.51 as the new minimum Rust version (the new bytemuck and safe_arch will require this too, at least according to tiny-skia's Cargo.toml).

RazrFalcon commented 2 years ago

I see. I guess MSRV bump is unavoidable. But I would to keep ARM feature-gated or something. Or maybe we could specify two MSRV for each target? Could we be able to build x86 on older version still?

Also, thanks a lot for you work. I really don't have time lately and those changes are far from trivial.

RazrFalcon commented 2 years ago

Could you also update the Readme? Currently, it mentions that ARM is not supported in the Performance section. Thanks.

RazrFalcon commented 2 years ago

Hi! Any updates on this?

CryZe commented 2 years ago

Not yet, I have a WIP branch for safe arch, but I haven't opened a PR for it yet. I'll probably revisit this soon.

Brooooooklyn commented 1 year ago

No way to test ARM code on CI. This is the real bummer.

We can test it via multiarch/ubuntu-core:aarch64-focal docker image, which is qemu under the hood, and it supports all neon instructions: https://github.com/Brooooooklyn/canvas/runs/7711666646?check_suite_focus=true

RazrFalcon commented 1 year ago

@CryZe I've dropped safe_arch support in favor of explicit SIMD. This should make this patch way easier.

CryZe commented 1 year ago

Well that was easy :D (didn't expect the CI to work first try, especially considering I didn't even run the tests locally).

RazrFalcon commented 1 year ago

Thanks for the patch! I took a while to merge (10 months...), but we're finally here. Hopefully I will publish a new release this weekends.

RazrFalcon commented 1 year ago

Almost 3x faster on Apple M1. Amazing!

Shnatsel commented 1 year ago

Excellent! I think the README needs an update, as it still reads:

Skia also supports ARM NEON instructions, which are unavailable in a stable Rust at the moment. Therefore a fallback scalar implementation will be used instead on ARM and other non-x86 targets. So if you're targeting ARM, you better stick with Skia.

RazrFalcon commented 1 year ago

Yes, thanks!