Make qsimcirq Apple Silicon (M1, M2, ARM64) compatible and improve build scripts

basnijholt commented 1 year ago

I've been working on compiling qsimcirq for conda-forge (https://github.com/conda-forge/staged-recipes/pull/24504) and encountered several challenges with the existing CMake scripts, particularly for Linux and MacOS x86 platforms.

This PR introduces modifications that ensure successful builds on these platforms. Additionally, I've extended support to Apple Silicon (ARM architecture), focusing on compiling qsim_basic because the x86 vector instruction sets that the other qsim variants rely on (such as SSE and AVX) are unavailable on ARM.
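As a rough illustration of the idea (not the PR's actual CMake logic), the choice of which variants to build can be thought of as a function of the CPU architecture. The sketch below is hypothetical: the function name `candidate_variants` and the variant labels other than `basic` are assumptions made for illustration, and the real decision is made in the build scripts.

```python
import platform

# Architecture names as commonly reported by platform.machine().
X86_MACHINES = {"x86_64", "amd64", "i386", "i686"}
ARM_MACHINES = {"arm64", "aarch64"}


def candidate_variants(machine):
    """Return the variants that could plausibly be compiled for a CPU type.

    Hypothetical helper: the labels are illustrative, and on x86 the
    vectorized variants still depend on what the specific CPU supports.
    """
    name = machine.lower()
    if name in X86_MACHINES:
        # SSE/AVX are x86 instruction set extensions.
        return ["basic", "sse", "avx2", "avx512"]
    # ARM64 (including Apple Silicon) and unknown architectures fall back
    # to the non-vectorized build only.
    return ["basic"]


if __name__ == "__main__":
    machine = platform.machine()
    print(machine, candidate_variants(machine))
```

On an Apple Silicon machine `platform.machine()` typically reports `arm64`, so only the non-vectorized variant would be selected in this sketch.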

This inclusion is particularly beneficial for developers who use MacBook M1 laptops, as it enables them to write and test code on their local machines and seamlessly run the same code on more powerful computational clusters.

Looking ahead, I plan to submit another PR to improve CUDA builds, further enhancing qsimcirq's performance and compatibility.

I kindly request a review from @95-martin-orion, @jakurzak, and @sergeisakov, as your insights would be invaluable. Your feedback will help ensure that these changes align well with qsim's development goals.

This PR aims to resolve the following issues:

Closes #242
Closes #495
Closes #597

Thank you for considering these changes!

95-martin-orion commented 1 year ago

Thanks for your contribution @basnijholt !

A couple of requests regarding tests and release:

- This PR fails the current (non-Silicon) MacOS tests (logs). These are defined in this GitHub Actions workflow, which will need to pass before this PR can merge.
- Our GitHub Actions workflows for testing and releasing wheels don't currently include Apple Silicon wheels. These are now available for use - if you can add them, users on Apple Silicon will be able to pip install qsimcirq instead of having to build the wheels locally.

And a final word of caution: as you may have already noticed, qsim_basic is considerably slower than other qsim modes due to the lack of vectorization. For the use case you describe (testing locally, then running the same code on non-ARM64 machines) this shouldn't be an issue, but other use cases may be adversely impacted. In those cases, I still recommend the other options listed here - in particular, Google Colab is a highly portable solution for < 28 qubits.
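To give a sense of why qubit count matters so much for any of these options, here is a back-of-the-envelope sketch of state-vector memory. It assumes a full state-vector simulation storing one single-precision complex amplitude (8 bytes) per basis state; that byte size is an assumption for illustration, and the thread does not say what determines the 28-qubit figure.

```python
def state_vector_bytes(num_qubits, bytes_per_amplitude=8):
    """Memory needed to hold all 2**n amplitudes of an n-qubit state.

    bytes_per_amplitude=8 assumes single-precision complex numbers
    (two 4-byte floats); this is an illustrative assumption.
    """
    return (2 ** num_qubits) * bytes_per_amplitude


if __name__ == "__main__":
    for n in (20, 24, 28, 30):
        gib = state_vector_bytes(n) / 2**30
        print(f"{n} qubits: {gib:.4f} GiB")
```

Under these assumptions a 28-qubit state needs 2 GiB, and each additional qubit doubles that, independent of whether the simulator is vectorized.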

basnijholt commented 12 months ago

@95-martin-orion, thanks for your review. I will address your points.

~~In the meantime, could you enable workflow runs for any of my pushes? Right now it is pending approval on every commit.~~ edit: I enabled them on my own fork: https://github.com/basnijholt/qsim/pull/1

Seems related:

basnijholt commented 12 months ago

Awesome! @95-martin-orion @sergeisakov, on my fork the CI passes and builds Wheels for MacOS x86_64 and ARM64 successfully ✅:tada:

I would really appreciate it if you could take another look. I believe I addressed all concerns.

95-martin-orion commented 12 months ago

Some recent GitHub outages broke the CI pipeline. Could you make a fresh commit? I'll trigger the necessary approvals.

basnijholt commented 12 months ago

@95-martin-orion, I just closed and reopened the PR, which should retrigger the CI too.

basnijholt commented 12 months ago

Awesome! The CI pipelines have all passed :tada:

Anything else I can do? 😄

mpharrigan commented 12 months ago

do we have a tracking bug for actually supporting fast execution on apple silicon architectures? could be cool

95-martin-orion commented 12 months ago

> do we have a tracking bug for actually supporting fast execution on apple silicon architectures? could be cool

I don't think we have plans to support this. Apple Silicon (and ARM64 in general) doesn't have the SSE or AVX instruction sets, so supporting them would essentially require a total rewrite of qsim from the bottom up.

basnijholt commented 12 months ago

If I find the time and motivation I could look into writing a Metal implementation.

kgantchev commented 10 months ago

> do we have a tracking bug for actually supporting fast execution on apple silicon architectures? could be cool

Might I make a suggestion? You can try FlyCI's M1 and M2 runners. Our runners are on average 2x faster and 2x cheaper than GitHub's AND we have a free tier for OSS projects (see below).

Install Instructions

Easily replace your M1 runners:

jobs:
  ci:
-    runs-on: macos-latest
+    runs-on: flyci-macos-large-latest-m1
    steps:
      - name: 👀 Checkout repo
        uses: actions/checkout@v4

Or try the M2 runners:

jobs:
  ci:
-    runs-on: macos-latest
+    runs-on: flyci-macos-large-latest-m2
    steps:
      - name: 👀 Checkout repo
        uses: actions/checkout@v4

Pricing

| Processor | vCPU | RAM (GB) | Storage | Label | Price on FlyCI | Price on GitHub |
|---|---|---|---|---|---|---|
| M1 | 4 | 7 | 28 GB | flyci-macos-large-latest-m1 | $0.06 | - |
| M1 | 8 | 14 | 28 GB | flyci-macos-xlarge-latest-m1 | $0.12 | $0.16 |
| M2 | 4 | 7 | 28 GB | flyci-macos-large-latest-m2 | $0.08 | - |
| M2 | 8 | 14 | 28 GB | flyci-macos-xlarge-latest-m2 | $0.16 | - |

500 mins/month Free for Public Repos

If your repo is public, then FlyCI offers 500 mins/month of free M1 runner usage with the flyci-macos-large-latest-m1 runner.

Best Regards, Kiril Gantchev CEO and co-founder of FlyCI
