georgeamccarthy / protein_search

The neural search engine for proteins.
GNU Affero General Public License v3.0

Computing embeddings during indexing takes exceptionally long on Apple Silicon #27

Closed georgeamccarthy closed 3 years ago

georgeamccarthy commented 3 years ago

Describe the bug: In earlier versions of protein_search (I'm not sure which), the indexing stage was quick (under about 10 seconds). Now computing the embeddings:

outputs = self.model(**encoded_inputs)

takes a long time (over a minute) on my computer (MacBook Air 2020, M1, macOS Big Sur 11.4).
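To put a number on the slowdown, the call can be wrapped in a timer. The following is a minimal sketch, not code from protein_search: `timed` is a hypothetical helper, and `sum` stands in for the model call so the snippet runs on its own.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn with the given arguments and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload. In the indexing code the equivalent call would be
# something like: outputs, secs = timed(self.model, **encoded_inputs)
result, secs = timed(sum, range(1_000_000))
print(f"call took {secs:.3f} s")
```

Running the same measurement on each setup gives a like-for-like comparison instead of a rough recollection.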

To reproduce:

  1. Delete embeddings/proteins.json if it exists
  2. Activate environment
  3. Run python backend/app.py
  4. Observe that computing the embeddings takes over a minute

Expected behavior: Indexing should be faster; I recall it finishing in under about 10 seconds in earlier versions.

Desktop: MacBook Air 2020 (M1), macOS Big Sur 11.4

georgeamccarthy commented 3 years ago

Restarted system → no change.

georgeamccarthy commented 3 years ago

I think the slow performance on the otherwise powerful M1 processor is most likely because Python is running through Rosetta 2. I checked my running processes, and the Anaconda distribution I'm using is an x86-64 build running under Rosetta 2.
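The same check can be made from inside Python rather than from the process list. This is a minimal sketch: `rosetta_status` is a hypothetical helper name, and it relies on the macOS `sysctl.proc_translated` key, which reports 1 for a process translated by Rosetta 2 and 0 for a native one. On other systems, or where the key is missing, it reports "unknown".

```python
import platform
import subprocess


def rosetta_status():
    """Return 'native', 'translated', or 'unknown' for the current process."""
    if platform.system() != "Darwin":
        return "unknown"  # Rosetta 2 only exists on macOS
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return "unknown"  # sysctl binary not available
    value = result.stdout.strip()
    if value == "1":
        return "translated"
    if value == "0":
        return "native"
    return "unknown"  # key missing, e.g. on Intel Macs


if __name__ == "__main__":
    # An x86-64 interpreter reports 'x86_64' here even on an M1 machine.
    print("Interpreter architecture:", platform.machine())
    print("Rosetta status:", rosetta_status())
```

Run it with the same interpreter that launches backend/app.py, since a different Python on the same machine can give a different answer.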

I'll try the experimental ARM builds, but otherwise I'm closing this issue, as it's simply down to my setup.

See https://www.anaconda.com/blog/apple-silicon-transition for more info

Edit: trying to use the ARM versions broke everything, so I'm just going to settle for slow for now.