SysBioChalmers / GECKO

Toolbox for including enzyme constraints on a genome-scale model.
http://sysbiochalmers.github.io/GECKO/
MIT License

DLKcat on Mac with Apple Silicon (arm64 architecture) #393

Open Mengzhensw opened 1 month ago

Mengzhensw commented 1 month ago

Hi, I’m using a Mac to run GECKO and encountered the following error when executing runDLKcat():

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

To address this, I modified the Docker command in runDLKcat.m to:

status = system(['docker run --platform linux/amd64 --rm -v "' fullfile(params.path,'/data') '":/data ghcr.io/sysbiochalmers/dlkcat-gecko:0.1 /bin/bash -c "python DLKcat.py /data/DLKcat.tsv /data/DLKcatOutput.tsv"']);

However, this results in an extremely long runtime—more than 20 hours for processing a single sample. Are there any recommendations or solutions to improve the performance? Thanks!
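
The warning above says the image is published for linux/amd64 only, so on an arm64 host Docker has to emulate it, and emulated execution is typically much slower than native. A minimal Python sketch of that check follows; `docker_platform` and `needs_emulation` are hypothetical helper names for illustration, not part of GECKO or Docker.

```python
import platform


def docker_platform(machine: str) -> str:
    """Map a CPU name as reported by the OS to a Docker platform string.

    Hypothetical helper for illustration only.
    """
    machine = machine.lower()
    if machine in ("arm64", "aarch64"):
        return "linux/arm64"  # Docker may print this as linux/arm64/v8
    if machine in ("x86_64", "amd64"):
        return "linux/amd64"
    raise ValueError(f"unrecognised architecture: {machine}")


def needs_emulation(host_machine: str, image_platform: str = "linux/amd64") -> bool:
    """True when the image platform differs from the host's native platform."""
    return docker_platform(host_machine) != image_platform


if __name__ == "__main__":
    host = platform.machine()
    print(host, "->", docker_platform(host))
    if needs_emulation(host):
        print("An amd64-only image would run under emulation on this host.")
```

On Apple Silicon, `platform.machine()` usually reports `arm64`, so the 0.1 image would count as emulated here.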

edkerk commented 1 month ago

I don't have access to a Mac with Apple Silicon CPU, but I imagine that this might be where the problem lies. @simas232 have you run DLKcat on Apple Silicon CPU?

mihai-sysbio commented 1 month ago

"You can use servbay instead of docker, which is better than docker on mac"

Had a quick glance at that, it seems servbay is meant for a different purpose.

That being said, the root cause has been correctly identified: the Docker image would need to be updated for wider support.

mihai-sysbio commented 1 month ago

I gave this a go by running docker buildx build --platform linux/amd64,linux/arm64/v8 . and encountered:

=> ERROR [linux/arm64 3/3] RUN pip install --no-cache-dir -r requirements.txt torch@https://download.pytorch.org/whl/cpu/torch-1.9.1%2Bcpu-cp39-cp39-linux_x86_64.whl

There is an error while installing the dependencies from requirements.txt for the arm64 platform, so it looks like this needs more investigation and it won't be a 5 min job. Anyone wanting to give this a go?
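
One likely contributor to the failure is visible in the URL itself: the wheel filename ends in linux_x86_64, which is its platform tag, and pip only installs a wheel whose platform tag matches the target. A small sketch that reads the tag from the URL follows; `wheel_platform_tag` and `wheel_matches_arch` are hypothetical names, and the matching rule is a simplification of what pip actually does.

```python
from urllib.parse import unquote


def wheel_platform_tag(url: str) -> str:
    """Return the platform tag: the last dash-separated field of a wheel filename."""
    filename = unquote(url.rsplit("/", 1)[-1])  # %2B decodes to '+'
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename")
    return filename[: -len(".whl")].split("-")[-1]


def wheel_matches_arch(url: str, arch: str) -> bool:
    """Simplified check; arch is a Linux machine name such as 'x86_64' or 'aarch64'."""
    tag = wheel_platform_tag(url)
    return tag == "any" or tag.endswith("_" + arch)


# URL copied from the failing build step above.
URL = "https://download.pytorch.org/whl/cpu/torch-1.9.1%2Bcpu-cp39-cp39-linux_x86_64.whl"

if __name__ == "__main__":
    print(wheel_platform_tag(URL))
    print("usable on aarch64:", wheel_matches_arch(URL, "aarch64"))
```

This suggests the arm64 stage needs a torch requirement that is not pinned to an x86_64 wheel URL.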

simas232 commented 1 month ago

DLKcat works fine on MacBook M1

edkerk commented 1 month ago

You could otherwise try to make your own Docker with the following code. In src/dlkcat-gecko/, change Dockerfile to:

FROM python:3.9-slim

LABEL org.opencontainers.image.source=https://github.com/sysbiochalmers/gecko
LABEL version="0.2-arm"
LABEL description="Custom Docker image of SysBioChalmers/DLKcat adapted for SysBioChalmers/GECKO version 3"

COPY . .
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install torch==1.9.0 --index-url https://download.pytorch.org/whl/cpu

and change requirements.txt to:

scikit-learn>=0.23.2
Biopython==1.78
rdkit-pypi
pandas
SciPy
NumPy<2

Then, while in dlkcat-gecko, run the following line in your Terminal:

docker buildx build -t ghcr.io/sysbiochalmers/dlkcat-gecko:0.2-arm --platform linux/arm64/v8 .

Finally, in runDLKcat.m line 52, you should refer to the right Docker image by using the tag that was passed to -t in the build command (note 0.2-arm, the same string as LABEL version= in the Dockerfile):

status = system(['docker run --rm -v "' fullfile(params.path,'/data') '":/data ghcr.io/sysbiochalmers/dlkcat-gecko:0.2-arm /bin/bash -c "python DLKcat.py /data/tempDLKcat.tsv /data/tempDLKcatOutput.tsv"']);

When I tried this on a Windows 10 PC, the resulting image was less than 300 MB, which is substantially less than the 1.9 GB of the current Docker image. So I strongly doubt that my attempt worked. But maybe you will have more success by running this directly on an M1-M4 chip.

Mengzhensw commented 1 month ago

Hi, many thanks for your reply, I made the modifications accordingly, but encountered the following error when running runDLKcat():

Running DLKcat prediction, this may take many minutes, especially the first time.
Traceback (most recent call last):
  File "//DLKcat.py", line 26, in <module>
    fingerprint_dict = load_pickle('input/fingerprint_dict.pickle')
  File "//DLKcat.py", line 24, in load_pickle
    return pickle.load(f)
_pickle.UnpicklingError: invalid load key, 'v'.
Error using runDLKcat
DLKcat encountered an error or it did not create any output file.

edkerk commented 1 month ago

As I don't have a Mac with Apple Silicon CPU, I unfortunately cannot give further support. But from this earlier comment it appears that the original Docker image should work on at least M1 CPUs. Maybe you want to look into OrbStack? Note that you would then also want to change the runDLKcat function to use OrbStack instead. Again, I can give no support for this.

Mengzhensw commented 1 month ago

Thanks for the reply! I’ll give it a try and see how it goes.

mihai-sysbio commented 1 month ago

Thanks for trying this out @edkerk. I, too, find the size difference very surprising. In any case, your approach has triggered my curiosity, so I've managed to push a multiarch version of the image that was built for arm64 and amd64.

@Mengzhen-Li-sw it would be great if you could give it a try https://github.com/SysBioChalmers/GECKO/pkgs/container/dlkcat-gecko/291906062?tag=0.1-multiarch

Note: I haven't tested this at all - my M3 machine is not set up with Matlab.

edkerk commented 1 month ago

@mihai-sysbio It doesn't work on PC, but this is because it uses numpy 2. Please see my suggested changes to requirements.txt and Dockerfile; these are also required for amd64.
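
To see which NumPy version actually ended up in an image, something like the following could be run inside the container. It is only a sketch using the standard library; `major_version` and `numpy_status` are hypothetical helper names, and the threshold mirrors the NumPy<2 line in the suggested requirements.txt.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Leading integer of a version string such as '1.26.4'."""
    return int(version.split(".", 1)[0])


def numpy_status() -> str:
    """Report whether the installed NumPy satisfies the NumPy<2 pin."""
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return "numpy is not installed"
    if major_version(installed) >= 2:
        return f"numpy {installed} violates the NumPy<2 pin"
    return f"numpy {installed} satisfies the NumPy<2 pin"


if __name__ == "__main__":
    print(numpy_status())
```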

mihai-sysbio commented 1 month ago

Fantastic you caught that, I've applied the fix, see #395. In the meantime, I've deleted the published 0.1-multiarch and a new one is being uploaded.

edit: upload finished, please test again

edkerk commented 1 month ago

It didn't work, I get the error message:

Traceback (most recent call last):
  File "//DLKcat.py", line 26, in <module>
    fingerprint_dict = load_pickle('input/fingerprint_dict.pickle')
  File "//DLKcat.py", line 24, in load_pickle
    return pickle.load(f)
_pickle.UnpicklingError: invalid load key, 'v'.

When I run with 0.1 instead of 0.1-multiarch there is no problem, so the solution I first thought I had found (#396) does not resolve this.
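
The message "invalid load key, 'v'" means the first byte of input/fingerprint_dict.pickle is the letter v, whereas a pickle written with protocol 2 or newer starts with the byte 0x80. One possibility, which nobody in this thread has confirmed, is that the file copied into the image is a text placeholder such as a Git LFS pointer, which begins with the word "version". A minimal sketch for inspecting the file header follows; `describe_head` and `describe_file` are hypothetical helper names.

```python
import sys


def describe_head(head: bytes) -> str:
    """Classify a file from its first bytes."""
    if head[:1] == b"\x80":
        return "binary pickle (protocol 2 or newer)"
    if head.startswith(b"version https://git-lfs"):
        return "Git LFS pointer file, not the real data"
    if head[:1] == b"v":
        return "starts with 'v': probably text, not a pickle"
    return "unrecognised: protocol 0/1 pickle or some other format"


def describe_file(path: str) -> str:
    """Read the first 64 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return describe_head(handle.read(64))


if __name__ == "__main__":
    # Usage (assumed path): python check_pickle.py input/fingerprint_dict.pickle
    for path in sys.argv[1:]:
        print(path, "->", describe_file(path))
```

Running this on the same file in the 0.1 and 0.1-multiarch images would show whether the two images contain different files.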

Mengzhensw commented 1 month ago

I also tried on Apple Silicon and encountered the same error.

mihai-sysbio commented 3 weeks ago

Thanks @SilentWaveSW for confirming - could you please follow in #396?

edkerk commented 3 weeks ago

The "solution" in #396 does not work. In that Issue there is a comment:

I'm wondering if it could have something to do with the new packages that are used in obtaining the new image.

I can test this by again making a 0.1-amd64-only container, but if these changes are made to the Dockerfile and requirements.txt it should work.