Open · mdelhaous opened this issue 4 months ago
This is likely caused by using xformers operations while running on cpu, which generally doesn't work. If you know you're going to be running on a cpu-only system, it might be best to avoid installing xformers altogether, since it seems (in my experience) to go out and pull other gpu requirements.
If you can't uninstall xformers, or just want to leave it in to support cuda systems where possible, it looks like the dinov2 model includes support for disabling xformers by setting an environment variable: `XFORMERS_DISABLED`.
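For example, a minimal sketch of setting it from Python rather than the Dockerfile (it has to happen before anything imports dinov2/xformers, since the flag seems to be checked at import time):

```python
import os

# Disable xformers before anything imports it (dinov2 appears to read this at import time)
os.environ["XFORMERS_DISABLED"] = "1"

import torch

# Loading a DINOv2 model from torch hub; with xformers disabled it should fall back
# to the plain PyTorch attention path, which works on cpu
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()
```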
> This is likely caused by using xformers operations while running on cpu, which generally doesn't work. If you know you're going to be running on a cpu-only system, it might be best to avoid installing xformers altogether, since it seems (in my experience) to go out and pull other gpu requirements.
>
> If you can't uninstall xformers, or just want to leave it in to support cuda systems where possible, it looks like the dinov2 model includes support for disabling xformers by setting an environment variable: `XFORMERS_DISABLED`
But why does the API work locally and not in the container? (I use the same requirements locally.)
> Locally, the API runs on a system with CUDA support
If there is cuda support when running outside of docker, then I think xformers will use it without any issues. If it doesn't work inside docker on that same machine, then it's likely an issue with the docker image not having the right cuda dependencies or not having gpu passthrough working (both of which are a pain to debug, in my experience).
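One quick way to tell which case you're in is to run a small check from inside the container and see what torch reports (just a diagnostic sketch, nothing dinov2-specific):

```python
import torch

# Report what this environment can actually see
print("torch version :", torch.__version__)
print("built with cuda:", torch.version.cuda)        # None for cpu-only wheels
print("cuda available:", torch.cuda.is_available())  # False if gpu passthrough isn't working
if torch.cuda.is_available():
    print("gpu:", torch.cuda.get_device_name(0))
```

If `torch.version.cuda` is None, the image pulled cpu-only wheels; if it's set but `is_available()` is False, the gpu isn't being passed through to the container.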
The issue is resolved! I added the following line to my Dockerfile:
`ENV XFORMERS_DISABLED=1`
Thank you so much @heyoeyo
Hello @heyoeyo,
Hope you are doing well
Issues:
AWS Deployment Error:
When running the Docker container on an AWS machine, I encounter the following error: `WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested`, followed by `exec /opt/nvidia/nvidia_entrypoint.sh: exec format error`.
Mac Build Error:
When building the Docker image on a Mac, I encounter the following error: `ERROR: Cannot install torch==2.0.1 and torchvision==0.15.1 because these package versions have conflicting dependencies.`, followed by `ResolutionImpossible` (for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts).
Goal:
I aim to create multi-architecture Docker builds to ensure compatibility across all platforms (e.g., AMD64, ARM64). I'm not entirely sure about Mac support for DINOv2 yet, but I'm looking for a viable solution.
Attempts:
Questions:
- What are the best practices for creating multi-architecture Docker builds for such projects?
- How can I resolve the exec format error on AWS?
- How can I manage the dependency conflict for torch and torchvision during the Docker build on Mac?
Thank you in advance for your help!
> What are the best practices for creating multi-architecture Docker builds for such projects?
I've only ever worked with x86, so unfortunately I'm not familiar with ARM devices/deployments or how best to handle multi-architecture builds, sorry!
> How can I resolve the exec format error on AWS?
As far as I can tell, the error is just saying that the docker image is for x86, but it's trying to run on an ARM device. It looks like some images on docker hub have both x86 & ARM versions, but the base image you're using may not have this (at a quick glance, the pytorch base images only seem to be available in x86, for example), so that may be the issue. If possible, switching to a base image that is already built with ARM support may resolve the issue.
> How can I manage the dependency conflict for torch and torchvision during the Docker build on Mac?
The simplest thing to try would be to loosen the strictness of the requirements, for example by using `torch==2.0.*` and `torchvision==0.15.*`, so that more versions are allowed when resolving the dependencies.
If you want broader support, what I would probably do is first install without version restrictions to see what versions get installed (using `pip list`) and treat those as upper bounds. Let's say that gives you torch 2.3.1 and torchvision 0.18.1. Then re-install, but with restrictions like `torch<2.3` and `torchvision<0.18`, and see what that installs. You can keep lowering the versions until your scripts no longer work. That should give you lower/upper bounds on versions that you know will work, which you can then use in your requirements to keep them flexible. I've done something like this in this requirements.txt file, for example.
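If it's easier, the version check can also be done from inside the environment rather than reading pip list output, something like this (just a sketch):

```python
from importlib import metadata

# Print whatever versions pip actually resolved, to use as upper bounds later
for pkg in ("torch", "torchvision", "xformers"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "(not installed)")
```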
Hello @heyoeyo,
First of all, thank you so much for your responses!
I'm encountering a dependency conflict when building a Docker image for the linux/arm64 platform. The build fails with the following error, but building for the default platform works fine.
Issue: The build for the linux/arm64 platform fails due to version conflicts between torch, torchvision, torchmetrics, and xformers. However, building for the default platform works without issues.

Questions:
Any insights or suggestions would be greatly appreciated!
Working Build Command: `docker build -t myfastapiapp .`
Failing Build Command: `docker build --platform linux/arm64 -t myfastapiapp-arm64 .`
Error Message (for linux/arm64):
```
ERROR: Cannot install -r requirements.txt (line 3), -r requirements.txt (line 5), -r requirements.txt (line 8) and torch==2.0.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested torch==2.0.0
    torchvision 0.15.0 depends on torch
    torchmetrics 0.10.3 depends on torch>=1.3.1
    xformers 0.0.27 depends on torch>=2.2

To fix this you could try to:
```
Based on the error:

> xformers 0.0.27 depends on torch>=2.2

It looks like having torch 2.0.0 and xformers 0.0.27 isn't possible.
xFormers is pretty specific to cuda devices, which I'm not sure even work with ARM...? So one easy fix may just be to remove xformers altogether for the ARM build (dinov2 will work without it).

Alternatively, I think you can either downgrade xformers to a version that works with torch 2.0.0, or upgrade torch to >=2.2. For example, on my local copy of dinov2 I have torch 2.0.0 with xformers 0.0.18, so `xformers==0.0.18` may fix the problem. Though assuming dinov2 still works with newer pytorch versions (i.e. >=2.2), it's probably better to raise the torch requirement, just for better longevity.
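If you do want to keep xformers around for the x86/cuda image, another option is a small guard at startup that only disables it when it can't actually be used, relying on the same XFORMERS_DISABLED variable mentioned earlier (a rough sketch, not something from the dinov2 repo):

```python
import importlib.util
import os

import torch

# Hypothetical startup guard: keep xformers for cuda builds, but disable it when the
# package is missing (e.g. an ARM image without it) or there is no gpu to run it on
if importlib.util.find_spec("xformers") is None or not torch.cuda.is_available():
    os.environ["XFORMERS_DISABLED"] = "1"

# Import/load dinov2 only after the variable is set
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
```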
Thank you so much @heyoeyo!
I'm currently working on an API using FastAPI to serve DINOv2 models from the official DINOv2 repository. The API works well locally, but when I run it in a Docker container, I encounter an error related to the memory-efficient attention forward operator.
API.py
```python
from fastapi import FastAPI, File, UploadFile, HTTPException, Form
from fastapi.responses import JSONResponse
import torch
from PIL import Image, UnidentifiedImageError
from torchvision import transforms
import io
import requests  # Import the requests library
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = FastAPI()

MODEL_MAP = {
    'dinov2_vitl14': 'dinov2_vitl14',
    'dinov2_vits14': 'dinov2_vits14',
    'dinov2_vitb14': 'dinov2_vitb14',
    'dinov2_vitg14': 'dinov2_vitg14',
}

def load_model(model_name: str):
    if model_name not in MODEL_MAP:
        raise ValueError(f"Model {model_name} is not supported.")
    model = torch.hub.load('facebookresearch/dinov2', MODEL_MAP[model_name])
    model.eval()
    return model

def preprocess_image(image):
    input_image = Image.open(io.BytesIO(image)).convert('RGB')
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    input_tensor = preprocess(input_image)
    input_batch = input_tensor.unsqueeze(0)  # Create a mini-batch as expected by the model
    return input_batch

def infer(model, input_batch):
    with torch.no_grad():
        output = model(input_batch)
    return output

@app.post("/infer/")
async def infer_image(file: UploadFile = File(...), model_name: str = Form(...)):
    try:
        if model_name not in MODEL_MAP:
            raise HTTPException(status_code=400, detail="Invalid model name provided.")
        # ...

@app.post("/infer-url/")
async def infer_image_url(url: str = Form(...), model_name: str = Form(...)):
    try:
        if model_name not in MODEL_MAP:
            raise HTTPException(status_code=400, detail="Invalid model name provided.")
        # ...

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
Dockerfile
```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "API:app", "--host", "0.0.0.0", "--port", "8000"]
```
requirements.txt:

```
--extra-index-url https://download.pytorch.org/whl/cu117
torch==2.0.0
torchvision==0.15.0
omegaconf
torchmetrics==0.10.3
fvcore
iopath
xformers==0.0.18
submitit
--extra-index-url https://pypi.nvidia.com
cuml-cu11
fastapi
Pillow
requests
uvicorn
```
Problem: When I run the API locally, both endpoints work fine. However, when running the API in the Docker container, I get the following error:
{ "detail": "No operator found for memory_efficient_attention_forward with inputs:\n query : shape=(1, 257, 6, 64) (torch.float32)\n key : shape=(1, 257, 6, 64) (torch.float32)\n value : shape=(1, 257, 6, 64) (torch.float32)\n attn_bias : <class 'NoneType'>\n p : 0.0\ncutlassF is not supported because:\n device=cpu (supported: {'cuda'})\nflshattF is not supported because:\n device=cpu (supported: {'cuda'})\n dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})\ntritonflashattFis not supported because:\n device=cpu (supported: {'cuda'})\n dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})\n Operator wasn't built - seepython -m xformers.info for more info\n triton is not available\nsmallkFis not supported because:\n max(query.shape[-1] != value.shape[-1]) > 32\n unsupported embed per head: 64" }
Additional Information: The error seems to indicate that the CPU is not supported for the memory-efficient attention forward operator. Locally, the API runs on a system with CUDA support. In the Docker container, the application seems to be running on the CPU.

Question: How can I resolve this error and make the API work correctly in the Docker container?
NB: I DON'T HAVE AN NVIDIA GPU ON MY MACHINE. Any insights or suggestions would be greatly appreciated. Thank you!