z-x-yang / Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
GNU Affero General Public License v3.0

Is it possible to run SAM-track with more than one video? #124

Open berengueradrian opened 8 months ago

berengueradrian commented 8 months ago

I would like to run the SAM-Track tool in my local environment to track, for example, 100 videos at a time instead of going through them one by one. Is this possible right now? If not, could you give me some clues about what to modify so that I can pass more than one video without using the UI?

Amazing work by the way, thanks for contributing to the community with great advancements like this one!

I look forward to your response.

yamy-cheng commented 8 months ago

Hi, I apologize, but the current UI cannot support this task at the moment. Each unannotated video requires human interaction until a satisfactory annotation is obtained, so you must handle each video individually. However, if you already have annotations for every video, you can use aot-benchmark for inference.

berengueradrian commented 8 months ago

Hi! Thanks for the fast response.

My point is that I would like to run it without the UI. All of my videos are unannotated, and I want to track objects in them (for example, segmenting the birds that appear, so that I can later train a model to recognize the actions they perform; I would annotate each bird's actions on a timeline after tracking with SAM-Track). Since the videos form a dataset and there are a lot of them, I wanted to generate the trackings by passing them all to the tool at once instead of one by one.

So what I can do is use the model with the provided weights but pass several videos instead of just one. Yes, I will need to use aot-benchmark; the only thing is that I wanted to drive it with text prompts, as the tool does, so that I could describe what I want to extract from each video.

yamy-cheng commented 8 months ago

Obtaining satisfactory annotations can be challenging without the web UI. However, you can try using demo_instseg.ipynb and modifying it to achieve your goal.
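
For reference, here is a minimal sketch of how that notebook's flow could be wrapped in a loop over a folder of videos, with one text prompt per video. The class and function names are taken from this repo's demo code (SegTracker.py, model_args.py) as I understand it, but the exact signatures, return values, and thresholds may need adjusting; treat it as a starting point, not a tested script.

import os
import cv2
import torch

# These names come from the repo's demo code; adjust imports to your checkout.
from model_args import segtracker_args, sam_args, aot_args
from SegTracker import SegTracker

# Hypothetical mapping: one grounding text prompt per video file.
PROMPTS = {
    "video_001.mp4": "bird",
    "video_002.mp4": "bird",
}

def track_video(video_path, caption, out_dir):
    # Segment the first frame from the text prompt, then propagate the masks.
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    segtracker = SegTracker(segtracker_args, sam_args, aot_args)
    segtracker.restart_tracker()
    frame_idx = 0
    with torch.no_grad():
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            if frame_idx == 0:
                # detect_and_seg combines Grounding-DINO and SAM; the
                # threshold values here are guesses, tune them per dataset.
                pred_mask, _ = segtracker.detect_and_seg(
                    frame, caption, box_threshold=0.25, text_threshold=0.25)
                segtracker.add_reference(frame, pred_mask)
            else:
                pred_mask = segtracker.track(frame, update_memory=True)
            # Save the per-frame object-id mask as a PNG.
            cv2.imwrite(os.path.join(out_dir, "%05d.png" % frame_idx),
                        pred_mask.astype("uint8"))
            frame_idx += 1
    cap.release()
    torch.cuda.empty_cache()

if __name__ == "__main__":
    for name, caption in PROMPTS.items():
        track_video(os.path.join("videos", name), caption,
                    os.path.join("results", os.path.splitext(name)[0]))

Nothing in SegTracker itself is tied to the UI; the gradio app is just one driver around it.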

berengueradrian commented 8 months ago

Ok, thank you very much for your help, I will try and do that.

berengueradrian commented 8 months ago

Hi @yamy-cheng, I am sorry to bother you, but I am having a lot of problems trying to build a Dockerfile so that I can run the tool on a server, manage it myself, and modify it to handle more than one video. Due to incompatibilities between the required libraries, I am not able to build the image. I was wondering if you could give me any hints. You can see my error in detail here: https://github.com/ClementPinard/Pytorch-Correlation-extension/issues/104. I attach my Dockerfile here:

# Use an NVIDIA CUDA base image.
# A -devel image is needed because spatial-correlation-sampler is compiled
# from source (it needs nvcc), and the CUDA version should match the one the
# PyTorch wheels were built with (torch 2.1.0 defaults to CUDA 12.1).
FROM nvidia/cuda:12.1.0-devel-ubi8

# Install essential utilities
RUN dnf install -y \
    git \
    wget \
    curl \
    cmake \
    gcc-c++ \
    python3-devel \
    python3-pip \
    protobuf-compiler \
    && dnf clean all

# Download and install Miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /miniconda.sh \
    && bash /miniconda.sh -b -p /miniconda \
    && rm /miniconda.sh

# Add Miniconda to PATH
ENV PATH="/miniconda/bin:${PATH}"

# Create a Conda environment and install ONNX.
# Note: torch 2.1.0 requires Python >= 3.8, so Python 3.7 cannot work here.
# "source activate" inside a RUN line does not persist to later layers;
# putting the environment's bin directory on PATH does.
RUN conda create -y --name samtrack python=3.9 \
    && conda install -y -n samtrack -c conda-forge onnx
ENV PATH="/miniconda/envs/samtrack/bin:${PATH}"

# Upgrade pip, then install PyTorch and TorchVision
# (the default torch 2.1.0 wheels are built against CUDA 12.1, matching the
# base image above)
RUN python -m pip install --upgrade pip \
    && pip install torch==2.1.0 torchvision

# Set the CUDA_HOME environment variable.
# This must be done with ENV: "RUN export ..." does not persist beyond its
# own layer.
ENV CUDA_HOME="/usr/local/cuda"

# Install SAM and its dependencies
RUN pip install git+https://github.com/facebookresearch/segment-anything.git
RUN pip install Cython
RUN pip install opencv-python-headless Pillow pycocotools matplotlib

# Set a working directory for the cloned repositories
# (the WORKDIR further below assumes the clones live under /workdir)
WORKDIR /workdir

# Clone DeAOT (aot-benchmark)
RUN git clone https://github.com/yoxu515/aot-benchmark.git

# Install PyTorch Correlation from PyPI. This is the step that compiles CUDA
# code and therefore needs nvcc and CUDA_HOME; building from source via
# ClementPinard/Pytorch-Correlation-extension is an alternative if it fails.
RUN pip install spatial-correlation-sampler

# Clone the SAM-Track repository
RUN git clone https://github.com/z-x-yang/Segment-and-Track-Anything.git

# Change to the SAM-Track directory
WORKDIR /workdir/Segment-and-Track-Anything

# Install SAM-Track
RUN bash script/install.sh

# Download default weights
RUN mkdir -p ./ckpt && bash script/download_ckpt.sh

# Install gradio for the web UI
RUN pip install gradio==3.39.0

# Run the UI app
CMD ["python3", "app.py"]