HazyResearch / hyena-dna

Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
https://arxiv.org/abs/2306.15794
Apache License 2.0

git-lfs missing from container #14

KatharinaHoff closed this issue 8 months ago

KatharinaHoff commented 10 months ago

Hi!

Thanks for this very cool repository, the preprint is very cool, too!

I am a total beginner to using relevant machine learning libraries, but I think the git-lfs is missing from the docker container of hyena-dna. Admittedly, I did a bit of weird stuff with it because I can't execute docker on the HPC. I converted it to Singularity. Nevertheless, I'd expect to be able to call git-lfs from there, too, and it seems to be missing.

Here's what I did:

# build image from existing docker container
singularity build hyena-dna.sif docker://hyenadna/hyena-dna-public:latest

# can not use the hyena-dna inside the container because it tries to write into the same folder, therefore using a local clone, container only holds the dependencies
git clone https://github.com/HazyResearch/hyena-dna.git
cd hyena-dna

# SINGULARITYENV_CUDA_VISIBLE_DEVICES=1 says "use GPU_1" (it will otherwise use GPU_0 by default)
SINGULARITYENV_CUDA_VISIBLE_DEVICES=1 singularity exec --nv ~/images/hyena-dna.sif python -m train wandb=null experiment=hg38/genomic_benchmark_scratch # works like a charm!

SINGULARITYENV_CUDA_VISIBLE_DEVICES=1 singularity exec --nv ~/images/hyena-dna.sif python -m huggingface # fails

Error:

Using device: cuda
git: 'lfs' is not a git command. See 'git --help'.

The most similar command is
    log
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/opt/conda/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/home/hoffk83/images/git/hyena-dna/huggingface.py", line 251, in <module>
    inference_single()
  File "/home/hoffk83/images/git/hyena-dna/huggingface.py", line 209, in inference_single
    model = HyenaDNAPreTrainedModel.from_pretrained(
  File "/home/hoffk83/images/git/hyena-dna/huggingface.py", line 106, in from_pretrained
    config = json.load(open(os.path.join(pretrained_model_name_or_path, 'config.json')))
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/hyenadna-small-32k-seqlen/config.json'
(base) hoffk83@vision-05:~/images/git/hyena-dna$ ls ./checkpoints/hyenadna-small-32k-seqlen/config.json
ls: cannot access './checkpoints/hyenadna-small-32k-seqlen/config.json': No such file or directory

I think it could be fixed by adding git-lfs to the requirements.txt, and then rebuilding the container.
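To tell whether the weights were actually fetched, it can help to look at the checkpoint file directly. The sketch below uses a hypothetical helper (the name check_checkpoint_file is not from the repo) that distinguishes a missing file from a Git LFS pointer stub, the small text file starting with "version https://git-lfs.github.com/spec/v1" that git leaves behind when the LFS content was not downloaded.

```shell
# Hypothetical helper, not part of hyena-dna: classify a checkpoint file.
# Prints "missing", "lfs-pointer", or "ok".
check_checkpoint_file() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "missing"
    elif head -n 1 "$file" | grep -q '^version https://git-lfs.github.com/spec/v1'; then
        # Pointer stub: the repo was cloned, but the LFS content was not pulled.
        echo "lfs-pointer"
    else
        echo "ok"
    fi
}

# Example: the path from the traceback above.
check_checkpoint_file ./checkpoints/hyenadna-small-32k-seqlen/config.json
```

In the situation described here the clone itself fails, so the expected result would be "missing" rather than "lfs-pointer".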

exnx commented 10 months ago

Hi @KatharinaHoff, thanks so much! I forgot this dependency, which I believe is just for loading the pretrained weights from Huggingface. Good catch :) I've made the change and uploaded a new Docker image (reflected in the readme). You can now pull with (I removed the 'public' name):

docker pull hyenadna/hyena-dna:latest

Enjoy!

KatharinaHoff commented 10 months ago

Dear @exnx, thank you so much for trying to fix it. Sadly, the pip install of git-lfs did not fix the problem. Sorry, my bad, I had not tested my first suggestion. I have now figured out how to fix it. It is possibly not the most elegant way, but if you append the following to your Dockerfile, then both the Docker and the Singularity builds contain git lfs, and models can be loaded from Hugging Face:

# I have not tested the last two steps (cd .. and rm), but I think it makes sense to delete the archive; my build still has it
RUN wget https://github.com/git-lfs/git-lfs/releases/download/v3.4.0/git-lfs-linux-amd64-v3.4.0.tar.gz && \
    tar -xvf git-lfs-linux-amd64-v3.4.0.tar.gz && \
    cd git-lfs-3.4.0 && \
    ./install.sh && \
    cd .. && \
    rm git-lfs-linux-amd64-v3.4.0.tar.gz

I have tested this solution with both Docker and Singularity, and git lfs works.

Maybe you also want to add the Singularity instructions (adapted from my initial post here) to the Readme.md? Just an idea to save other people some time. I tested it with singularity-ce version 3.11.3, and all works well.
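One way to confirm that a given image really has the extension is to run git lfs version through it. This is a minimal sketch, assuming the image path from the first post; lfs_status is a made-up helper name, and any arguments it receives are used as a command prefix, so the same function can check the host or a container.

```shell
# Hypothetical helper: report whether `git lfs` is usable.
# Any arguments are treated as a command prefix, for example:
#   lfs_status singularity exec ~/images/hyena-dna.sif
lfs_status() {
    if "$@" git lfs version >/dev/null 2>&1; then
        echo "git-lfs available"
    else
        echo "git-lfs missing"
    fi
}

# On the host (no prefix):
lfs_status
```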

exnx commented 10 months ago

Thanks for the update! I haven't been able to test it out myself, but I'll report back when I do.

exnx commented 9 months ago

I ended up making a second Docker image with the Nucleotide Transformer datasets and weights, to reproduce the results from our paper. This new image includes the correct git-lfs dependency for pulling in weights from Huggingface. You can find the image here:

# pull image
docker pull hyenadna/hyena-dna-nt6:latest 

# run container
docker run --gpus all -it -p80:3000 hyenadna/hyena-dna-nt6 /bin/bash

To build the image, I used tips from this thread, which basically just means adding:

RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
RUN sudo apt-get install git-lfs

Eventually I'll add this to the main Dockerfile in the repo, but for now there are 2 Docker images.

salvatoreloguercio commented 8 months ago

Hi @KatharinaHoff and @exnx , thanks a lot for posting directions on how to generate and use a Singularity image of HyenaDNA! I tried as Katharina suggested:

singularity build hyena-dna.sif docker://hyenadna/hyena-dna-public:latest
git clone https://github.com/HazyResearch/hyena-dna.git
cd hyena-dna
SINGULARITYENV_CUDA_VISIBLE_DEVICES=1 singularity exec --nv ~/images/hyena-dna.sif python -m train wandb=null experiment=hg38/genomic_benchmark_scratch

But I am getting:

/opt/conda/bin/python: No module named train

Is anything missing on my side? Thanks!
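Since python -m train resolves the module from the working directory, a quick pre-flight check is whether the directory contains the code at all. A rough sketch, assuming train is either a top-level train.py or a train/ package in the clone; can_run_train is a hypothetical name.

```shell
# Hypothetical pre-flight check: `python -m train` needs the train module
# in the working directory (the root of the local clone).
can_run_train() {
    dir="${1:-.}"
    if [ -f "$dir/train.py" ] || [ -d "$dir/train" ]; then
        echo "ok: train module found in $dir"
    else
        echo "error: no train module in $dir" >&2
        return 1
    fi
}

# Check the current directory on the host; the same test can be repeated
# inside the container, e.g.:
#   singularity exec ~/images/hyena-dna.sif ls train.py
can_run_train . || true
```

If the host check passes but the in-container ls fails, the directory is probably not visible inside the container.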

exnx commented 8 months ago

The steps above by Katharina didn't work for me. Instead I used a different set of commands, which you can find in this image on Docker Hub. You can find the steps in the readme. I'm not especially familiar with Singularity, but there's a different step when starting the image that tunnels your local directory to the container. Otherwise you're just getting the environment, but not any of the code.

hyenadna/hyena-dna-nt6:latest

salvatoreloguercio commented 8 months ago

Great, thank you! First, I tried the nt7 version that is also available, but the converted sif image was giving errors. I will try converting nt6 and let you know.


exnx commented 8 months ago

The nt6 image should be using commands from here. I forgot if nt7 did too or not, might've been testing other things.

But specifically, the commands you want in the Dockerfile are:

RUN curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash
RUN apt-get install -y git-lfs

I updated the Dockerfile in this repo; it now includes these commands if you build your own image.

salvatoreloguercio commented 8 months ago

Thanks! I re-ran with hyena-dna-nt6 and got:

singularity build hyena-dna_nt6.sif docker://hyenadna/hyena-dna-nt6:latest
cd hyena-dna
singularity exec --nv hyena-dna_nt6.sif python -m train wandb=null experiment=hg38/genomic_benchmark_scratch
13:4: not a valid test operator: (
13:4: not a valid test operator: 510.47.03
/usr/bin/python: No module named train

This strange 'not a valid test operator' message is the same one I was getting with the sif image of hyena-dna-nt7.

Wish I could use Docker - but I am stuck with Singularity on the HPC A100 nodes I have available.

exnx commented 8 months ago

Maybe try ChatGPT? I.e., check how to turn the provided docker command into an equivalent singularity command. Sorry, we don't support Singularity on our end; we just don't use it.

exnx commented 8 months ago

eg

apptainer pull docker://hyenadna/hyena-dna-nt7:latest
apptainer exec --nv docker://hyenadna/hyena-dna-nt7:latest /bin/bash

salvatoreloguercio commented 8 months ago

No worries, thank you for all the help. I think I found the culprit: my cloned hyenaDNA folder was on a mounted drive (/mnt/ etc.) that for some reason wasn't accessible from inside the container image. I have now moved everything to my home folder and it seems to work. If I have further questions I will reach out on Discord. Thanks again!
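For anyone hitting the same thing: Singularity typically binds only a few host paths into the container by default (such as the home directory and the current directory, depending on the site configuration), so a clone under /mnt may need an explicit --bind. A sketch, with /mnt/data/hyena-dna as a purely illustrative path and singularity_train_cmd as a hypothetical helper; it only prints the command, so it can be inspected before running.

```shell
# Sketch: print a singularity command that binds the clone into the container.
# The paths used in the example call are illustrative assumptions.
singularity_train_cmd() {
    image="$1"   # path to the .sif image
    repo="$2"    # host path of the hyena-dna clone
    # --bind makes the host directory visible inside the container;
    # --pwd sets the working directory so `python -m train` can find the code.
    echo "singularity exec --nv --bind $repo:$repo --pwd $repo $image python -m train wandb=null experiment=hg38/genomic_benchmark_scratch"
}

singularity_train_cmd "$HOME/images/hyena-dna.sif" /mnt/data/hyena-dna
```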