MAGICS-LAB / DNABERT_2

[ICLR 2024] DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome
Apache License 2.0

problem still in environment #71

Closed xueleecs closed 4 months ago

xueleecs commented 4 months ago

I have created a new virtual environment and followed your README.

create and activate virtual python environment

conda create -n dna python=3.8
conda activate dna

install required packages

python3 -m pip install -r requirements.txt
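For reference, a minimal sanity-check sketch (not from the README) that can be run inside the freshly created environment to see which torch build was installed and whether it can see a GPU, before debugging the model itself:

```python
# Minimal environment check (illustrative, run inside the "dna" conda env):
# confirms the installed torch build and whether CUDA is visible to it.
import torch

print(torch.__version__)          # a "+cpu" suffix indicates a CPU-only build
print(torch.version.cuda)         # None for CPU-only builds
print(torch.cuda.is_available())  # False reproduces the situation in this issue
```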

but it still cannot run. This problem may be about CUDA (according to ChatGPT). Should I:

  1. pip uninstall torch
  2. conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch (my CUDA is 11.4)

Can you tell me if these two steps are correct?
Zhihan1996 commented 4 months ago

What's your current error message?

xueleecs commented 4 months ago

No obvious error message, just the following traceback, so I asked ChatGPT.

AssertionError                            Traceback (most recent call last)
Cell In[4], line 3
      1 dna = "ACGTAGCATCGGATCTATCTATCGACACTTGGTTATCGATCTACGAGCATCTCGTTAGC"
      2 inputs = tokenizer(dna, return_tensors = 'pt')["input_ids"]
----> 3 hidden_states = model(inputs)[0] # [1, sequence_length, 768]
      5 # embedding with mean pooling
      6 embedding_mean = torch.mean(hidden_states[0], dim=0)

File ~/.conda/envs/dnat/lib/python3.8/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)
.......
File ~/.cache/huggingface/modules/transformers_modules/DNABERT-2-117M/flash_attn_triton.py:1021, in _FlashAttnQKVPackedFunc.forward(ctx, qkv, bias, causal, softmax_scale)
   1019 if qkv.stride(-1) != 1:
   1020     qkv = qkv.contiguous()
-> 1021 o, lse, ctx.softmax_scale = _flash_attn_forward(
   1022     qkv[:, :, 0],
   1023     qkv[:, :, 1],
   1024     qkv[:, :, 2],
   1025     bias=bias,
   1026     causal=causal,
   1027     softmax_scale=softmax_scale)
   1028 ctx.save_for_backward(qkv, o, lse, bias)
   1029 ctx.causal = causal

File ~/.cache/huggingface/modules/transformers_modules/DNABERT-2-117M/flash_attn_triton.py:781, in _flash_attn_forward(q, k, v, bias, causal, softmax_scale)
    778 assert q.dtype == k.dtype == v.dtype, 'All tensors must have the same type'
    779 assert q.dtype in [torch.float16,
    780     torch.bfloat16], 'Only support fp16 and bf16'
--> 781 assert q.is_cuda and k.is_cuda and v.is_cuda
    782 softmax_scale = softmax_scale or 1.0 / math.sqrt(d)
    784 has_bias = bias is not None

And ChatGPT told me: it looks like the error is occurring within a custom implementation of the attention mechanism (_FlashAttnQKVPackedFunc.forward). The assertion error is raised because the tensors q, k, and v are expected to be on a CUDA device (GPU), but the check assert q.is_cuda and k.is_cuda and v.is_cuda fails.

To resolve this issue, make sure that the tensors involved in the attention mechanism are on the same device. You can achieve this by explicitly moving the tensors to the GPU using the .to(device) method.

So I checked the device:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

and torch.cuda.is_available() returned False. So I think I should uninstall torch and install the CUDA builds listed above: pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch (my CUDA is 11.4).
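For context, here is a minimal sketch of what the .to(device) suggestion amounts to, assuming the model and tokenizer are loaded as in the DNABERT-2 README (the hub path is taken from the README; this is illustrative, not a confirmed fix, and it only helps if a CUDA-enabled torch build is installed — the actual resolution in this thread turned out to be uninstalling triton):

```python
# Illustrative sketch: move both the model and the input tensor to the same
# device before the forward pass. Requires a CUDA-enabled torch build.
import torch
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)
model = AutoModel.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True).to(device)

dna = "ACGTAGCATCGGATCTATCTATCGACACTTGGTTATCGATCTACGAGCATCTCGTTAGC"
inputs = tokenizer(dna, return_tensors="pt")["input_ids"].to(device)

hidden_states = model(inputs)[0]                      # [1, sequence_length, 768]
embedding_mean = torch.mean(hidden_states[0], dim=0)  # mean pooling over tokens
```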

Zhihan1996 commented 4 months ago

Please try "pip uninstall triton".

xueleecs commented 4 months ago

Yes, I created a new environment, and I did not pip install triton.

Zhihan1996 commented 4 months ago

It automatically installs triton, so you need to manually remove it.
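For anyone hitting the same error, a small check sketch (an assumption based on this thread: the remote DNABERT-2 code only takes the triton flash-attention path when triton is importable, so removing it makes the model fall back to a path that also runs on CPU):

```python
# Verify that triton is gone after `pip uninstall triton` in this environment.
import importlib.util

if importlib.util.find_spec("triton") is None:
    print("triton not installed -> DNABERT-2 should use the regular attention path")
else:
    print("triton still present -> run `pip uninstall triton` in this environment")
```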

xueleecs commented 4 months ago

YES, you are right!!!! Thank you so much ~