Efficient-Large-Model / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Apache License 2.0

RuntimeError: GET was unable to find an engine to execute this computation #60

Open pribadihcr opened 1 month ago

pribadihcr commented 1 month ago

When I run this script:

python -W ignore llava/eval/run_vila.py \
    --model-path Efficient-Large-Model/VILA1.5-3b \
    --conv-mode vicuna_v1 \
    --query "<video>\n Please describe this video." \
    --video-file "demo.mp4"

with the line #from tf_utils import flatten, shape_list commented out in VILA/llava/model/multimodal_encoder/image_processor.py, I get the following error:

/envs/vila/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: GET was unable to find an engine to execute this computation
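
One way to narrow down an error like the one above is to check whether a plain conv2d call works at all in the same environment, independent of VILA. The following is a minimal diagnostic sketch, not part of the VILA codebase: the function name conv_environment_report and the tensor shapes are hypothetical, and it assumes only standard PyTorch calls. It reports the PyTorch, CUDA and cuDNN versions and tries a tiny convolution on the GPU if one is visible, otherwise on the CPU.

```python
import importlib.util


def conv_environment_report():
    """Collect facts relevant to a failing conv2d call and return them as a dict.

    Hypothetical helper for illustration; it is not part of VILA or PyTorch.
    """
    report = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if not report["torch_installed"]:
        return report

    import torch
    import torch.nn.functional as F

    report["torch_version"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None on CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()

    try:
        report["cudnn_available"] = torch.backends.cudnn.is_available()
        report["cudnn_version"] = torch.backends.cudnn.version()

        # Run a tiny convolution on the GPU if one is visible, else on the CPU.
        device = "cuda" if report["cuda_available"] else "cpu"
        report["device"] = device
        x = torch.randn(1, 3, 32, 32, device=device)
        w = torch.randn(8, 3, 3, 3, device=device)
        out = F.conv2d(x, w, stride=1, padding=1)
        report["conv2d_ok"] = True
        report["conv2d_shape"] = tuple(out.shape)
    except RuntimeError as exc:
        # Keep the message so it can be compared with the error in the traceback.
        report["conv2d_ok"] = False
        report["conv2d_error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in conv_environment_report().items():
        print(f"{key}: {value}")
```

If this small convolution fails with the same message, the problem probably lies in the PyTorch/CUDA/cuDNN installation or in available GPU memory rather than in run_vila.py. If it succeeds, the inputs reaching the model (dtype, device, shape) are the next thing to inspect.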

Efficient-Large-Language-Model commented 1 month ago

Why did you disable #from tf_utils import flatten, shape_list?