I am trying to use this solution on Windows with CUDA compute capability 8.6. I am running into a link-time issue where the function LLaVAGenerate is an unresolved symbol, as shown in the screenshot below
Steps to replicate:
Environment:
Visual Studio 2022 with MSVC toolset 14.29.30133 (cl.exe)
CUDA Toolkit 12
LLaMA2 13B AWQ int4 model, downloaded using the command python tools/download_model.py --model LLaMA2_13B_chat_awq_int4 --QM QM_CUDA
pthread package from vcpkg, linked by pointing the project directly at its include and lib files
Fixed a few issues with NUM_THREAD and tanhf not being defined, then built using the command make chat -j
My guess is that the only reference to LLaVAGenerate is in the non_cuda directory, so maybe it is being omitted from the CUDA compilation? Note that compiling with the CPU flag works fine and I can get output from the LLM.
PATH during the build:
/c/CUDA/v12/libnvvp:/c/CUDA/v12/bin:/c/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64:/ucrt64/bin:/usr/local/bin:/usr/bin:/bin:/c/Windows/System32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl