intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Apache License 2.0

[Questions] some questions on the model conversion side #743

Closed: park12sj closed this 9 months ago

park12sj commented 9 months ago

Hello, I am trying to convert a model that is not on the ITREX support list by following the guide.

I have some questions.

  1. In the case of building from source, from which path should I run the following?
    # Linux
    git submodule update --init --recursive
    mkdir build
    cd build
    cmake .. -G Ninja
    ninja

    There is no CMakeLists.txt file in the top-level directory.

However, even if I run it in the intel_extension_for_transformers/llm/runtime/graph directory, which does contain a CMakeLists.txt, no .so files are produced (e.g. intel_extension_for_transformers/llm/runtime/graph/gptneox_cpp.cpython-310-x86_64-linux-gnu.so).

  2. Please explain in detail how new_model_mem_req is calculated: https://github.com/intel/intel-extension-for-transformers/blob/e0144c0a1f81033b8c58b28945fc476b20ee2d86/intel_extension_for_transformers/llm/runtime/graph/models/gptneox/gptneox.h#L26-L37

  3. I wonder how gptneox.cpp refers to configuration values that are never written with fout.write in convert_gptneox.py.

For example, self.final_layer_norm in the Hugging Face GPT-NeoX implementation uses eps=config.layer_norm_eps, but config.layer_norm_eps is never written with fout.write in convert_gptneox.py. How was self.final_layer_norm carried over into the C++ code?

Also, I guess the names of the configs that are written with fout.write are different in the C++ code: https://github.com/intel/intel-extension-for-transformers/blob/adb109bad9d4fd83586989991543f5cc2b655b9e/intel_extension_for_transformers/llm/runtime/graph/models/model_utils/model_types.h#L112-L138 I'm guessing the mapping below; is that correct?

vocab_size -> n_vocab
hidden_size -> n_layer
num_attention_heads -> n_head
...

What I'm curious about is how a particular config in the C++ code is matched to a particular config in Python, even though convert_gptneox.py only writes the values of the hparams and not their names.

airMeng commented 9 months ago

However, even if I run it in the intel_extension_for_transformers/llm/runtime/graph directory, which does contain a CMakeLists.txt, no .so files are produced (e.g. intel_extension_for_transformers/llm/runtime/graph/gptneox_cpp.cpython-310-x86_64-linux-gnu.so).

"build from source" means the bare metal solution in which you indeed should run under intel_extension_for_transformers/llm/runtime/graph and no full python package will be available. If you want to use the top API, you should try pip install -v . under the top directory.

Please explain in detail how new_model_mem_req is calculated.

Unfortunately this is an estimate; we will add a precise calculation soon.
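
For readers wondering what such an estimate stands in for: a generic back-of-envelope formula for GPT-style weight memory can be derived from the hyperparameters alone. The sketch below is not the calculation behind new_model_mem_req; estimate_weight_bytes is a hypothetical helper that only illustrates how the footprint scales with n_vocab, n_embd, and n_layer.

    # Back-of-envelope estimate of GPT-style weight memory; NOT the formula used
    # by new_model_mem_req, just an illustration of what such an estimate covers.
    def estimate_weight_bytes(n_vocab, n_embd, n_layer, bytes_per_weight=2.0):
        # Each transformer block carries roughly 12 * n_embd^2 weights
        # (QKV, attention output projection, and the 4x MLP up/down projections);
        # biases, layer norms, and runtime scratch buffers are ignored.
        per_layer = 12 * n_embd * n_embd
        embedding = n_vocab * n_embd
        return (n_layer * per_layer + embedding) * bytes_per_weight

    # GPT-NeoX-20B-like shapes at fp16 come out around 40 GB of weights.
    print(f"{estimate_weight_bytes(50432, 6144, 44) / 1e9:.1f} GB")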

park12sj commented 9 months ago

@airMeng

I didn't understand that very well. I want to create a .so file for our own model, e.g. intel_extension_for_transformers/llm/runtime/graph/{OUR_MODEL}_cpp.cpython-310-x86_64-linux-gnu.so

Are you saying that I should write the CMakeLists.txt accordingly and then run pip install based on my git branch?

Please answer question 3.

zhenwei-intel commented 9 months ago

Hi @park12sj,

However, even if I run it in the intel_extension_for_transformers/llm/runtime/graph directory, which does contain a CMakeLists.txt, no .so files are produced (e.g. intel_extension_for_transformers/llm/runtime/graph/gptneox_cpp.cpython-310-x86_64-linux-gnu.so).

If you want to compile the .so files, you need to turn on NE_PYTHON_API (this option is turned on by default when using pip install):

    cmake .. -DNE_PYTHON_API=ON

Please answer question 3.

The order in which parameters are written in the convert script must be consistent with the order in which read_hparams reads them. If there are new parameters (not in the current list), you need to extend the struct model_hparams.
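
To make the positional contract concrete, here is a minimal sketch (with made-up field names and values, not the actual layout of convert_gptneox.py or model_hparams): only the raw values go into the file, so the reader recovers them purely by consuming the same number of fields in the same order.

    import struct

    # Minimal sketch of the positional contract; the real convert script and
    # the model_hparams struct define their own fields and order.
    hparams = {"n_vocab": 50432, "n_embd": 6144, "n_head": 64, "n_layer": 44}

    with open("model.bin", "wb") as fout:
        # Only the values are written, as raw 4-byte integers, in a fixed order.
        for name in ("n_vocab", "n_embd", "n_head", "n_layer"):
            fout.write(struct.pack("i", hparams[name]))

    # Any reader (read_hparams on the C++ side, or this Python equivalent) must
    # consume the fields in exactly the same order to get the same values back.
    with open("model.bin", "rb") as fin:
        n_vocab, n_embd, n_head, n_layer = struct.unpack("4i", fin.read(16))
    print(n_vocab, n_embd, n_head, n_layer)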

park12sj commented 9 months ago

Hi, @zhenwei-intel

When executing pip install, a GNU compiler version error occurs as shown below. Is there a specific version I need?

      CMake Error in CMakeLists.txt:
        The compiler feature "cxx_std_17" is not known to CXX compiler

        "GNU"

        version 4.8.5.

airMeng commented 9 months ago

https://en.wikipedia.org/wiki/C%2B%2B17#Compiler_support According to Wikipedia, at least GCC 8.0 is needed for C++17.

park12sj commented 9 months ago

I'm using GCC version 11 and still get the same CMake error. That's weird... I'll look in a different direction.

[root@personal-kr21041-0]/workspace/storage/cephfs-personal/git/pai/opensource_in_paip/paip-itrex# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-11/root/usr/libexec/gcc/x86_64-redhat-linux/11/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/devtoolset-11/root/usr --mandir=/opt/rh/devtoolset-11/root/usr/share/man --infodir=/opt/rh/devtoolset-11/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --with-default-libstdcxx-abi=gcc4-compatible --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-11.2.1-20220127/obj-x86_64-redhat-linux/isl-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.2.1 20220127 (Red Hat 11.2.1-9) (GCC) 
  CMake Error in CMakeLists.txt:
    The compiler feature "cxx_std_17" is not known to CXX compiler

    "GNU"

    version 4.8.5.

airMeng commented 9 months ago

I'm using GCC version 11 and still get the same CMake error. That's weird... I'll look in a different direction.

Can you confirm the version of g++? Sometimes g++ and gcc are at different versions, and the error The compiler feature "cxx_std_17" is not known to CXX compiler also suggests the wrong version of g++ is being picked up.
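
If it helps narrow this down, here is a throwaway check (not part of the project, and it assumes gcc and g++ are both on PATH) that reports whether the two compilers are at different major versions:

    import re
    import subprocess

    # Throwaway check for a gcc/g++ version mismatch; it only reports the
    # versions, it does not change which compiler CMake picks up.
    def compiler_version(cmd):
        out = subprocess.run([cmd, "--version"], capture_output=True, text=True).stdout
        m = re.search(r"\d+\.\d+\.\d+", out.splitlines()[0])
        return m.group(0) if m else "unknown"

    gcc_ver, gxx_ver = compiler_version("gcc"), compiler_version("g++")
    print(f"gcc {gcc_ver}, g++ {gxx_ver}")
    if gcc_ver.split(".")[0] != gxx_ver.split(".")[0]:
        print("gcc and g++ major versions differ; CMake may be using the older g++")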

By the way, if you want to use the bf16-related features of the ITREX backend, it is best to upgrade gcc (and g++) to version 11 or above. If you want to use the fp16-related features, you must upgrade gcc (and g++) to version 13, as noted in https://github.com/intel/intel-extension-for-transformers/issues/726#issuecomment-1820037722