intel-analytics / ipex-llm-tutorial

Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
https://github.com/intel-analytics/bigdl
Apache License 2.0

Some issues in the tutorial (link changed, datasets deprecated, etc.) #55

Closed: Mingyu-Wei closed this issue 10 months ago

Mingyu-Wei commented 11 months ago
  1. BigDL-LLM package installation: I notice that in some chapters the suggested installation command is:

        pip install --pre --upgrade bigdl-llm[all]

    However, in other files I see:

        pip install bigdl-llm[all]

    Would it make sense to unify this command?

  2. Outdated links: the links at the end of Chapter 1 are all outdated.

[screenshot of the tutorial paragraph, quoted below]

> We have already verified many models on BigDL-LLM and provided ready-to-run examples, such as [Llama](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/native_int4),
> [Llama2](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/llama2),
> [Vicuna](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/vicuna),
> [ChatGLM](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/chatglm),
> [ChatGLM2](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/chatglm2),
> [Baichuan](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/baichuan),
> [MOSS](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/moss),
> [Falcon](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/falcon),
> [Dolly-v1](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/dolly_v1),
> [Dolly-v2](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/dolly_v2),
> StarCoder([link1](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/native_int4),
> [link2](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/starcoder)),
> Phoenix([link1](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/native_int4),
> [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/phoenix)),
> RedPajama([link1](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/native_int4),
> [link2](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/redpajama)),
> [Whisper](https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/transformers_int4/whisper), etc. You can find model examples [here](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4).

All the links above have changed, so they now lead to 404 Page Not Found. For example, the current address of Llama2 in the tutorial is https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/native_int4, but the folder structure has been updated and it should now be https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/llama2.

  3. `load_in_low_bit` options outdated: in Chapters 5 and 6, the `load_in_low_bit` options are listed as:

    [screenshot of the outdated option list]

    The latest version of bigdl-llm (2.4.0) supports `sym_int4`, `asym_int4`, `sym_int5`, `asym_int5`, `sym_int8`, `nf3`, `nf4`, `fp4`, `fp8`, `fp16` or `mixed_4bit` on CPU. For GPU in Chapter 6, the supported options are `sym_int4`, `asym_int4`, `sym_int5`, `asym_int5`, `sym_int8`, `nf3`, `nf4`, `fp4`, `fp8`, `fp16`, `mixed_fp4` or `mixed_fp8` (see the sketch below).
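For reference, here is a minimal sketch of how the `load_in_low_bit` argument is passed with the bigdl-llm transformers-style API (the model id below is only a placeholder):

```python
from bigdl.llm.transformers import AutoModelForCausalLM

# CPU: any of sym_int4 / asym_int4 / sym_int5 / asym_int5 / sym_int8 /
# nf3 / nf4 / fp4 / fp8 / fp16 / mixed_4bit
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",  # placeholder model id
    load_in_low_bit="sym_int4",
    trust_remote_code=True,
)

# On GPU (Chapter 6) the model is additionally moved to the Intel GPU,
# and mixed_fp4 / mixed_fp8 replace mixed_4bit in the supported list:
# model = model.to("xpu")
```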

  4. Sample audio files in Chapter 5.2 deprecated:

    [screenshot of the affected section]

    The common_voice dataset is deprecated and, according to its Hugging Face page, will be deleted soon. As for the audio files (audio_en.mp3 / audio_zh.mp3) downloaded via the wget command, they have already been removed from Hugging Face, and using them leads to an EOF error when running the sample code in this section.

  5. GPU acceleration environment setup:

    [screenshot of the environment setup instructions]

    In this image, the command `source /opt/intel/oneapi/setvars.sh` is listed as a recommendation for Intel GPU acceleration. However, based on my own knowledge and experience, this command is mandatory and must be run whenever a new terminal session is created; otherwise we might encounter `OSError: libmkl_intel_lp64.so.2: cannot open shared object file: No such file or directory`. This is not exactly an error in the tutorial, but I believe it would be better to highlight this command, either in README.md or in 6_1_GPU_Llama2-7B.md.
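As an illustration, here is a minimal (hypothetical, not part of the tutorial) sanity check that could sit at the top of the GPU examples to fail fast with a clearer message:

```python
import os

# Hypothetical check: after `source /opt/intel/oneapi/setvars.sh`,
# LD_LIBRARY_PATH contains paths under /opt/intel/oneapi, so its absence
# is a good hint that the environment was not sourced in this terminal.
if "oneapi" not in os.environ.get("LD_LIBRARY_PATH", ""):
    raise RuntimeError(
        "oneAPI environment not detected; run "
        "'source /opt/intel/oneapi/setvars.sh' in this terminal first."
    )
```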

Ariadne330 commented 11 months ago

I fixed these issues in this #PR; here are some details:

  1. Replaced all the `pip install bigdl-llm[all]` commands with `pip install --pre --upgrade bigdl-llm[all]`.
  2. Updated all the links in Chapter 1 with valid HF model links.
  3. Updated the `load_in_low_bit` options; for example, for CPU:
    Currently, `load_in_low_bit` supports the options `'sym_int4'`, `'asym_int4'`, `'sym_int5'`, `'asym_int5'` and `'sym_int8'`, in which 'sym' and 'asym' differentiate between symmetric and asymmetric quantization. `'nf3'` and `'nf4'` stand for NormalFloat quantization. The floating-point precisions `'fp4'`, `'fp8'` and `'fp16'`, and the option `'mixed_4bit'`, are also supported.
  4. After checking the latest common_voice dataset and failing to find license-free Chinese and English audio files, I searched Hugging Face for two replacement audio files: an English example from the audio dataset librispeech_asr_dummy and a Chinese example from the Chinese audio dataset AIShell (see the loading sketch after this list). I updated the experiment results accordingly and reran all the cells in this chapter.
  5. Added the GPU acceleration environment setup to both Chapter 6 and Chapter 7 (I think this note is necessary for those not familiar with bigdl-llm, so I added it wherever GPU acceleration is used).
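For reference, a minimal sketch of loading the English replacement sample with the Hugging Face `datasets` library, assuming the commonly used `hf-internal-testing/librispeech_asr_dummy` mirror (the AIShell sample would be loaded analogously):

```python
from datasets import load_dataset

# Small LibriSpeech dummy split; each row carries a decoded audio dict
ds = load_dataset(
    "hf-internal-testing/librispeech_asr_dummy", "clean", split="validation"
)
audio = ds[0]["audio"]                  # {"array": ..., "sampling_rate": ...}
waveform = audio["array"]               # 1-D float array for the Whisper processor
sampling_rate = audio["sampling_rate"]  # 16000 Hz for this dataset
```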