instructkr / LogicKor

A multi-domain reasoning benchmark for Korean language models

[Refactor] generator.py to improve GPU memory utilization #10

Closed sigridjineth closed 7 months ago

sigridjineth commented 7 months ago

Changes

  1. Added new command-line arguments:

    • --batch_size: Sets the batch size used when processing data.
    • --num_workers: Sets the number of worker processes used for data loading.
  2. Utilized PyTorch's DataLoader and Dataset classes:

    • [x] Implemented a custom QuestionDataset class to wrap the df_questions DataFrame and provide an interface for accessing individual samples.
    • [x] Used DataLoader to efficiently load and batch the data, enabling parallel data loading with multiple worker processes.
    • [x] Used ThreadPoolExecutor to process batches of data in parallel, leveraging multiple CPU cores.
    • [x] Increased num_workers in the DataLoader to enable multi-process data loading and overlap data loading with GPU computation.
    • [x] Added prefetch_factor=2 to the DataLoader so each worker prefetches batches into host memory ahead of time, keeping the GPU fed during computation.
    • [x] Set pin_memory=True in the DataLoader to use pinned memory for faster data transfer between CPU and GPU.
    • [x] Separated the data processing logic into a process_batch function for better modularity and readability.
    • [x] Introduced a collate_fn to handle batching of data in the DataLoader.
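The two new flags from item 1 can be declared with `argparse`; this is a minimal sketch, and the default values below are illustrative assumptions, not values taken from the actual generator.py.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Flags mirror the PR description; defaults here are assumed.
    parser = argparse.ArgumentParser(description="LogicKor answer generator")
    parser.add_argument("--batch_size", type=int, default=16,
                        help="Batch size for processing data")
    parser.add_argument("--num_workers", type=int, default=4,
                        help="Number of worker processes for data loading")
    return parser

# Parse the same values as the example run below.
args = build_parser().parse_args(["--batch_size", "512", "--num_workers", "128"])
print(args.batch_size, args.num_workers)  # → 512 128
```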
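A minimal sketch of the `QuestionDataset`/`DataLoader`/`collate_fn` wiring described above. A plain list stands in for the `df_questions` DataFrame, and the `collate_fn` keeps each batch as a list of dicts rather than tensorizing string prompts; the real script's field names and batch shapes may differ.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class QuestionDataset(Dataset):
    """Wraps a sequence of question records (df_questions in the PR)."""
    def __init__(self, questions):
        self.questions = list(questions)

    def __len__(self):
        return len(self.questions)

    def __getitem__(self, idx):
        # In the real script this would index df_questions.iloc[idx].
        return {"id": idx, "question": self.questions[idx]}

def collate_fn(batch):
    # Keep batches as plain lists of dicts; prompts are strings, not tensors.
    return batch

dataset = QuestionDataset([f"question {i}" for i in range(10)])
loader = DataLoader(
    dataset,
    batch_size=4,
    num_workers=0,  # the real run sets --num_workers; 0 keeps this demo portable
    collate_fn=collate_fn,
    pin_memory=torch.cuda.is_available(),  # pinned host memory speeds CPU→GPU copies
    # prefetch_factor=2,  # only valid when num_workers > 0
)

batches = list(loader)
print(len(batches), len(batches[0]))  # → 3 4
```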

Simple Benchmark

Example Run

```shell
python generator_2.py --gpu_devices 1 --model maywell/TinyWand-kiqu --template ./templates/template-EEVE.json --model_len 2048 --batch_size 512 --num_workers 128
```
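The `process_batch`/`ThreadPoolExecutor` pattern from the change list can be sketched with the standard library alone. `process_batch` here is a hypothetical stand-in: the PR's real function would build prompts from the batch and run model generation.

```python
from concurrent.futures import ThreadPoolExecutor

def process_batch(batch):
    # Stand-in for the PR's process_batch: the real function would
    # run generation on the GPU for the prompts in this batch.
    return [q.upper() for q in batch]

# Batches as they might come out of the DataLoader.
batches = [["q1", "q2"], ["q3", "q4"], ["q5"]]

# Threads overlap work across batches; executor.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(process_batch, batches))

print(results)  # → [['Q1', 'Q2'], ['Q3', 'Q4'], ['Q5']]
```

Note that Python threads help most when `process_batch` is I/O-bound or releases the GIL (as GPU inference calls typically do); for pure-Python CPU work a process pool would be the better fit.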