Hello @senceryucel,
We will soon introduce a new parameter in the user_config.yaml file that allows you to specify the dataset to be used for quantization. Currently, the full training set is used by default to quantize the model. However, by selecting a smaller representative dataset, you can significantly reduce the time required for quantization.
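In the meantime, a similar effect can be obtained at the TensorFlow Lite level, since post-training quantization only needs a small representative (calibration) set rather than the full training data. Below is a minimal, illustrative sketch; `model` and `x_train` are assumed placeholders for a trained Keras model and a NumPy training array, not the model zoo's actual API:

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # A few hundred calibration samples are usually enough; feeding the
    # full training set here is what makes quantization slow.
    for sample in x_train[:200]:  # x_train: assumed NumPy training array
        yield [np.expand_dims(sample, axis=0).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)  # model: trained Keras model
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_quant_model = converter.convert()
```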
Hello,
I appreciate your work; it works amazingly well. I'm facing an issue that I'd like to ask about.
I can train my model on my GPU quickly and without any problem (with my configuration, an epoch takes approximately 20 seconds). However, the quantization process takes extremely long (more than 20 minutes), and the subsequent evaluation of the quantized model takes even longer (more than 30 minutes). So for a 20-epoch run, the training phase takes approximately 4 minutes while the other steps take almost an hour in total.
Here are the configs I use:
I have 2 GPUs. GPU_0 is used for training, but its memory is not freed after training finishes. Here is the GPU usage while the model is being quantized:
Here, GPU_0's memory usage is the same as during the training phase, and GPU_1 is not being used by the script at all.
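(For anyone hitting the same behavior: the usual TensorFlow knobs for this are enabling memory growth so GPU_0 is not fully reserved up front, and clearing the Keras session between training and quantization. A minimal, illustrative sketch assuming TensorFlow 2.x, not the model zoo's own code:)

```python
import gc
import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all up front.
# This must run before any GPU is initialized.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)

# ... training happens here ...

# Drop the Keras graph state held from training before quantization starts.
# Note that TensorFlow keeps its memory pool within the same process, so
# running quantization in a separate process (or pinning it to GPU_1 via
# CUDA_VISIBLE_DEVICES) is often the more reliable way to free GPU_0.
tf.keras.backend.clear_session()
gc.collect()
```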
What can I do to reduce this quantization time? As far as I know, it should take at most 6-7 minutes.
Thanks a lot.