taokz / BiomedGPT

BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks
Apache License 2.0

Quantize Pre-Trained Model Using QLoRA or LoRA (PEFT Technique) #4

Open deep-matter opened 12 months ago

deep-matter commented 12 months ago

I would like to ask how I can use QLoRA, or parameter-efficient fine-tuning (PEFT) in general, with a model that is not registered on Hugging Face but is instead based on OFA.

I am trying to quantize the Tiny version, but I don't know how I should apply LoRA for parameter-efficient fine-tuning.

I thought I could reconstruct the BiomedGPT_Tiny model from unify_transformer.py, following ofa.py, point the config parameters at BiomedGPT_Tiny in a separate file, and then apply quantization techniques. The problem is that the pre-trained tokenizer is not available, I think.
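For context, the core of what LoRA adds to any model (Hugging Face-registered or not) is small: each frozen weight matrix W gets a trainable low-rank update scaled by alpha/r, and at inference the update can be merged back into W. Below is a minimal, dependency-free sketch of that merge rule; the shapes and values are toy examples, not BiomedGPT's actual layers, and this is an illustration of the math rather than the project's implementation.

```python
# Minimal sketch of the LoRA update rule W' = W + (alpha / r) * (B @ A).
# W is the frozen base weight; A (r x in_features) and B (out_features x r)
# are the low-rank adapters. All shapes here are toy examples.

def matmul(a, b):
    """Multiply two matrices represented as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_merge(W, A, B, alpha, r):
    """Return the merged weight W + (alpha / r) * (B @ A)."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy 2x2 frozen weight with rank-1 adapters.
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 1.0]]          # r x in_features, r = 1
B = [[0.0],               # out_features x r; B is zero-initialised,
     [0.0]]               # so the merged weight equals W before training.

merged = lora_merge(W, A, B, alpha=2, r=1)
assert merged == W  # zero-initialised B leaves the base weights unchanged
```

In practice, libraries such as Hugging Face's `peft` do this injection for you on any `torch.nn.Module` by matching `target_modules` names against the model's linear layers, so OFA-style models are not excluded in principle; the tokenizer question is a separate issue.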

evolu8 commented 12 months ago

I'd second this. Would be wonderful to have instructions on this.