Open zeon256 opened 1 year ago
Hello,
It is difficult without seeing the Python code side by side for comparison, but could it be that the Python model is loaded in half precision (fp16)?
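To make the arithmetic behind that guess concrete, here is a rough sketch of how weight storage scales with precision. The parameter count is a hypothetical placeholder, not the size of the model in this issue, and `Precision` and `weight_bytes` are names made up for this example. Note that the figure reported by nvidia-smi also includes the CUDA context and activations, so this only estimates the weights themselves.

```rust
/// Floating-point precision used to store model weights.
#[derive(Clone, Copy)]
enum Precision {
    F32,
    F16,
}

impl Precision {
    /// Number of bytes needed to store one parameter.
    fn bytes_per_param(self) -> u64 {
        match self {
            Precision::F32 => 4,
            Precision::F16 => 2,
        }
    }
}

/// Estimated bytes needed for the weights alone
/// (no activations, no CUDA context, no allocator overhead).
fn weight_bytes(num_params: u64, precision: Precision) -> u64 {
    num_params * precision.bytes_per_param()
}

fn main() {
    // Hypothetical parameter count, chosen only for illustration.
    let num_params: u64 = 300_000_000;

    let fp32 = weight_bytes(num_params, Precision::F32);
    let fp16 = weight_bytes(num_params, Precision::F16);

    println!("fp32 weights: {} MiB", fp32 / (1024 * 1024));
    println!("fp16 weights: {} MiB", fp16 / (1024 * 1024));
    println!("ratio fp32/fp16: {}", fp32 / fp16);
}
```

If the Python side stores weights in fp16 and the Rust side in fp32, the weights alone would account for a 2x difference, which would be consistent with the "1/2" observation.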
Hello,
Apologies for not being clear. This is the reference implementation I use. It calls into the Instructor library provided by the authors of Instructor.
Hello, I wrote a project that uses rust-bert. However, I noticed that loading the same model in Python uses about half the GPU memory that my Rust implementation uses, even when I load the model only once. Any idea how to fix this? Any help would be appreciated. Thanks!
Extract from nvidia-smi (Python)
Extract from nvidia-smi (Rust)