tensorflow / models

Models and examples built with TensorFlow

How to reduce CPU usage? #11064

Open VlaTal1 opened 1 year ago

VlaTal1 commented 1 year ago

I use this code to load the model:

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model = 'WizardLM/WizardCoder-15B-V1.0'

def load_model(model=model):
    # Load the tokenizer and the 8-bit quantized model
    # (device_map is assumed to be defined elsewhere, e.g. "auto")
    tokenizer = AutoTokenizer.from_pretrained(model)
    model = AutoModelForCausalLM.from_pretrained(model, device_map=device_map, load_in_8bit=True)
    return tokenizer, model

tokenizer, model = load_model(model)
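
For reference, a minimal check (assuming PyTorch is available and the model loaded as above) to confirm CUDA is visible and where the weights actually ended up:

import torch

# Sanity check: is CUDA visible, and on which devices do the loaded weights live?
print(torch.cuda.is_available())                  # should print True
print({p.device for p in model.parameters()})     # e.g. {device(type='cuda', index=0)}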

And this code to generate:

generation_config = GenerationConfig(
    temperature=0.0,
    top_p=0.95,
    top_k=50,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)

prompt_template = f'''
Below is an instruction that describes a task. Write a response that appropriately completes the request

### Instruction: {prompt}

### Response:'''

inputs = tokenizer(prompt_template, return_tensors="pt").to("cuda")
generated_ids = model.generate(**inputs, generation_config=generation_config, max_new_tokens=6000)
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)

This model fits entirely into my GPU, but for some reason the GPU is not even used (it does not heat up while generating), while processor usage is at 100%. What is wrong with my code, or is the problem in the model?
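
One thing worth checking (an assumption, not something confirmed here): with load_in_8bit and a device_map, accelerate may offload some layers to the CPU, which would explain the 100% processor usage. A minimal sketch to inspect the placement via the hf_device_map attribute that accelerate attaches to the model:

# If any entry maps to "cpu" or "disk", those layers run on the CPU
# and generation will peg the processor instead of the GPU.
placement = getattr(model, "hf_device_map", None)
print(placement)

if placement is not None:
    offloaded = {name: dev for name, dev in placement.items() if dev in ("cpu", "disk")}
    print("Layers not on the GPU:", offloaded or "none")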

eashatirrazia commented 1 year ago

I am facing the same issue with a Faster R-CNN model: the system monitor shows 100% usage on only one of the eight CPU cores, while nvidia-smi shows GPU utilization of only about 0% (a single digit). Also, the process is killed after the shuffle buffer is filled, because all 24 GB of memory get used.
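
Not from the original comment, but a minimal sketch that may help narrow this down, assuming a tf.data input pipeline like the one the Object Detection API builds: it checks whether TensorFlow sees the GPU at all and uses a smaller shuffle buffer so the pipeline does not exhaust RAM (the file name and buffer size are illustrative):

import tensorflow as tf

# An empty list here means TensorFlow is running CPU-only.
print(tf.config.list_physical_devices('GPU'))

# Illustrative tf.data pipeline: a smaller shuffle buffer keeps fewer records in RAM.
# In the Object Detection API the corresponding option is typically shuffle_buffer_size
# in the train_input_reader section of the pipeline config.
dataset = tf.data.TFRecordDataset(['train.record'])   # hypothetical record file
dataset = dataset.shuffle(buffer_size=256)            # much smaller than the default
dataset = dataset.batch(8).prefetch(tf.data.AUTOTUNE)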