facebookresearch / spiritlm

Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".

I can't generate audio #6

Open shingo-vokov opened 6 days ago

shingo-vokov commented 6 days ago

I tried to use it:

outputs = spirit_lm.generate(
    interleaved_inputs=[('text', "I am so deeply saddened, it feels as if my heart is shattering into a million pieces and I can't hold back the tears that are streaming down my face.")],
    output_modality='speech',
    generation_config=GenerationConfig(
        temperature=0.8,
        top_p=0.95,
        max_new_tokens=200,
        do_sample=True,
    ),
    speaker_id=1,
)
display_outputs(outputs)

but I get this warning:

/home/.conda/envs/spiritlm/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:579: UserWarning: pad_token_id should be positive but got -1. This will cause errors when batch generating, if there is padding. Please set pad_token_id explicitly as model.generation_config.pad_token_id=PAD_TOKEN_ID to avoid errors in generation
  warnings.warn(

How do I get PAD_TOKEN_ID?
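
A minimal sketch of what the warning asks for, assuming it is acceptable to fall back to the end-of-sequence id when the pad id is missing or negative. This is a common workaround for single-prompt generation, not something confirmed for Spirit LM in this thread. The helper names `resolve_pad_token_id` and `build_generation_config` are hypothetical; `GenerationConfig` itself does accept a `pad_token_id` argument.

```python
from typing import Optional


def resolve_pad_token_id(pad_token_id: Optional[int], eos_token_id: int) -> int:
    """Keep a valid (non-negative) pad id, otherwise fall back to the EOS id."""
    if pad_token_id is not None and pad_token_id >= 0:
        return pad_token_id
    # Assumption: reusing EOS as padding is fine when no batch padding is needed.
    return eos_token_id


def build_generation_config(pad_token_id: Optional[int], eos_token_id: int):
    """Same sampling settings as in the post, plus an explicit pad id."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import GenerationConfig

    return GenerationConfig(
        temperature=0.8,
        top_p=0.95,
        max_new_tokens=200,
        do_sample=True,
        pad_token_id=resolve_pad_token_id(pad_token_id, eos_token_id),
    )
```

The two ids would normally be read from the tokenizer (`tokenizer.pad_token_id`, `tokenizer.eos_token_id`); how Spirit LM exposes its tokenizer is not shown in this thread.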

hitchhicker commented 6 days ago

Could you share your transformers version and how you set up the conda environment? Thanks! I don't see this error on my side.
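
One way to collect the requested version information is sketched below using only the standard library. The function name `report_versions` and the default package list (`transformers`, `torch`) are assumptions for illustration; adjust the list to whatever the environment actually uses.

```python
import platform
from importlib import metadata


def report_versions(packages=("transformers", "torch")):
    """Return a dict of the Python version and installed package versions."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the active environment.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
```

Running this inside the activated conda environment and pasting the output into the issue should make the setup easier to compare.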