What is the "reference" element produced by the generate step during training?

sakemin commented 11 months ago

Hello, I'm currently fine-tuning MusicGen with the dora run command. I set generate.every = 1 so that a generation step runs every epoch.

Each generation step outputs one JSON file and one WAV file.

The JSON file contains a "reference" element, like the one below.

"reference": {
    "id": "3ce288e9d7c8e658c9004067ac98f1de970d7dd9",
    "path": "/mnt/nvme/tmp/audiocraft_sake/xps/454f0a30/samples/reference/3ce288e9d7c8e658c9004067ac98f1de970d7dd9.wav",
    "duration": 30.0
},

The output WAV file seems to begin similarly to the reference WAV file, but the reference does not appear to be used as a prompt: I set generate.lm.prompted_samples = False, and the JSON file shows "prompt": null. What is the idea behind this 'reference', and what does it do?

Thank you

adefossez commented 11 months ago

I think it is the WAV file in the dataset that matches the description used to generate the sample, even when no audio prompt is given.
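
To check this on your own samples, here is a minimal sketch (not part of audiocraft) that reads a sample's JSON metadata and reports whether an audio prompt was used and which reference file it points to. It assumes the JSON has top-level "prompt" and "reference" keys as quoted in the question; the helper name `summarize_sample` and the embedded `SAMPLE_JSON` string are hypothetical, and the real file may contain other fields.

```python
import json

# Hypothetical metadata excerpt, built from the fields quoted in the question.
# A real sample JSON written during the generate step may contain more keys.
SAMPLE_JSON = """
{
  "prompt": null,
  "reference": {
    "id": "3ce288e9d7c8e658c9004067ac98f1de970d7dd9",
    "path": "/mnt/nvme/tmp/audiocraft_sake/xps/454f0a30/samples/reference/3ce288e9d7c8e658c9004067ac98f1de970d7dd9.wav",
    "duration": 30.0
  }
}
"""


def summarize_sample(meta: dict) -> dict:
    """Summarize whether a sample was prompted and which reference it lists."""
    # "reference" may be missing or null, so fall back to an empty dict.
    reference = meta.get("reference") or {}
    return {
        # "prompt": null in the JSON means no audio prompt was used.
        "prompted": meta.get("prompt") is not None,
        "reference_path": reference.get("path"),
        "reference_duration": reference.get("duration"),
    }


if __name__ == "__main__":
    info = summarize_sample(json.loads(SAMPLE_JSON))
    print(info)
```

If "prompted" is False while "reference_path" is set, the sample was generated without an audio prompt, and the reference is only the dataset item recorded alongside it.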

jbmaxwell commented 11 months ago

Ah, interesting. That makes sense; I suppose they have to get the descriptions for generation from somewhere, and this is the easiest way to do that.

facebookresearch / audiocraft, issue #272