openai / jukebox

Code for the paper "Jukebox: A Generative Model for Music"
https://openai.com/blog/jukebox/

How do we edit metas for genre/artist fusions #17

Open Cortexelus opened 4 years ago

Cortexelus commented 4 years ago

To make genre and artist fusions, such as 50% pop / 50% jazz or 50% Sinatra / 50% Fitzgerald, how should we edit metas in sample.py? Can you show us some examples?

heewooj commented 4 years ago

Yeah, we didn't support this feature, because it is a little messy. The style embedding is given by y_cond here. To interpolate them, you could add these lines and pass appropriate values of another_y and alpha when get_cond is called:

+def get_cond(self, z_conds, y, another_y=None, alpha=0.5):
     ...
     y_cond, y_pos = self.y_emb(y) if self.y_cond else (None, None)
+    if self.y_cond and another_y is not None:
+        assert y.shape[0] == another_y.shape[0], "Label batch size is different."
+        n_labels = another_y.shape[1] - self.n_tokens
+        another_y = another_y[:, :n_labels]
+        another_y_cond, _ = self.y_emb(another_y)
+        y_cond = y_cond * alpha + another_y_cond * (1.0 - alpha)
     x_cond = self.x_emb(z_conds) if self.x_cond else y_pos
     return x_cond, y_cond, prime

To construct another_y:

  1. Choose other_metas like this (it doesn't matter what lyrics you choose here, because they are not used).
  2. other_metas -> other_labels
  3. other_labels -> another_y (start is used to approximately locate the lyric window, and can be set to a dummy value like 0 for the purpose of mixing styles)

Cortexelus commented 4 years ago

Thanks! Would spherical interpolation be a better choice?

heewooj commented 4 years ago

Yeah, it's possible. You're welcome to try other variations.
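
For anyone who wants to try this, here is a minimal sketch of spherical interpolation (slerp) between two vectors. It is not part of Jukebox: the function name slerp and the eps parameter are made up for illustration, and it works on plain Python lists so that it runs on its own. Applying it to y_cond and another_y_cond would mean porting the same arithmetic to tensors. The alpha convention matches the get_cond patch earlier in this thread (alpha=1.0 gives the first style).

```python
import math


def slerp(v0, v1, alpha, eps=1e-6):
    """Spherically interpolate between two non-zero vectors.

    alpha=1.0 returns v0 and alpha=0.0 returns v1, mirroring
    y_cond * alpha + another_y_cond * (1.0 - alpha) in the linear version.
    """
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    # Cosine of the angle between the vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel or anti-parallel: fall back to linear mixing.
        return [alpha * a + (1.0 - alpha) * b for a, b in zip(v0, v1)]
    w0 = math.sin(alpha * omega) / sin_omega
    w1 = math.sin((1.0 - alpha) * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


halfway = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike the linear mix, the halfway point between two orthogonal unit vectors keeps unit length here. Whether that actually sounds better for style embeddings is something you would have to test by listening.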

songeater commented 3 years ago

Hi, I'm trying to implement this and got as far as another_y_cond, _ = self.y_emb(another_y), when I got the following error. I can't really diagnose what's happening here @heewooj. Any thoughts? Thanks in advance!

/usr/local/lib/python3.7/dist-packages/jukebox/prior/conditioners.py in forward(self, pos_start, pos_end)
     89         # Check if [pos_start,pos_end] in [pos_min, pos_max)
     90         assert len(pos_start.shape) == 2, f"Expected shape with 2 dims, got {pos_start.shape}"
---> 91         assert (self.pos_min <= pos_start).all() and (pos_start < self.pos_max).all(), f"Range is [{self.pos_min},{self.pos_max}), got {pos_start}"
     92         pos_start = pos_start.float()
     93         if pos_end is not None:

AssertionError: Range is [1049580.0,26460000.0), got tensor([[1048576.], [1048576.], [1048576.]], device='cuda:0')

songeater commented 3 years ago

OK, I think I have a solution. y_emb() calls the forward() function of the underlying LabelConditioner class, which also computes the positional (time signal) embedding. We don't need that for another_y, so it can be switched off just for that call:

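# Remember the current setting, then disable the time signal for this call only
y_emb_time_signal = self.y_emb.include_time_signal
self.y_emb.include_time_signal = False
another_y_cond, _ = self.y_emb(another_y)
# Restore the original setting so the main y_emb(y) path is unaffected
self.y_emb.include_time_signal = y_emb_time_signal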
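
One possible hardening of the save/disable/restore pattern above: if the y_emb call raises, the flag stays switched off. A small context manager restores it either way. This is only a sketch. temporarily_set and DummyEmb are hypothetical names, not Jukebox code, and DummyEmb stands in for self.y_emb so the example runs without Jukebox installed.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_set(obj, name, value):
    """Set obj.<name> to value inside the with-block, then restore it."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Restore the original value even if the body raised.
        setattr(obj, name, original)


class DummyEmb:
    """Stand-in for self.y_emb (assumption: only the flag matters here)."""
    include_time_signal = True


emb = DummyEmb()
with temporarily_set(emb, "include_time_signal", False):
    inside = emb.include_time_signal  # flag is off within the block
after = emb.include_time_signal       # flag is back to its original value
```

Inside get_cond, the equivalent would be to wrap the another_y_cond, _ = self.y_emb(another_y) line in with temporarily_set(self.y_emb, "include_time_signal", False):.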