robbiebarrat / rapping-neural-network

Rap song writing recurrent neural network trained on Kanye West's entire discography

rhymeindex() and rhyme() questions #17

Closed myou11 closed 6 years ago

myou11 commented 6 years ago

As per @robbiebarrat's request, I am posting some questions I have about the Rapping Neural Network code in model.py so that others who may have the same questions in the future can see this.

Original Email: Hello Robbie,

I am analyzing your code for your Rapping Neural Network project and trying to figure out what each part of your program does. Currently, I am stuck on the rhymeindex() and rhyme() functions in model.py. Near the end of the rhymeindex() function, you reverse the 2-letter endings in rhyme_master_list and sort them, then reverse them. However, when I print the rhyme_master_list on line 75 and the reverselist on line 77, reverselist is not the reversed 2-letter endings from rhyme_master_list and I'm not sure why that is. Overall on this last part, would you be able to explain the reasoning behind having the reverselist and what you are meaning to do here?

For the rhyme() function, I am confused as to what the purpose of the float_rhyme is. I see it is being used in build_dataset() to construct the 2x2 input for the network. But how does the LSTM use this to help construct a rap?

If you have time to explain these to me, I would be very grateful.

robbiebarrat commented 6 years ago

Hey Maxwell - thanks for posting your email here. If other people have the same questions, I'd like for them to be able to see it as well.

The idea behind reversing and then sorting is pretty simple. Let's say our rhymes are

['ed', 'ag', 'it', 'ad']

Note that 'ed' and 'ad' are similar rhymes (words that end in 'ed', like 'batted', will often rhyme with words that end in 'ad', like 'ballad').

Just sorting those, we'd get:

['ad', 'ag', 'ed', 'it']

As you can see - the similar rhymes, 'ad' and 'ed', don't end up near each other. If the network messes up a little bit and doesn't return quite the right values, it will often pick the rhyme ending next to the expected one in the list - so its prediction is close but not spot on.

If we reverse the rhymes, sort them, and then reverse them back, we'd get

['da', 'de', 'ga', 'ti']

['ad', 'ed', 'ag', 'it']

This way, 'ed' and 'ad' are close together, so if the network messes up a little bit in its prediction, it's alright because as long as it's just close to the correct rhyme, it'll still pick words that are close rhymes.
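A minimal sketch of the reverse, sort, reverse-back idea described above. The function name `sort_by_reversed_ending` is made up for illustration and is not taken from model.py; it only reproduces the ordering on the four example endings.

```python
def sort_by_reversed_ending(endings):
    """Sort rhyme endings by their reversed spelling, then restore them."""
    # Reverse each ending so the last letter becomes the primary sort key.
    reversed_endings = [e[::-1] for e in endings]
    reversed_endings.sort()
    # Reverse each ending back to its normal spelling.
    return [e[::-1] for e in reversed_endings]


endings = ['ed', 'ag', 'it', 'ad']
print(sorted(endings))                   # ['ad', 'ag', 'ed', 'it']
print(sort_by_reversed_ending(endings))  # ['ad', 'ed', 'ag', 'it']
```

With the plain sort, 'ad' and 'ed' are separated by 'ag'; with the reversed sort they sit next to each other because both end in 'd'.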

As for your printing issues; are you sure you're printing the right things? Keep note that on line 80, reverselist becomes rhymelist.

The 2x2 input for the rap is an array consisting of syllables and rhyme scheme (the index in the rhymelist of the ending for that line). So if our list is ['om', 'ed', 'ng'], and our line is "scanning the room" - there are 4 syllables and it rhymes with 'om' - the array for that line would be [0.25, 0.333], where 0.333 is the index of 'om' in the rhymelist (1, counting from 1) divided by the length of the rhymelist (3), and 0.25 is the number of syllables (4) divided by maxsyllables (16 - defined at the top of the program).

The input for the network is an array containing this for two lines, so if we feed it "scanning the room", "im listening to MF doom", the array will be

[[0.25, 0.333], [0.5, 0.333]]
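The arithmetic in this example can be sketched as follows. `encode_line` is a hypothetical helper, not a function from model.py, and the syllable counts are passed in by hand (8 for the second line is implied by 0.5 x 16). It assumes a 1-based position in the rhymelist so that the numbers match the example; Python's own `list.index` is zero-based and would give 0.0 for 'om'.

```python
MAX_SYLLABLES = 16  # the value the thread gives for maxsyllables


def encode_line(syllables, ending, rhyme_list, max_syllables=MAX_SYLLABLES):
    """Return [syllable_fraction, rhyme_fraction] for one line."""
    # Assumption: 1-based position, to match "(1) divided by (3)" in the example.
    position = rhyme_list.index(ending) + 1
    return [syllables / float(max_syllables),
            position / float(len(rhyme_list))]


rhyme_list = ['om', 'ed', 'ng']
pair = [
    encode_line(4, 'om', rhyme_list),  # "scanning the room"
    encode_line(8, 'om', rhyme_list),  # "im listening to MF doom"
]
print(pair)  # roughly [[0.25, 0.333], [0.5, 0.333]]
```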

The network just predicts a series of these arrays, and then maps them to the most similar lines generated by the markov chain (which only does word frequencies and builds a sort of corpus of generated individual lines for the network to order).
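One way to picture the "most similar line" step is a nearest-neighbour lookup. The sketch below is only an illustration: the thread does not say how similarity is scored, so squared Euclidean distance, the function name `closest_line`, and the predicted values are all assumptions.

```python
def closest_line(target, candidates):
    """Pick the candidate whose (syllables, rhyme) pair is nearest to target.

    candidates is a list of (text, [syllable_fraction, rhyme_fraction]) tuples.
    """
    def squared_distance(features):
        # Assumed similarity measure: smaller distance means more similar.
        return sum((a - b) ** 2 for a, b in zip(target, features))

    return min(candidates, key=lambda c: squared_distance(c[1]))[0]


candidates = [
    ("scanning the room", [0.25, 0.333]),
    ("im listening to MF doom", [0.5, 0.333]),
]
predicted = [0.45, 0.30]  # made-up network output for illustration
print(closest_line(predicted, candidates))  # im listening to MF doom
```

A prediction that is slightly off in either value still lands on a line with a similar length and a nearby rhyme ending, which is why the ordering of the rhymelist matters.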

Let me know if I missed anything; but I hope that starts to answer your questions.

myou11 commented 6 years ago

Thanks for the explanation Robbie! I understand those parts a lot better now.

I have also looked through the rest of the code and am piecing together the big picture. Just to make sure: in train(), you use the (syllables, rhyme scheme) pairs to train the network. Then, once the network is trained, in compose_rap(), you have it predict some (syllables, rhyme scheme) pairs (which are also the number of lines that will be in the final rap (x2)). Each of these predictions is then scored against each of the lines in the generated lyrics to find the line that best matches the (syllables, rhyme scheme) predicted by the model. This is the line that is used in the final rap, which is also printed to the console.

There's a lot more going on besides that, but I think that is the basic premise?

robbiebarrat commented 6 years ago

Yeah you're pretty much right - I know it could be a lot better, but it's just sort of the system that I came up with while writing it.

Also by the way; are you doing a project or something related to this repo?

myou11 commented 6 years ago

Yeah, I am doing a project with another team member and we are trying to learn about RNNs and LSTMs and wanted to find something that wasn't as mainstream as handwriting recognition. It's pretty much an analysis of your code and how everything works together, and we are also using Jay-Z lyrics instead. We've also given credit to your repo as we are just doing an analysis.

robbiebarrat commented 6 years ago

Oh nice - yeah feel free to ask again if you have any questions, and let me know if the Jay-Z lyrics turn out well!

If you come up with anything that could be used as documentation for the repo, let me know too, because ATM it's not very well documented.

myou11 commented 6 years ago

The Jay Z lyrics turned out pretty well and thanks again for the help! It's funny seeing what the program generates.

As for the documentation, we wrote comments for all of the functions in your code and I can send you those or push up a PR if you want! I'm not sure how correct we were in some of our descriptions of your functions though. So I don't want to add incorrect information without you reviewing it first.

Let me know what you want to do though!

robbiebarrat commented 6 years ago

If you could just gmail it to me, that'd be super nice. I'll make the needed corrections (if any) and add them to the code (don't worry - I'll give you the appropriate credit in the code and also in the readme)

IMO it's just a lot easier than doing the whole "review/correct pull-request" thing; but if you really want to do a PR as opposed to gmail just do that.

myou11 commented 6 years ago

No problem, I think that would be a lot easier as well!

bensdm commented 6 years ago

I am actually interested in your documentation too. Is it possible to have it? Thanks a lot.

robbiebarrat commented 6 years ago

@bensdm i'll put a link to it in the readme - give me a sec

robbiebarrat commented 6 years ago

@bensdm look right below the "usage" section in the readme - documented_model.py links to what @myou11 sent me.