Closed loisaidasam closed 7 years ago
Thanks for the PR. What would be good about this change? I think using delimiters is more common. We can then see what happens when the network meets them. Also, the code is written for that.
On 6 Feb 2017, at 20:01, Loisaida Sam notifications@github.com wrote:
instead of using __START__ / __END__ delimiters
You can view, comment on, or merge this pull request online at:
https://github.com/keunwoochoi/lstm_real_book/pull/1
Commit Summary
Chord sentences, one per line
File Changes
M chord_sentences.txt (2848)
Patch Links:
https://github.com/keunwoochoi/lstm_real_book/pull/1.patch
https://github.com/keunwoochoi/lstm_real_book/pull/1.diff
Hi Keunwoo,
First off, this is amazing and quite inspiring. Thank you for your hard work.
My initial thoughts were that it'd be helpful for (human) readability, but I see what you mean, that the RNNs actually parse the delimiters as inputs.
I was thinking about other analyses that could be done on these sentences, perhaps by using vector representations of the chords that each sentence contains to do some sort of similarity comparison between sentences themselves.
Would it be possible to get a list of the song titles represented by each "sentence" of chords?
@keunwoochoi
Here's what a CSV of songs w/ titles and associated chords might look like:
https://github.com/loisaidasam/lstm_real_book/commit/695bc6d8e9121f6eda8bb670dd7bd62c28183df0
Thanks! This folder has the original .lab files, and I think they are just in alphabetical order in the merged text file. Vector representation sounds fun. I can't quickly think of a good representation that preserves the information in the chord - root and chord - though. Hm.. hm..
Hey @keunwoochoi,
To confirm the ordering, I looked at the .lab files in /more_data_to_play_with/jazz_xlab.zip to try and associate them with the chord "sentences" in chord_sentences.txt. To do this, I wrote this converter.
That worked just fine, but it seems that all of the files included in the repo are in their original keys, not transposed to C. I don't have any easy way of transposing them (I don't have Band-in-a-box or anything). I tried a few naive things based on my knowledge of music theory, but it gets more complicated when tunes introduce more sophisticated chords, which often happens in jazz. For example, see the Am7 chord on the fifth line of Misty, a tune in the key of Eb. The key of Eb doesn't have an A, it has an Ab. This Am7 chord actually serves as the #11 voicing, but how do you express those relationships programmatically? I'm assuming there's a good chart somewhere or something. Perhaps a project for another day...
Anyway, what did you do to transpose the tunes to C? Did you use Band-in-a-box? Do you have the transposed .lab files saved anywhere? Or would you be willing to generate them again?
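For reference, here is a minimal sketch of one way the transposition could be done without Band-in-a-box. It assumes the key of each tune is already known (nothing in this thread confirms that the .lab files carry it) and that chords are written as root:quality, as in the regex further down. It shifts every root by the same number of semitones, so it never needs to know what role a chord plays in the key. The names transpose_chord and transpose_sentence are made up for illustration, the flat-leaning spelling of the output is an arbitrary choice, and bass notes in slash chords are not handled.

```python
# Pitch classes for note names; enharmonic spellings share a number.
NOTE_TO_PC = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10,
    "B": 11,
}
# Assumption: spell transposed roots with flats, except F#.
PC_TO_NOTE = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def transpose_chord(chord, key_root, target_root="C"):
    """Shift one 'root:quality' label so that key_root lands on target_root."""
    root, sep, quality = chord.partition(":")
    if root not in NOTE_TO_PC:
        return chord  # leave delimiters and unknown tokens untouched
    shift = (NOTE_TO_PC[target_root] - NOTE_TO_PC[key_root]) % 12
    new_root = PC_TO_NOTE[(NOTE_TO_PC[root] + shift) % 12]
    return new_root + sep + quality


def transpose_sentence(sentence, key_root, target_root="C"):
    """Transpose every whitespace-separated chord in a sentence."""
    return " ".join(
        transpose_chord(chord, key_root, target_root)
        for chord in sentence.split()
    )
```

With this approach the Am7 from Misty (key of Eb) simply becomes F#:min7 in C, since A sits six semitones above Eb and F# sits six semitones above C.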
In the meantime, I wrote some scripts to do a little automated regex matching to try and match the patterns of chord types against the results in chord_sentences.txt, excluding the actual notes of the chords (something like ([^\s:]+\:maj7)\s([^\s:]+\:maj7)\s([^\s:]+\:min7)\s([^\s:]+\:min7)\s ...), and found that they are ALMOST in alphabetical order, but with a bunch of random items out of order. For example, when listed in alphabetical order, the second listed file 'deed I Do.Mgu.xlab is actually the 611th sentence in chord_sentences.txt.
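The same idea can be sketched without building one regex per tune: strip the roots, keep only the chord types, and look each tune up by that sequence. This is not the script mentioned above, just an illustration; quality_signature and match_by_quality are hypothetical names, and the sketch assumes every chord is written as root:quality and that tokens without a colon (such as delimiters) can be ignored.

```python
def quality_signature(sentence):
    """Drop the roots and keep only the part after ':' for each chord token."""
    return tuple(tok.split(":", 1)[1] for tok in sentence.split() if ":" in tok)


def match_by_quality(lab_sentences, merged_sentences):
    """Map each tune name to the indices of merged sentences sharing its signature.

    lab_sentences: dict of tune name -> chord sentence in its original key.
    merged_sentences: list of chord sentences, e.g. lines of chord_sentences.txt.
    """
    index = {}
    for i, sentence in enumerate(merged_sentences):
        index.setdefault(quality_signature(sentence), []).append(i)
    return {
        name: index.get(quality_signature(sentence), [])
        for name, sentence in lab_sentences.items()
    }
```

Each tune maps to a list because two tunes can share the same sequence of chord types, in which case the match is ambiguous and needs a second check.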
My writing is getting a bit egregious for a closed pull request, so I'm going to open an issue if you don't mind, and we can continue the discussion there :)
-Sam
PS. Re: the vector representation, I have a few ideas. To get started, I was thinking something along the lines of creating custom embeddings based on the n-grams contained in each "sentence" of chords. I will elaborate on that a bit later once some more of the data wrangling is done.
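As a rough starting point for that idea, here is a sketch that counts chord n-grams per sentence and compares two sentences by cosine similarity. It uses plain sparse count vectors rather than learned embeddings, and the function names and the default n=2 are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_counts(sentence, n=2):
    """Count the n-grams of whitespace-separated chord tokens in one sentence."""
    tokens = sentence.split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sentence shorter than n tokens has no n-grams
    return dot / (norm_a * norm_b)
```

Because the counts include the roots, this only compares sentences meaningfully once they are all in the same key.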
PS. I have been/will be extremely busy for about a month, so please excuse me if I'm not too responsive!
instead of using __START__ / __END__ delimiters
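A minimal sketch of what the conversion to one sentence per line could look like, assuming the original text is a whitespace-separated stream in which each tune sits between a start and an end token. The function name split_sentences is hypothetical, and the default token strings are taken from this description rather than checked against the repo.

```python
def split_sentences(text, start="__START__", end="__END__"):
    """Turn a delimiter-separated token stream into a list of chord sentences."""
    sentences = []
    current = None  # None means we are outside any start/end pair
    for token in text.split():
        if token == start:
            current = []
        elif token == end:
            if current is not None:
                sentences.append(" ".join(current))
            current = None
        elif current is not None:
            current.append(token)
    return sentences
```

Writing "\n".join(split_sentences(text)) to a file would then give one sentence per line.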