mtpearce / idyom

http://mtpearce.github.io/idyom/
GNU General Public License v3.0

Articulations and dynamics in kern format #44

Closed: adityac95 closed this issue 2 years ago

adityac95 commented 2 years ago

I'm trying to get IDyOM to model harmonic structure. I know it's only designed for monophonic input right now, so in the past I've been converting chord labels to an alphabet of integers, making .txt files of the pieces, and training IDyOM on those. That works, but it only gives me single-viewpoint functionality. I've recently come up with a representation of the chords in my corpus of interest that I can map to a purely monophonic input by leveraging some of the non-pitched viewpoints (dynamics, ornaments and articulation), which would let me use the multiple-viewpoint functionality.

The corpus is currently in a tabular format, and I'm going to convert it to a number of Humdrum .krn files. The Humdrum format can encode all of the viewpoints listed above. However, I'm unsure about two things. First, I wanted to double-check which Humdrum symbols map to each of the articulation values that IDyOM understands. For ease of reference, here they are:

articulation: 0 = no articulation mark; 1 = staccato; 2 = staccatissimo; 3 = sforzando; 4 = marcato

I'm particularly unsure about the sforzando since it's often treated as a dynamic marking.

Second, will IDyOM know how to interpret a separate **dynam dynamics spine if it's in the .krn file, or should I be encoding the dynamics in some other way to guarantee that they're read?

mtpearce commented 2 years ago

First, I'm glad to hear that you've been successfully using IDyOM for modelling harmony. Your approach of mapping chords to integers is essentially what we have done in previous research [1][2]. However, a genuine multiple-viewpoint representation of harmony is in the pipeline.

Second, the kern import module doesn't yet recognise dynamics, ornaments or articulation, so this is not the route to go down. I'd suggest creating .lisp files, which can be imported as follows:

(idyom-db:import-data :lisp ...)

The format should be clear from the following example but let me know if you have any questions:

("185 chorale melodies harmonised by J.S. Bach." 96 60  ("chor-001"   ((:ONSET 72) (:DELTAST 0) (:BIOI 72) (:DUR 24) (:CPITCH 73) (:MPITCH 42)    (:ACCIDENTAL 1) (:KEYSIG 3) (:MODE 0) (:BARLENGTH 96) (:PULSES 4)    (:PHRASE 1) (:VOICE 1) (:ORNAMENT NIL) (:COMMA NIL) (:VERTINT12 NIL)    (:ARTICULATION NIL) (:DYN NIL))   ((:ONSET 96) (:DELTAST 0) (:BIOI 24) (:DUR 12) (:CPITCH 73) (:MPITCH 42)    (:ACCIDENTAL 1) (:KEYSIG 3) (:MODE 0) (:BARLENGTH 96) (:PULSES 4)    (:PHRASE 0) (:VOICE 1) (:ORNAMENT NIL) (:COMMA NIL) (:VERTINT12 NIL)    (:ARTICULATION NIL) (:DYN NIL))   ...   ((:ONSET 960) (:DELTAST 0) (:BIOI 24) (:DUR 72) (:CPITCH 69) (:MPITCH 40)    (:ACCIDENTAL 0) (:KEYSIG 3) (:MODE 0) (:BARLENGTH 96) (:PULSES 4)    (:PHRASE -1) (:VOICE 1) (:ORNAMENT NIL) (:COMMA NIL) (:VERTINT12 NIL)    (:ARTICULATION NIL) (:DYN NIL)))  ...  ("chor-185"   ((:ONSET 72) (:DELTAST 0) (:BIOI 72) (:DUR 24) (:CPITCH 65) (:MPITCH 38)    (:ACCIDENTAL 0) (:KEYSIG -1) (:MODE 0) (:BARLENGTH 96) (:PULSES 4)    (:PHRASE 1) (:VOICE 1) (:ORNAMENT NIL) (:COMMA NIL) (:VERTINT12 NIL)    (:ARTICULATION NIL) (:DYN NIL))   ...   ((:ONSET 816) (:DELTAST 0) (:BIOI 24) (:DUR 24) (:CPITCH 65) (:MPITCH 38)    (:ACCIDENTAL 0) (:KEYSIG -1) (:MODE 0) (:BARLENGTH 96) (:PULSES 4)    (:PHRASE -1) (:VOICE 1) (:ORNAMENT NIL) (:COMMA NIL) (:VERTINT12 NIL)    (:ARTICULATION NIL) (:DYN NIL))))

[1] Sears, D., Pearce, M. T., Caplin, W. E., & McAdams, S. (2018). Simulating melodic and harmonic expectations for tonal cadences using probabilistic models. Journal of New Music Research, 47, 29-52. https://doi.org/10.1080/09298215.2017.1367010 (PDF: http://webprojects.eecs.qmul.ac.uk/marcusp/papers/SearsEtAl2018.pdf)
[2] Cheung, V., Harrison, P. M. C., Meyer, L., Pearce, M. T., Haynes, J.-D., & Koelsch, S. (2019). Uncertainty and surprise jointly predict musical pleasure and amygdala, hippocampus, and auditory cortex activity. Current Biology, 29(23), 4084-4092.e4. https://doi.org/10.1016/j.cub.2019.09.067 (PDF: http://webprojects.eecs.qmul.ac.uk/marcusp/papers/CheungEtAl2019.pdf)


adityac95 commented 2 years ago

That is incredibly helpful, thank you! To confirm, do the 96 and 60 represent the default semibreve duration in ticks and the MIDI number corresponding to middle C?

mtpearce commented 2 years ago

Yes, exactly!
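That is, the header of the example file above breaks down as follows (annotations added here for clarity; the rest of the file is elided):

("185 chorale melodies harmonised by J.S. Bach." ; dataset description
 96   ; timebase: ticks per semibreve (whole note)
 60   ; chromatic pitch of middle C (MIDI note number 60)
 ...)  ; the compositions follow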


adityac95 commented 2 years ago

Thanks for all your assistance so far. I'm having some issues when it comes to training models, though – I'm getting an error that I've never seen before.

I have two datasets that I loaded into my IDyOM database, 1 and 2, with all the harmonic features I need mapped to cpitch, dyn, ornament, articulation and dur. The features onset, bioi, barlength, pulses, phrase and voice are all set to the values appropriate to the compositions in the datasets, and I've set all other features to NIL. I called this command:

(idyom:idyom 2 '(cpitch dyn ornament articulation dur) '(cpitch dyn ornament articulation dur) :pretraining-ids '(1) :k 1 :models :ltm :output-path "/Users/aditya/Documents/Yale/datasets/corpus_playground/lisp_rep_idyom_bach_results/" :separator ",")

and I'm getting the error "Argument X is not a NUMBER: NIL". Here's the full backtrace:

[screenshot of the full backtrace: https://user-images.githubusercontent.com/61057258/172694129-cff152fe-afff-4d5c-9822-09c437ad5858.png]

So it seems to have not worked on the very first composition of the dataset.

Do you have any idea why this might be happening? None of the viewpoints I'm trying to predict has a NIL value for any of the files in the dataset, so I'm puzzled. I can send the files for datasets 1 and 2 in case you'd like to try replicating the error on your end.

adityac95 commented 2 years ago

For what it's worth, the same error persists when I don't pretrain on dataset 1, as well as when I try to build a model on dataset 1 on its own without pretraining on anything.

mtpearce commented 2 years ago

This error occurs when printing out the results file. Without further investigation there are a few possibilities, but I suspect there is a null value in a target viewpoint relating to time (:dur, :bioi, :deltast or :onset). You mention :onset, :dur and :bioi, but did you also set :deltast to a non-null value? If not, try setting it to 0 universally in your datasets and see if the error goes away.
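Concretely, each event in your .lisp files would then carry a numeric :DELTAST rather than NIL; a minimal sketch, with placeholder values:

;; Hypothetical event: :DELTAST is given the number 0 instead of NIL.
((:ONSET 0) (:DELTAST 0) (:BIOI 0) (:DUR 24) (:CPITCH 60) ...)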


mtpearce commented 2 years ago

Hi, I just wanted to check if that solved the problem for you.


adityac95 commented 2 years ago

Hi, sorry, I thought I'd replied -- yes, this fixed everything! Thanks for all your help. I'd forgotten how much longer it takes to train models with multiple linked viewpoints (I have a dataset of around 18000 phrases that I'm modelling with pretraining on another dataset of around 1600 phrases, so it's been running for a few days), but hopefully it'll finish running soon!

mtpearce commented 2 years ago

Thanks for confirming; there will be a fix for this in the next IDyOM release.

Linked viewpoints for modelling harmony can indeed produce very large alphabet sizes for computing distributions!
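To give a rough sense of the combinatorics (the component alphabet sizes below are purely hypothetical): the alphabet of a linked viewpoint can grow as large as the product of its components' alphabets, so even modest components multiply out quickly.

;; Hypothetical alphabet sizes for the linked viewpoint
;; (cpitch dyn ornament articulation dur):
(* 50 8 4 5 12) ; => 96000 possible linked symbols (upper bound)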
