When a song has multiple instruments, we as listeners pay attention to each one to varying degrees. We can quantify how much attention each instrument draws using descriptors like rhythmic complexity, melody smoothness, volume, and mode adherence.
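One way to make this concrete is a weighted sum of the descriptors. The sketch below is only an illustration: the `Descriptors` class, the 0-to-1 scaling, and the equal weights are all assumptions, and nothing here says which descriptors should actually count for more (or whether some should count negatively).

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Per-instrument descriptors, each assumed to be scaled to 0..1."""
    rhythmic_complexity: float
    melody_smoothness: float
    volume: float
    mode_adherence: float


# Hypothetical weights: equal by default, purely a placeholder choice.
WEIGHTS = {
    "rhythmic_complexity": 0.25,
    "melody_smoothness": 0.25,
    "volume": 0.25,
    "mode_adherence": 0.25,
}


def interest_score(d: Descriptors, weights: dict = WEIGHTS) -> float:
    """Weighted sum of the descriptors named in `weights`."""
    return sum(w * getattr(d, name) for name, w in weights.items())


lead = Descriptors(rhythmic_complexity=0.8, melody_smoothness=0.4,
                   volume=0.9, mode_adherence=0.7)
pad = Descriptors(rhythmic_complexity=0.1, melody_smoothness=0.9,
                  volume=0.3, mode_adherence=0.9)
print(interest_score(lead), interest_score(pad))
```

Tuning the weights (for example against listener ratings) would be the real work; the sum itself is just a convenient starting shape.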
This attention, or "interest", could be a controllable attribute of portions of a track. Often a different instrument takes over as the focus to create interest in the song. How the interest is divided is also worth looking at: outros probably spread it across instruments, while solos center it mostly on one instrument. Additionally, the total amount of "interest" in a song could be a general way to measure tension and release.
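As a rough sketch of how the division and the total could be measured, assume each section of a track comes with a per-instrument interest score (however it is computed). The function names are hypothetical, and using normalized Shannon entropy as the "spread" measure is my own choice for illustration, not something the idea requires.

```python
import math


def interest_shares(scores: dict) -> dict:
    """Fraction of the section's total interest held by each instrument."""
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: s / total for name, s in scores.items()}


def spread(scores: dict) -> float:
    """Normalized entropy of the shares: 0 = one instrument has it all,
    1 = interest is divided evenly across all instruments."""
    shares = [p for p in interest_shares(scores).values() if p > 0]
    if len(scores) < 2 or not shares:
        return 0.0
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(len(scores))


def total_interest(scores: dict) -> float:
    """Sum of all instrument scores, a crude stand-in for tension."""
    return sum(scores.values())


# Made-up example sections, only to show the shape of the data.
solo = {"guitar": 0.9, "bass": 0.1, "drums": 0.1}
outro = {"guitar": 0.4, "bass": 0.4, "drums": 0.4}
print(spread(solo), spread(outro))
print(total_interest(solo), total_interest(outro))
```

Tracking `total_interest` section by section would give a curve that rises and falls over the song, which is where the tension and release reading would come from.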
It gets complicated with lyrics, as those trigger a different listening response: language interpretation. By default, a lyrical melody is going to attract more attention than a more complex "instrument" melody. This is probably partly because it is a human voice (which people inherently feel differently about than other instruments) and partly because of the effort required to process the words and what they mean. We can probably account for the second factor by looking at how often words repeat. You could take it further and look at how common the words are: "guy" and "hey" should weigh less on complexity than "astronomical" and "keyboard", as they take less work for the listener to process.
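A minimal sketch of the word-processing part might look like the following. Everything in it is an assumption made for illustration: the `TOY_COMMONNESS` values are placeholders (not measured word frequencies), unknown words get an arbitrary default, and repeated words are treated as costing nothing after their first occurrence, which is a strong simplification.

```python
import re

# Placeholder commonness values in 0..1 (1 = very common).
# These are invented for the example, not taken from any corpus.
TOY_COMMONNESS = {
    "hey": 0.9,
    "guy": 0.8,
    "keyboard": 0.3,
    "astronomical": 0.1,
}


def tokenize(lyrics: str) -> list:
    """Lowercase the lyrics and split them into simple word tokens."""
    return re.findall(r"[a-z']+", lyrics.lower())


def repetition_ratio(words: list) -> float:
    """Share of words that repeat an earlier word (0 = no repeats)."""
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)


def lyric_effort(lyrics: str, commonness: dict = TOY_COMMONNESS,
                 default: float = 0.5) -> float:
    """Average processing effort per sung word.

    Each distinct word costs (1 - commonness) once; repeats add no cost,
    so repetitive lyrics with common words score low.
    """
    words = tokenize(lyrics)
    if not words:
        return 0.0
    cost = sum(1.0 - commonness.get(w, default) for w in set(words))
    return cost / len(words)


print(repetition_ratio(tokenize("hey hey hey guy")))
print(lyric_effort("hey hey hey guy"), lyric_effort("astronomical keyboard"))
```

With a real word-frequency list in place of the toy table, the same shape would let "hey, guy" style lyrics weigh less than lyrics full of rarer words.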