Colorize lines based on prefix

Ded10c commented 6 months ago

Some karaoke software uses a short prefix at the start of a line to determine colorization for songs with multiple singers, which would be nice to have in general but especially useful for large-cast musical numbers. Wikipedia details the example used by Walaoke:

[00:12.00]Line 1 lyrics
[00:17.20]F: Line 2 lyrics
[00:21.10]M: Line 3 lyrics
[00:24.00]Line 4 lyrics
[00:28.25]D: Line 5 lyrics
[00:29.02]Line 6 lyrics

Let's say we use blue for male, red for female and pink for Duet.

Line 1 will use the default color (blue) when no tag is found.

Line 2 lyrics start with red when F: is found.

Line 3 lyrics start with blue when M: is found.

Line 4 lyrics stays blue when no tag is found.

Line 5 lyrics start with pink when D: is found.

Line 6 lyrics stays pink when no tag is found.

Rather than hardcoding the colours to F, M and D, basing colorization on order of appearance would allow support for much more complex arrangements. Some lyrics already insert an identifier using an extra pair of square brackets between the timestamp(s) and the start of a part's first line, as a more user-readable approach than [a-zA-Z0-9]: prefixes:

[00:59.00]I'm snappin' off your window lock
[01:02.00]Got no time to knock, I'm a dead girl walking
[01:08.00]M: Veronica? What are you doing in my room?
[01:11.50]F: Shh...
[01:13.00]Sorry, but I really had to wake you

[00:59.00]I'm snappin' off your window lock
[01:02.00]Got no time to knock, I'm a dead girl walking
[01:08.00]J: Veronica? What are you doing in my room?
[01:11.50]V: Shh...
[01:13.00]Sorry, but I really had to wake you

[00:59.00]I'm snappin' off your window lock
[01:02.00]Got no time to knock, I'm a dead girl walking
[01:08.00][J.D.]Veronica? What are you doing in my room?
[01:11.50][Veronica]Shh...
[01:13.00]Sorry, but I really had to wake you
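
As a rough illustration of what recognising these prefixes could involve, here is a minimal Python sketch. The regular expressions and the `split_line` helper are hypothetical names made up for this example; they are not taken from foo_openlyrics or from any LRC specification, and they only cover the single-character `X:` and `[Name]` styles shown above.

```python
import re

# Hypothetical patterns for the prefix styles in the examples above.
TIMESTAMP = re.compile(r"^((?:\[\d{1,2}:\d{2}(?:\.\d{1,3})?\])+)")
BRACKET_VOICE = re.compile(r"^\[([^\[\]]+)\]\s*")   # e.g. [J.D.]
COLON_VOICE = re.compile(r"^([A-Za-z0-9]):\s*")      # e.g. F: or J:

def split_line(line):
    """Return (timestamps, voice_or_None, lyric_text) for one LRC line."""
    m = TIMESTAMP.match(line)
    if not m:
        # Not a timed line; treat the whole thing as lyric text.
        return "", None, line
    stamps, rest = m.group(1), line[m.end():]
    for pattern in (BRACKET_VOICE, COLON_VOICE):
        v = pattern.match(rest)
        if v:
            return stamps, v.group(1), rest[v.end():]
    return stamps, None, rest
```

Under these assumptions, `split_line("[01:11.50][Veronica]Shh...")` yields the voice `Veronica`, while an untagged line yields `None` and leaves the decision about which voice it belongs to for a later step.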

jacquesh commented 6 months ago

Would you envision this being configured globally? How consistent do these tend to be between tracks? I would have assumed "not particularly"?

Alternatively, if there's a common standard for how these differing voices get labelled (e.g [Name]) then this could be done automatically to recognise all such lines and just assign random different colours to them (or colours from some configured list if that's a problem).

Ded10c commented 6 months ago

I pulled a sample from my library of concept albums, musical soundtracks and tracks with featuring vocalists, and from most to least common:

1) Parts aren't identified at all
2) Parts are identified consistently, with [Name] appearing between timestamp and lyric whenever that part takes over from another
3) Parts are identified consistently, with [Name] appearing on the preceding line whenever that part takes over from another
4) Parts are identified consistently, with Name: appearing between the timestamp and lyric whenever that part takes over from another
5) Where a song has lead and supporting parts, supporting lines are in parentheses
6) Parts are identified inconsistently because section and part are identified simultaneously (e.g. [Verse 3: Name])
- I didn't find anything using initials.

Thankfully 5) seems relatively uncommon, because I can't imagine it being safe to assume the parentheses are there to identify a subordinate part, and 6) is aggravating but blessedly rare. 1) was overwhelmingly most common - easily more than the rest put together - but 2) through 4) were prevalent enough that I think it's probably safe to treat them as de-facto standards.

In terms of what the identifiers are, consistency between tracks is okay provided they're from the same album - consistency between albums is functionally nil. Now I think about it I'd personally be inclined to approach colour assignment on a per-track basis, with the default for the largest part and subsequent colours assigned to parts in descending size order.
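
A minimal sketch of that per-track assignment, assuming each line has already been attributed to a voice. The function name `assign_colours` and its parameters are hypothetical, and reusing colours once the palette runs out is an assumption of this example rather than something proposed in the thread.

```python
from collections import Counter

def assign_colours(voices, default_colour, palette):
    """Map each voice to a colour, largest part first.

    voices: one voice name per lyric line (already resolved).
    default_colour: colour for the part with the most lines.
    palette: further colours, handed out in descending size order.
    """
    counts = Counter(voices)
    # most_common() sorts by line count, largest first; ties keep
    # their order of first appearance.
    ranked = [voice for voice, _ in counts.most_common()]
    colours = [default_colour] + list(palette)
    # Assumption: wrap around if there are more voices than colours.
    return {voice: colours[i % len(colours)] for i, voice in enumerate(ranked)}
```

For a track where Veronica has three lines, a chorus two and J.D. one, the default colour would go to Veronica, the first palette colour to the chorus and the second to J.D.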

EDIT: Now encountered some with (Name) as well, which looks like it might be about as common as 4)

Rexadev commented 6 months ago

https://github.com/marz1877/SyncedLyrics_lrcv2 https://en.wikipedia.org/wiki/LRC_(file_format)#Walaoke_extension:_gender

jacquesh commented 6 months ago

Thankfully 5) seems relatively uncommon, because I can't imagine it being safe to assume the parentheses are there to identify a subordinate part, and 6) is aggravating but blessedly rare. 1) was overwhelmingly most common - easily more than the rest put together - but 2) through 4) were prevalent enough that I think it's probably safe to treat them as de-facto standards.

I think option (2) is the most robust. Option (3) means that lines are no longer independent and ordering & formatting becomes rather more complicated. Option (4) would work but falls over as soon as there's a colon in the actual lyrics (which could easily be the case for lines like "and they said: 'quote'").
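
To make the colon problem concrete, here is a small Python sketch comparing a naive `Name:` matcher with a `[Name]` matcher on a lyric that happens to contain a colon. Both patterns and the `voice_of` helper are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical matchers for option (4) "Name:" and option (2) "[Name]".
NAME_COLON = re.compile(r"^(?P<voice>[^:\[\]]+):\s*(?P<text>.*)$")
NAME_BRACKET = re.compile(r"^\[(?P<voice>[^\[\]]+)\]\s*(?P<text>.*)$")

def voice_of(pattern, lyric):
    """Return the voice the pattern extracts from the lyric, or None."""
    m = pattern.match(lyric)
    return m.group("voice") if m else None

lyric = "and they said: 'quote'"
# The colon matcher mistakes the start of the lyric for a voice name...
false_positive = voice_of(NAME_COLON, lyric)
# ...while the bracket matcher correctly finds no voice.
no_match = voice_of(NAME_BRACKET, lyric)
```

A stricter colon pattern (for example, limiting the name to one character as Walaoke does) would reduce false positives but would also rule out full names, which is part of why the bracketed form looks safer.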

Realistically I think that probably means supporting option (2) for display, and possibly providing tools in the editor to help create that data or convert it from options 3/4/6.

Now I think about it I'd personally be inclined to approach colour assignment on a per-track basis, with the default for the largest part and subsequent colours assigned to parts in descending size order.

I assume by "colours assigned to parts in descending size order" you mean "lines not tagged with a voice have the usual colours and then lines that do have a voice use the configured colours with the voice that has the most lines using the first configured colours, and then less frequently-used voices using the later colours". Is that accurate? That seems like a sensible plan.

On that note though, how would this interact with the current notion of an "active" colour? Would we only use voice-based colours on the active line? On all lines and ignore the "active-ness" of the line (I assume not)? Have voice colours defined in pairs: Active & inactive, and then use the appropriate one? The last option seems the most sensible (but also the most complicated, predictably) and would neatly slot in with the current "default" and "active" colours being the colour-pair for "unvoiced" lines.
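
One way to picture the colour-pair idea is the Python sketch below. `ColourPair` and `colour_for` are hypothetical names; the sketch only assumes that the existing "default" and "active" colours become the pair used for unvoiced lines, as suggested above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ColourPair:
    active: str    # colour while the line is being sung
    inactive: str  # colour for every other line

def colour_for(voice, is_active, voice_pairs, unvoiced_pair):
    """Pick a colour for a line.

    voice: the line's voice, or None for unvoiced lines.
    voice_pairs: mapping of voice name to ColourPair.
    unvoiced_pair: pair built from the existing default/active colours.
    """
    # Unknown or missing voices fall back to the unvoiced pair.
    pair = voice_pairs.get(voice, unvoiced_pair)
    return pair.active if is_active else pair.inactive
```

With this shape, lyrics that carry no voice tags at all would render exactly as they do today, since every line resolves to the unvoiced pair.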

PlaylistsTrance commented 6 months ago

https://github.com/marz1877/SyncedLyrics_lrcv2

Personally, I think method 9 from this would make the most sense. Users specify all colours themselves similar to HTML tags, having a tag for main, highlight, and past. I think this would be the most flexible. I have made many synced lyrics myself with (Name) (or (Name1/Name2) etc.) tags, but these can appear multiple times per line (ad-libs or background vocals). This is why I think HTML-like tags would be so nice cos you'd just do something like turning [02:11.43](Wendy) 복잡한 고민에 갇혀있지 않을래 ((Seulgi) 않을래) into [02:11.43]<main=#1F3386><highlight=#3555E0><past=#1F3386>복잡한 고민에 갇혀있지 않을래 <main=#996A25><highlight=#FFB23F><past=#996A25>(않을래)

Ded10c commented 6 months ago

I think option (2) is the most robust.

Having heard why, I concur, particularly as regards colons. Lines being independent is a fair point, but the way (2) is most widely used doesn't appear to provide for that - voices are usually declared only when they take over from another, rather than on every line.

I assume by "colours assigned to parts in descending size order" you mean [...] Is that accurate?

Whilst a set of lyrics could be tagged such that the lead voice is untagged and every line in any other voice is tagged (I couldn't find any examples), I doubt it'd be possible to support both that and a functional implementation of (2) as-is. To do that, we'd need to switch back to the main voice on an untagged line, which precludes lines inheriting the last-used voice as (2) expects. The only other option would be to try to determine whether a set of lyrics is using (2) or only tagging lines in non-primary voices, and I can't think of a reliable way to do that programmatically.

If we assume (2) is being used in the "standard" way (or rather, the most common way I found and the one marz1877 appears to endorse) then any line that doesn't have a voice explicitly declared would be assigned the same voice as the previous line. Any other untagged lines would be right at the beginning of the song, before the first voice is declared - barring the above that's probably an error, and our safest bet in that case would be to assume they belong to the voice which is most common throughout the rest of the song. At that point all untagged lines are inheriting a voice, from either a previous line or a best guess based on the rest of the song; we can tally the lines assigned per voice, give the biggest part the usual colours, and work down the list from there.
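
The inheritance rule described here can be sketched in Python as follows. `resolve_voices` is a hypothetical helper, and counting "most common" after inheritance has been applied (rather than counting only explicitly tagged lines) is an assumption of this example.

```python
from collections import Counter

def resolve_voices(tags):
    """Give every line a voice.

    tags: per-line voice tag, or None where the line has no tag.
    Untagged lines inherit the most recently declared voice. Untagged
    lines before the first declaration get the voice that is most
    common across the rest of the song.
    """
    resolved = []
    current = None
    for tag in tags:
        if tag is not None:
            current = tag
        resolved.append(current)

    # Leading lines are still None here; back-fill them with a best guess.
    counts = Counter(v for v in resolved if v is not None)
    if counts:
        fallback = counts.most_common(1)[0][0]
        resolved = [fallback if v is None else v for v in resolved]
    return resolved
```

The output is one voice per line, which is the input a per-voice tally needs. If a track declares no voices at all, every entry stays `None` and the lines can simply keep the usual colours.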

how would this interact with the current notion of an "active" colour?

I'd actually completely forgotten about highlight colour. It's potentially much more invasive, but with the default highlight being white and inactive text being -28 lum relative to that perhaps it'd be worth considering using the defined colour as the highlight and deriving inactive lines from there? Perhaps we could use a similar approach to past lines and provide an option to blend the defined (highlight) colour into the background, perhaps to a configurable degree?
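
A sketch of the blending idea, assuming colours are `#RRGGBB` strings and that "blend into the background" means a straight linear mix of the RGB channels. Both helpers are hypothetical, and this does not attempt to reproduce the existing -28 lum offset.

```python
def parse_hex(colour):
    """Turn '#RRGGBB' into an (r, g, b) tuple of ints."""
    colour = colour.lstrip("#")
    return tuple(int(colour[i:i + 2], 16) for i in (0, 2, 4))

def blend(highlight, background, amount):
    """Mix the highlight colour into the background.

    amount: 0.0 returns the highlight unchanged, 1.0 returns the
    background; values in between are a linear mix per channel.
    """
    h, b = parse_hex(highlight), parse_hex(background)
    mixed = (round(hc + (bc - hc) * amount) for hc, bc in zip(h, b))
    return "#{:02X}{:02X}{:02X}".format(*mixed)
```

Here `amount` plays the role of the configurable degree: a user would define only the highlight colour per voice, and the inactive and past colours would be derived by calling `blend` with different amounts.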

Personally, I think method 9 from this would make the most sense

A lot of lyrics already define parts using names per method 2, so if 9 is implemented I think it'd work best as something that overrides whatever colour we'd otherwise have selected when present - and though I could see some utility in "if the part identifier tag contains a hex colour code...", it's probably better implemented to explicitly require the specified [cr=#000000][/cr] syntax. I have to disagree on cribbing from HTML syntax - square brackets are already an established and widely-used part of the format, whereas the only place I've seen triangle brackets used so far is for per-word timing in marz1877's specification.

PlaylistsTrance commented 6 months ago

I have to disagree on cribbing from HTML syntax

I don't have a preference for type of brackets. I guess I should have said XML-like syntax, where you just specify a start and end. [cr=#000000][/cr] is fine, but as I mentioned, I do think it would be best if each colour type (main/highlight/past) had its own tag.
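
For illustration only, here is a Python sketch of splitting a lyric into coloured spans using the [cr=#000000][/cr] form quoted above. It assumes the tag wraps the text it colours and does not nest; that reading, the `CR_SPAN` pattern and the `colour_spans` helper are all assumptions of this example, and separate main/highlight/past tags are not covered.

```python
import re

# Assumed shape: [cr=#RRGGBB]text[/cr], non-nested.
CR_SPAN = re.compile(r"\[cr=(#[0-9A-Fa-f]{6})\](.*?)\[/cr\]")

def colour_spans(lyric):
    """Split a lyric into (colour_or_None, text) spans, in order."""
    spans, pos = [], 0
    for m in CR_SPAN.finditer(lyric):
        if m.start() > pos:
            # Text outside any tag keeps whatever colour applies by default.
            spans.append((None, lyric[pos:m.start()]))
        spans.append((m.group(1), m.group(2)))
        pos = m.end()
    if pos < len(lyric):
        spans.append((None, lyric[pos:]))
    return spans
```

Returning `None` for untagged text fits the override idea: an explicit colour wins where present, and everything else falls through to the voice-based or default colour.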

jacquesh commented 6 months ago

Whilst a set of lyrics could be tagged such that the lead voice is untagged and every line in any other voice is tagged (I couldn't find any examples), I doubt it'd be possible to support both that and a functional implementation of (2) as-is.

Oh. I went back and realised I'd totally mis-read your initial comment. I'd understood "option 2" to mean "every single line that has a defined voice, is tagged/prefixed with the name of that voice". Which is simply not what you were saying. Sorry for the confusion.

In that case I'm less keen because that has similar problems to option 3 (ordering matters, lines can no longer be parsed independently). This may well not prove fatal but it's a consideration. I can absolutely see how/why your actual option 2 (tagging when the voice changes) is more common though (you can even see it in some unsynced lyrics from places like Genius.com).

jacquesh / foo_openlyrics

Colorize lines based on prefix #351