Open 1hitsong opened 1 year ago
Thanks for report the issue! This issue will be fixed in the #41 and make a new release soon ๐ .
Awesome! Thanks for hopping on this so quickly
Looking at the PR for the fix, would this result in the output being option 1 or option 2?
Option 1 [00:51.00]You gotta fight !
or Option 2 [00:51.00]You gotta fight ! [01:29.99]You gotta fight ! [01:48.29]You gotta fight ! [02:31.00]You gotta fight ! [02:41.99]You gotta fight !
The result will be the option 1.
And note that new NuGet package has been released!
The result will be the option 1.
Cool. Just making sure I'm following.
I've seen a lot of sites that use the multiple timestamps in a way that means option 2, but If that's not to the LRC standard, then option 1 makes the most sense.
Guess this issue is fixed ๐
@andy840119 in my opinion this is not the correct fix for the issue.
Implementing Option 2
mentioned above would make the lib adhere to the basic LRC format.
Multiple timestamps on the same line should result in that line appearing on those timestamp. Dropping all except the first timestamp removes quite any chorus lyrics from songs after the first occurrence.
https://en.wikipedia.org/wiki/LRC_(file_format)
Seems there's two usage for .lrc
format
One is for sceeen text:
[00:21.10][00:45.10]Repeating lyrics (e.g. chorus)
Another one is for karaoke format:
[mm:ss.xx] <mm:ss.xx> line 1 word 1 <mm:ss.xx> line 1 word 2 <mm:ss.xx> ... line 1 last word <mm:ss.xx>
But (as i remember) some software might export the .lrc with those format as well:
[mm:ss.xx] [mm:ss.xx] line 1 word 1 [mm:ss.xx] line 1 word 2 [mm:ss.xx] ... line 1 last word [mm:ss.xx]
Also, did you have idea for handling those cases? And will you willing to make the PR for that because currently I'm focusing on another project๐ข Might have no enough time to solve this issue.
https://suwa.pupu.jp/RhythmicaLyrics.html
This is the software I made the .lrc
file before.
I already took a look at the codebase and it likely requires some API changes. I'll try to come up with a PR over the weekend.
I already took a look at the codebase and it likely requires some API changes. I'll try to come up with a PR over the weekend.
I think in this case it should not repeat:
[mm:ss.xx][mm:ss.xx] line 1 word 1 [mm:ss.xx] line 1 word 2 [mm:ss.xx] ... line 1 last word [mm:ss.xx]
Did you have any idea for handle this case? Or can you describe how will you do for this case in order to reduce the review time?
Api change is OK for me, but would be better to explain why ๐๐ผ
This is taking abit longer than expected, but what I did up until now:
TextIndex
since the current structure does not allow multipletimestamps for the same starting index (we are using a dict)Looking at your post above I've got a question: According to wikipedia karaoke format is like this (duration infront of word)
[mm:ss.xx] <mm:ss.xx> line 1 word 1 <mm:ss.xx> line 1 word 2 <mm:ss.xx> ... line 1 last word
But according to your example it can also be this:
[mm:ss.xx] <mm:ss.xx> line 1 word 1 <mm:ss.xx> line 1 word 2 <mm:ss.xx> ... line 1 last word <mm:ss.xx>
What are the timestamps in this case for?
Just wanted to chime it that I had a user whose files contained lines in the format of [ms:ss.xxx]text
, so with three decimals for milliseconds and no space between timestamp and text. They were not parsed correctly with Jellyfin 10.9, and getting rid of the third decimal place fixed the issue.
I'm guessing this is also an issue with this library, and could easily be fixed as part of these improvements :)
https://github.com/karaoke-dev/sample-beatmap/blob/master/v1/%E6%9E%AF%E3%82%8C%E3%81%AA%E3%81%84%E8%8A%B1/lyric.lrc
@Shadowghost this is the sample .lrc
created by RhythmicaLyrics, a tool to create karaoke format .lrc
file.
I' not sure if official .lrc
format allows [mm:ss.xx]
at the end of the text, but you can see the japanese karoake song and last time at the end of the lyric might means it's the time blue area covers the whole lyric.
@Chaphasilor already took care of that by extending the regex.
@andy840119 seems like we need to differentiate between these layouts:
Core LRC: Single timestamp at the beginning
[mm:ss.xx] text
Core LRC: Multiple timestamps at the beginning (resulting in multiple appearances of the line)
[mm:ss.xx][mm:ss.xx][mm:ss.xx] text
Enhanced LRC (A2): Single timestamp at the beginning, followed by others denoted by <...>
- within them is the DURATION, therefore NO need for a timestamp at the end
[mm:ss.xx] <mm:ss.xx> line 1 word 1 <mm:ss.xx> line 1 word 2 <mm:ss.xx> ... line 1 last word
Enhanced LRC (RythmicaLyrics): Single timestamp at the beginning, followed by others denoted by [...]
- within them are ABSOLUTE timestamps, therfore we need an end timestamp to know where the displaying of the last character ends
[mm:ss.xx]line 1 word 1 [mm:ss.xx] line 1 word 2 [mm:ss.xx] ... line 1 last word [mm:ss.xx]
My plan is to convert all of these into the following object:
TextIndex:
public int Index { get; }
public int? Time { get; }
public IndexState State { get; }
IndexState:
Start
End
Always using the full timestamp for time and IndexState Start, for cases 1 and 2 we have no entry with IndexState End, for cases 3 and 4 we have
Logic for knowing which of the above we are dealing with (based on regexing one line a time):
<>
timestamps -> case 3@Shadowghost
Finally I got some times to check what's happening in the RythmicaLyrics
This tool have two way to generate the file:
If press the save button, it can support save as different type of format, but the contents are the same:
[mm:ss.xx]line 1 word 1 [mm:ss.xx] line 1 word 2 [mm:ss.xx] ... line 1 last word [mm:ss.xx]
And the save format match to the export as text2ass with .kar
extension.
The conclusion:
RythmicaLyrics
does not actually support export the .lrc format
๏ผthey only support another file format (.kar)LrcParser
into KarParser
, and make new parser for the Core LRC and Enhanced LRC (A2)
Maybe it's a good start:
LrcParser
)LrcParser
, let it fully matched the Core LRC and Enhanced LRC (A2)
spec.Fine for me, will do so. One last question: is the ruby tag stuff also KOR specific?
There's not much documentation about .kar file format and maybe it's not as popular as .lrc file
I think is's OK to treat it as official spec And ruby tag might not supported in the .lrc file
Maybe it's a good start:
- Create a first PR to create the KarParser(just copy the conten/tests in the
LrcParser
)- Create another PR to adjust the
LrcParser
, let it fully matched theCore LRC and Enhanced LRC (A2)
spec.
Since quite a few other projects are relying on this project to parse LRC (as its name indicates), it makes sense to maintain the functionality of LRC parsing in this lib and move the kar part to a seperate one like .
BTW I was not able to find a PR on this created by @Shadowghost and wondering what's the current status of it. A service I'm using is relying on the above discussed items to be incorporated and would love to help!
hmmmm.... Guess i need to reserve some time on this library.
@Shadowghost are you still interested on fixing this issue? I'll take over this issue if you have no time to fix it ;_;
Sorry, I've been busy over the last weeks, I can push what I have if you want to take over.
Sorry, I've been busy over the last weeks, I can push what I have if you want to take over.
Nice, you can give me the branch and i can cherry-pick if possible to use ๐๐ผ
Great! Being able to handle A2 extensions (word by word timestamps) and support millesecond timestamps [mm:ss.xxx] or \<mm:ss.xxx> in A2 (as mentioned also by @Chaphasilor) is the most desired feature from the downstream projects. Appreciate your help.
@andy840119 there you go: https://github.com/Shadowghost/LrcParser/tree/extend Sadly it is very WIP, lmk if you have questions.
@1hitsong @Shadowghost @Reapor-Yurnero @Chaphasilor
Here's the new release follow the .lrc format in the wiki ๐ https://www.nuget.org/packages/LrcParser
Only few changes and should be easy to adjust https://github.com/karaoke-dev/LrcParser/blob/main/CHANGELOG.md
It's time to give it a try, report feedback is welcome โค๏ธ
@andy840119 thanks for the update, working fine to far on my local tests. I do have one question though: What was the reason for you to not re-use the TimeTags
for multiple occurrences?
What was the reason for you to not re-use the TimeTags for multiple occurrences?
What does that means?
Using TimeTag
class in the LrcLyric
Or Allow have multiple time-tags between lyric text?
I think I misunderstood what TimeTags
is for.
Why I'm mentioning it is this: Before it was possible to get LRC metadata tags like these:
[ti:Fortnight]
[ar:Taylor Swift/Post Malone]
[al:THE TORTURED POETS DEPARTMENT (Explicit)]
[offset:0]
by decoding the LRC (result would have all lines in the Lyrics
object), ckecking the lines for TimeTags.Count == 0
and then parsing their text content. Now these lines are completely omitted and TimeTags.Count
is always 0 if the LRC has timestamps. Only way to get them would be to parse the full LRC text.
We don't actually need them but it would be nice for completeness to have them available somewhere
[ti:Fortnight] [ar:Taylor Swift/Post Malone] [al:THE TORTURED POETS DEPARTMENT (Explicit)] [offset:0]
That's called metadata, and i guess should make another parser for it (e.g. LrcMatadataParser
in the LrcParser/Parser/Lrc/Lines/
)
ckecking the lines for TimeTags.Count == 0 and then parsing their text content.
I'm not sure what it's means (sorry for my bad english understanding), but TimeTags
in the Lyric.cs
still needed if lyric contains the A2 extension formant
e.g.
[00:13.34] <00:14.32> Don't <00:14.73> you <00:15.14> want <00:15.57> somebody <00:16.09> to <00:16.46> love
Lyric's start time will be [00:13.34]
And Lyric's first time-tag will be [00:13.34]
+ <00:14.32>
in 2024.728.1
[update]
Wait, i found a bug in this version.
<mm:ss.xx>
seems should be absolute time, not relative time Orz.
[update] https://www.nuget.org/packages/LrcParser New version released (2024.728.2)
I'm not sure what it's means (sorry for my bad english understanding), but TimeTags in the Lyric.cs still needed if lyric contains the A2 extension formant
Yes, that's what I mean with I misunderstood their usage ๐
That's called metadata, and i guess should make another parser for it (e.g. LrcMatadataParser in the LrcParser/Parser/Lrc/Lines/)
Alternatively you could parse it in the existing parser and add a Metadata
object to the parser result in addition to Lyrics
- would preent us from reading the file two times if we want both.
Personally I don't think it's very necessary to parse metadata in the lyrics. We can show them in the lyrics text but we don't need to parse it. Metadata should in the end be embedded in the music file itself rather than being double embedded in the lyrics.
Personally I don't think it's very necessary to parse metadata in the lyrics. We can show them in the lyrics text but we don't need to parse it.
I think the next step might let LrcParser able to prese metadata, for prevent treating metadata as lyric text. As your mention, song info is able to get from the music file itself, so there's no need to show in the lyric again.
In fact, i think it does have its merit to have some metadata shown in the lyrics such as composer, lyrics writer etc. which are typically not part of the music metadata itself.
We may just skip these none timestamp brackets and do not parse them at all.
Personally I don't think it's very necessary to parse metadata in the lyrics. We can show them in the lyrics text but we don't need to parse it. Metadata should in the end be embedded in the music file itself rather than being double embedded in the lyrics.
As your mention, song info is able to get from the music file itself, so there's no need to show in the lyric again.
Usually lyrics are not shared with the associated media file due to copyrights. Adding the metadata to the music file is therefore not really a valid argument. I'd even argue that lyrics without such metadata are kind of useless if different edits of a song exist.
We may just skip these none timestamp brackets and do not parse them at all.
That is already the case for synced lyrics.
When attempting to parse a lyric file that has the following line, the parser failed.
[00:51.00][01:29.99][01:48.29][02:31.00][02:41.99]You gotta fight !
The expected outcome was for the parser to use the text "You gotta fight !" at each of the five time stamps.