UltraStar-Deluxe / format

UltraStar Format Specification
https://usdx.eu/format
MIT License
11 stars 2 forks source link

[Spec] Specify a preferred line ending #43

Closed Tuupertunut closed 6 months ago

Tuupertunut commented 9 months ago

Suggestion

Specify in the Ultrastar format specification whether files should use Unix (LF, \n) or Windows (CRLF, \r\n) line endings.

Use case

The Ultrastar format specification does not specify which line endings a file should use. There are currently tools in the Ultrastar ecosystem that accept or produce only one type of line endings, making them incompatible.

Examples:

Therefore songs created with Ultrastar Play song editor (at least on Linux) cannot be uploaded to USDB because of this.

Extra info/examples/attachments

No response

Baklap4 commented 9 months ago

id love to unify this. But it will be hard to match this up for everything LF would be my preferred way even though im a windows user. As this will work correctly in latest os'ses

bohning commented 9 months ago

I might be wrong, but I think all except usdb can handle both types of line endings. So ideally usdb would be changed to accept both as well, so users can have their native OS line endings and everything works fine and nobody has to worry about line endings. But of course it doesn’t hurt to specify a standard. If LF works for recent windows, then I guess that would be the candidate for the standard, but then all users would have to manually convert to CRLF before adding new songs to usdb.

kamischi commented 9 months ago

my 2 cents:

codello commented 8 months ago

I think in terms of compatibility and convenience it would make sense to allow both line endings and maybe for the sake of robustness also a single \r. I also think it would make sense to issue a recommendation that text files should use \n as line separator for consistency.

Requiring a certain line separator (as in songs that use a different one would not be loaded into a game) would probably lead to a lot of confusion for less technical players.

Baklap4 commented 8 months ago

Hmm from a users perspective i get it that you'd want to have both line-endings.

Most users though do download their files from a server (eg: usdb) and then it should just work out of the box. If the files hosted on usdb always serves lf (or whatever line-ending) and if every creator tool is creating the txt with lf then it's just a matter of time when all of the files are consistent.

txt-hosters should do a conversion to the unified format upon upload so they spread it correctly and that's a pretty minor task

As for line-seperators we do have the - character for the games to see that they should put a line-break to split up the sentences.

codello commented 8 months ago

To me this is kind of a fundamental decision: If we intend for files to be edited by humans, in particular non-technical users, a specific requirement on the line ending would be very detrimental to the experience. If the intention is that the format is used mainly by applications (editors, txt-hosters, etc.) and developers then a defined line ending would probably be advantageous for compatibility (and would simplify the spec a little).

Relying on txt-hosters to do a conversion feels like an invitation to disregard the spec and just do a conversion. I'd much prefer if txt-hosters could in good conscience reject non-compliant txt files.

Baklap4 commented 8 months ago

In some aspect the files can be edited by humans, however this is probably a niche audience. Many endusers just want to sing along and have a fun karaoke night.

Users who go digging into the txt-file itself usually do that through tools like Yass, Creator and other applications. Not by opening notepad and do the changes there, as there's no feedback in if the change worked or not which the other tools do give

I agree with the hosters part, but we don't want to bore creators with details to fix it themselves. They've got better things to do: creating songs ;) As a txt-hosting party it's a fairly easy task to do the conversion if necessairy if all is well it shouldn't have to convert, but if something is off he can rectify it.

RumovZ commented 8 months ago

I don't think it's that rare for users to open a song txt with some editor, and be it just out of curiosity. Let's not speak about the things those programs can then do to an innocent txt! Even Git for Windows has a mess up all my line endings option which is enabled by default!

It's such an easy thing to parse all kinds of line endings without ambiguity. On the other hand, having to deal with line endings manually can be very frustrating. So I strongly vote for recommending LF, but not requiring it exclusively.

codello commented 7 months ago

To me it seems the options seem to be clear so far:

  1. Require line ending to be \n. \r might be recognized as a whitespace character (#46). \r\n would consequently be a whitespace followed by a line ending.
  2. Allow both \n and \r\n as line endings but recommend that only a single \n is used. A single \r could be recognized as whitespace (#46).
  3. Allow all three \n, \r\n, and \r to be recognized as line endings but recommend that only a single \n is used.

Are we ready to vote on this?

marwin89 commented 7 months ago

SUMMARY / VOTE NOW πŸ“ˆ βœ‹

Thanks for the discussion and thanks for the summary @codello. We are close to a final result.

Please vote for this issue till 29th feb 2024 with emoticon πŸŽ‰ / πŸš€ / ❀️ .

Options

πŸš€ A) Require line ending to be \n. \r might be recognized as a whitespace character (#46). \r\n would consequently be a whitespace followed by a line ending. πŸŽ‰ B) Allow both \n and \r\n as line endings but recommend that only a single \n is used. A single \r could be recognized as whitespace (#46). ❀️ C) Allow all three \n, \r\n, and \r to be recognized as line endings but recommend that only a single \n is used.

Tuupertunut commented 7 months ago

Just wanted to point out that the last time \r has been used as a line ending was before 2001 in Mac OS 9, so way before the original Ultrastar was first released. Nowadays it's only \n or \r\n that are used anywhere.

codello commented 7 months ago

Yeah that's true. I don't expect it to make any real difference in practice. The only reason I prefer C over B is that many programming languages recognize all three patterns by default (e.g. Java's BufferedReader, Python's open, C#'s StreamReader, Pascal's TStreamReader) so I expect option C to be easier to implement for most people.

There are of course also examples on the opposite side, e.g. Go's bufio package by default only recognizes \n and \r\n. But since existing projects such as USDX, Vocaluxe and USBD_Syncer use the languages above I think option C will be easier to implement in existing projects.

codello commented 6 months ago

It seems to me there is a majority for option C. I have created a PR (#51) to add this result to the spec.

marwin89 commented 6 months ago

Yes, Result is option C:

Allow all three \n, \r\n, and \r to be recognized as line endings but recommend that only a single \n is used.

Thanks to everyone for contribution. I merged the Pull Request. βœ