LeopoldArkham / Molten

[WIP] Molten - Style-preserving TOML parser.
Apache License 2.0
40 stars 8 forks source link

newlines not reproduced correctly #32

Closed ehuss closed 6 years ago

ehuss commented 6 years ago

When serializing, it currently assumes that Windows is CRLF. However, I never use that style of newlines on Windows, and thus the tests are currently failing. It should instead capture the newline style while parsing, and ignore the current platform.

jeremydavis519 commented 6 years ago

I'm pretty sure this is fixed now. I haven't tested it in Windows, but I've seen that the newline string (::NL) is unconditionally set to "\n", whereas it used to be conditional.

ehuss commented 6 years ago

NL is conditional based on platform in lib.rs.

jeremydavis519 commented 6 years ago

It was, but unless I'm really mistaken, that condition has been removed. It's now assigned like this in the master branch, with no condition in sight (and, in particular, no #[cfg(windows)]):

/// Line terminator constant.
pub const NL: &str = "\n";
LeopoldArkham commented 6 years ago

I put the conditional definitions of newlines back in a recent commit;

@ehuss, I think most people conform to their OS default, so I don't think this is worth the extra complexity right now. I think you can use .gitattributes to have git use the kind of newlines you want in this repository though.

jeremydavis519 commented 6 years ago

@LeopoldArkham As I understand it, this isn't actually an issue of conforming to one's OS default. The tests require that each output file be exactly the same as the input file, including whether it uses \n or \r\n for any given line break. We're currently using string slices from the original file to represent the strings, so that

[a]
[b]

is represented differently internally as either [a]\n[b] or [a]\r\n[b], depending on how it is stored in the input file. And if we want to perfectly preserve the format that the file's author used, we need to preserve the input file's newline style, regardless of the OS default. And that is indeed what the code does currently, but the tests will fail in Windows if using ::NL means that a carriage return should be added even if it wasn't in the input file.

(So, in the example above, if the file is stored as [a]\n[b], then checking for format!("[a]{}[b]", ::NL) means checking for [a]\r\n[b] in Windows, which isn't the same as the input. And the same issue occurs if the file is stored as [a]\r\n[b] and we check for format!("[a]{}[b]", ::NL) in Linux.)

The TOML spec allows a line break to be interpreted in its OS-specific form, but that shouldn't be involved in the reconstruction tests, where we only care about the file-specific form. If the conditional definition of the line break string is going to be used, it must only be used in the API, when the value of a string is to be retrieved.

LeopoldArkham commented 6 years ago

Yes but git automatically changes newlines to the OS standard when you pull/clone. That's why the reconstruction tests pass on appveyor (Linux) even though the original files were created on windows, where they pass as well. Unless I misunderstand, the tests should pass on everyone's machines under the current system, right?

jeremydavis519 commented 6 years ago

Oh, I didn't know that Git did that. If that's the case, then yes, the tests should pass with the conditional definitions. I'm using Windows right now, so I might be able to verify it tonight. I'll let you know if the tests fail.

Edit: I've just confirmed that the tests do pass in Windows now, with the OS-specific line endings, as long as you actually clone the repository. So @LeopoldArkham was right to close the issue. Downloading the repository as a ZIP file leaves the line breaks as \n, which I'm sure I could have found out by reading it somewhere.