nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

--env-file does not support inner quotes (does not behave like dotenv) #54134

Open macrozone opened 2 months ago

macrozone commented 2 months ago

Version

v20.16.0

Platform

Darwin <redacted> 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:14:38 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6020 arm64

Subsystem

No response

What steps will reproduce the bug?

inspired by this comment: https://github.com/nodejs/node/pull/50814#issuecomment-1817949911

  1. create an .env file:

    INNER_QUOTES="1: foo'bar"baz`qux"
    INNER_QUOTES_WITH_NEWLINE="2: foo bar\ni am "on" newline, 'yo'"
  2. test with dotenv:

    // envtest-dotenv.js
    require("dotenv").config();
    console.log(process.env.INNER_QUOTES);
    console.log(process.env.INNER_QUOTES_WITH_NEWLINE);

    $ node envtest-dotenv.js

  3. test with --env-file

    // envtest.js
    console.log(process.env.INNER_QUOTES);
    console.log(process.env.INNER_QUOTES_WITH_NEWLINE);

    $ node --env-file=.env envtest.js

How often does it reproduce? Is there a required condition?

always

What is the expected behavior? Why is that the expected behavior?

outputs should be the same.

// dotenv output:

1: foo'bar"baz`qux
2: foo bar
i am "on" newline, 'yo'

What do you see instead?

The outputs are different: Node's native parser terminates the value at the first inner double quote:

// --env-file output
1: foo'bar
2: foo bar
i am

compare that to dotenv:

1: foo'bar"baz`qux
2: foo bar
i am "on" newline, 'yo'
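The difference can be illustrated with a minimal sketch. Both helpers below are hypothetical and written only for this illustration; neither is Node's nor dotenv's actual parser. The sketch assumes the raw value starts with a double quote and sits on a single line, and it contrasts ending the value at the first inner double quote with ending it at the last one.

```javascript
// Hypothetical helpers for illustration only; not Node's or dotenv's real code.

// Ends the value at the FIRST double quote after the opening one.
function untilFirstQuote(raw) {
  const end = raw.indexOf('"', 1);
  return end === -1 ? raw : raw.slice(1, end);
}

// Ends the value at the LAST double quote on the line, so inner quotes survive.
function untilLastQuote(raw) {
  const end = raw.lastIndexOf('"');
  return end <= 0 ? raw : raw.slice(1, end);
}

// Expands the two-character sequence \n into a real line break.
const expandNewlines = (s) => s.replace(/\\n/g, "\n");

// The raw text after "INNER_QUOTES_WITH_NEWLINE=" in the .env file above.
const raw = String.raw`"2: foo bar\ni am "on" newline, 'yo'"`;

console.log(expandNewlines(untilFirstQuote(raw)));
console.log(expandNewlines(untilLastQuote(raw)));
```

The first call cuts the value off before `on`, which matches the truncated output reported above; the second keeps the inner quotes, as dotenv does for this input.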

Additional information

And

anonrig commented 2 months ago

Pull requests are welcome

YutongZhuu commented 2 months ago

> Pull requests are welcome

Hi, this is my first time contributing to Node.js, and I'd like to take this issue as my first contribution. I think this is a good starting place.

macrozone commented 2 months ago

added additional information around dotenv and the implications of the bug

marekpiechut commented 2 months ago

For reference, here are parsing rules from dotenv project page:

What rules does the parsing engine follow?

The parsing engine currently supports the following rules:

source: https://github.com/motdotla/dotenv?tab=readme-ov-file#what-rules-does-the-parsing-engine-follow

marekpiechut commented 2 months ago

There is one more invalid use case:

MP_#CRAZY_COMMENT="2: foo bar\ni am "on" newl\nine, 'yo'"

is parsed into env by Node, but dotenv skips it.
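One way to skip such a line is to validate the key before accepting it. The sketch below is a hypothetical helper, not code from Node or dotenv. It assumes that a valid key consists only of word characters, dots, and hyphens; that character set is an assumption made for this example and is not stated in this thread.

```javascript
// Assumption: keys may contain only word characters, dots, and hyphens.
const KEY_PATTERN = /^[\w.-]+$/;

// Returns the key of a KEY=value line, or null if the line should be skipped.
function extractKey(line) {
  const eq = line.indexOf("=");
  if (eq === -1) return null; // no assignment on this line
  const key = line.slice(0, eq).trim();
  return KEY_PATTERN.test(key) ? key : null;
}

console.log(extractKey('MP_#CRAZY_COMMENT="2: foo bar"')); // key contains '#'
console.log(extractKey('MP_MY_VAR="some value"'));
```

Under that assumption, the line with `#` in its key yields `null` and would be dropped instead of ending up in the environment.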

marekpiechut commented 2 months ago

@anonrig @macrozone How strict do we want to be about dotenv compatibility? I've got a working fix that passes all of dotenv's tests. It improves compatibility and simplifies the parser, but it behaves differently for multiline "" values that have an unbalanced ".

Covering all edge-cases of dotenv without using their regexp is pretty crazy.

anonrig commented 2 months ago

> Covering all edge-cases of dotenv without using their regexp is pretty crazy.

Agreed. We started following dotenv through their tests, but I think it's ok to diverge from there for extreme edge cases. I don't think we need to be 100% compliant with dotenv.

macrozone commented 2 months ago

> @anonrig @macrozone How strict do we want to be about dotenv compatibility? I've got a working fix that passes all of dotenv's tests. It improves compatibility and simplifies the parser, but it behaves differently for multiline "" values that have an unbalanced ".
>
> Covering all edge-cases of dotenv without using their regexp is pretty crazy.

Personally, I don't care much about compatibility, but at the moment some env var values are absolutely impossible to declare, namely any value that contains a line break, a double quote, a backtick, and a single quote. There is no way to declare such an env var.

Allowing quotes to be escaped would also solve it, but that was attempted and rejected because "it's not compatible with dotenv".

I am also fine if it breaks with actual line breaks, since that's also buggy in dotenv. When using quotes, you can encode line breaks with \n.

So if this works, it would be fine for me:

MY_VAR="singlequote: ', double quote: ", a line break: \n(i am on newline) and a backtick: `. that is all i need"
anonrig commented 2 months ago

I agree that "a line break, a double quote, and a single quote" should be supported, and this is considered a bug.

macrozone commented 2 months ago

> I agree that "a line break, a double quote, and a single quote" should be supported, and this is considered a bug.

Don't forget the backtick 😁 (whoops, I also forgot to mention it above)

marekpiechut commented 2 months ago

Just tried your example:

.env file:

MP_MY_VAR="singlequote: ', double quote: ", a line break: \n(i am on newline) and a backtick: `. that is all i need"

dotenv:

MP_MY_VAR=singlequote: ', double quote: ", a line break: 
(i am on newline) and a backtick: `. that is all i need

my changes:

MP_MY_VAR=singlequote: ', double quote: ", a line break: 
(i am on newline) and a backtick: `. that is all i need

So it looks like it will handle it exactly like dotenv.

I already had to make it much more complex than needed to handle all other edge cases. It would be such a simple parser if we only had to look for balanced double-quotes and newlines.

anonrig commented 2 months ago

> Don't forget the backtick 😁 (whoops, I also forgot to mention it above)

Now we are getting away from reality, lol. What's the use case/example of an environment variable that contains all of these characters?

marekpiechut commented 2 months ago

@macrozone Added your case to tests and opened a PR: https://github.com/nodejs/node/pull/54215

macrozone commented 2 months ago

> Now we are getting away from reality, lol. What's the use case/example of an environment variable that contains all of these characters?

It should not be Node's decision which env var values are allowed and which are not. An env var value is a string, and any string should somehow be encodable in a .env file.

In my case, I am writing tooling that creates those .env files on the fly in a CI/CD pipeline from another store. The values can be any strings, and it's hard to mirror arbitrary decisions about which strings are valid and which are not. This is how I noticed these problems in the first place.

Luckily, fixing the inner quotes problem solves the issue.

(Also, in bash it's no problem to declare such a variable, thanks to escaping:

MY_ENV_VAR="singlequote: ', double quote: \", a line break: 
(i am on newline) and a backtick: \`. that is all i need" node envtest.js

)
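For tooling that generates .env files, the encoding side could look roughly like the sketch below. `toEnvLine` is a hypothetical helper, not part of Node or dotenv. It assumes the reading parser strips only the outermost pair of double quotes and expands `\n` inside them, as in the outputs shown earlier in this thread. Values that already contain a literal backslash followed by `n` cannot round-trip under that assumption, so the sketch rejects them; other special cases (for example `#` inside a value) are not covered.

```javascript
// Hypothetical helper for tooling that writes .env files.
// Assumption: the reader strips the outermost double quotes and expands \n.
function toEnvLine(key, value) {
  // A literal backslash + "n" would be turned into a line break on read.
  if (/\\n/.test(value)) {
    throw new Error(`value for ${key} contains a literal "\\n" sequence`);
  }
  // Encode real line breaks as the two-character sequence \n.
  const encoded = value.replace(/\r?\n/g, "\\n");
  return `${key}="${encoded}"`;
}

const value =
  "singlequote: ', double quote: \", a line break: \n" +
  "(i am on newline) and a backtick: `. that is all i need";

console.log(toEnvLine("MY_VAR", value));
```

For the value used in this thread, the helper produces the same single-line `MY_VAR="..."` entry that was proposed above.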

macrozone commented 2 months ago

> @macrozone Added your case to tests and opened a PR: #54215

thank you so much for your effort!