Open macrozone opened 2 months ago
Pull requests are welcome
Pull requests are welcome
Hi, this is my first time contributing to Node.js, I will take this issue as my first contribution. I think this is a good starting place.
added additional information around dotenv and the implications of the bug
For reference, here are parsing rules from dotenv project page:
The parsing engine currently supports the following rules:
BASIC=basic
becomes {BASIC: 'basic'}
#
are treated as comments#
marks the beginning of a comment (unless when the value is wrapped in quotes)EMPTY=
becomes {EMPTY: ''}
)JSON={"foo": "bar"}
becomes {JSON:"{\"foo\": \"bar\"}"
)trim
) (FOO= some value
becomes {FOO: 'some value'}
)SINGLE_QUOTE='quoted'
becomes {SINGLE_QUOTE: "quoted"}
)FOO=" some value "
becomes {FOO: ' some value '}
)MULTILINE="new\nline"
becomessource: https://github.com/motdotla/dotenv?tab=readme-ov-file#what-rules-does-the-parsing-engine-follow
There is one more invalid use case:
MP_#CRAZY_COMMENT="2: foo bar\ni am "on" newl\nine, 'yo'"
is parsed into env by Node, but dotenv skips it.
@anonrig @macrozone How strict do we want to be with dotenv compatibility? I've got a working fix that is handling all tests of dotenv. It improves compatibility and simplifies parser, but behaves differently with multiline "" values that have unbalanced ".
Covering all edge-cases of dotenv without using their regexp is pretty crazy.
Covering all edge-cases of dotenv without using their regexp is pretty crazy.
Agreed. We started following dotenv through their tests, but I think it's ok to diverge from there for extreme edge cases. I don't think we need to be 100% compliant with dotenv.
@anonrig @macrozone How strict do we want to be with dotenv compatibility? I've got a working fix that is handling all tests of dotenv. It improves compatibility and simplifies parser, but behaves differently with multiline "" values that have unbalanced ".
Covering all edge-cases of dotenv without using their regexp is pretty crazy.
I actually personally don't care about the compatiblitiy, but at the moment some env var values are absolutly impossible to declare. Namly one that contains: a line break, a double quote a backtick and a single quote. There is no way to declare such a env var.
Allowing to escape quotes would also solve it, but that was attempted and rejected because "its not compatible with dotenv".
I am also fine when it breaks with actual line breaks, since thats also bugged in dotenv. When using quotes you can encode line breaks with \n
.
So if this works, it would be fine for me:
MY_VAR="singlequote: ', double quote: ", a line break: \n(i am on newline) and a backtick: `. that is all i need"
I agree that "a line break, a double quote, and a single quote" should be supported, and it is considered as a bug.
I agree that "a line break, a double quote, and a single quote" should be supported, and it is considered as a bug.
don't forget the backtick 😁 (woops, i forgot also to mentione it above)
Just tried your example:
.env file:
MP_MY_VAR="singlequote: ', double quote: ", a line break: \n(i am on newline) and a backtick: `. that is all i need"
dotenv:
MP_MY_VAR=singlequote: ', double quote: ", a line break:
(i am on newline) and a backtick: `. that is all i need
my changes:
MP_MY_VAR=singlequote: ', double quote: ", a line break:
(i am on newline) and a backtick: `. that is all i need
So it looks like it will handle it exactly like dotenv.
I already had to make it much more complex than needed to handle all other edge cases. It would be such a simple parser if we only had to look for balanced double-quotes and newlines.
don't forget the backtick 😁 (woops, i forgot also to mentione it above)
Now we are getting away from reality, lol. What's the usecase/example of an environment variable that contains all of these characters?
@macrozone Added your case to tests and opened a PR: https://github.com/nodejs/node/pull/54215
Now we are getting away from reality, lol. What's the usecase/example of an environment variable that contains all of these characters?
It should not be node's decision what is an allowed env var value and what not. An env var value is a string and any string should be somehow be encodeable in a .env file.
In my case I am writing tooling that creates those .env files on the fly in a ci/cd pipeline from another store. Those can be any strings and Its hard to mirror arbitrary decisions what are valid strings and what not. This is how I noticed those problems in the first place.
luckily fixing the inner quotes problem solves the issue.
(also in bash its no problem to declare such a variable thanks to escaping
MY_ENV_VAR="singlequote: ', double quote: \", a line break:
(i am on newline) and a backtick: \`. that is all i need" node envtest.js
)
@macrozone Added your case to tests and opened a PR: #54215
thank you so much for your effort!
Version
v20.16.0
Platform
Subsystem
No response
What steps will reproduce the bug?
inspired by this comment: https://github.com/nodejs/node/pull/50814#issuecomment-1817949911
create an
.env
file:test with dotenv:
$ node envtest-dotenv.js
test with
--env-file
$ node --env-file=.env envtest.js
How often does it reproduce? Is there a required condition?
always
What is the expected behavior? Why is that the expected behavior?
outputs should be the same.
What do you see instead?
output are different, node native terminates at the first occurrence of the double quote:
compare that to dotenv:
Additional information
And