vmware-archive / kubecfg

A tool for managing complex enterprise Kubernetes environments as code.
Apache License 2.0
728 stars 62 forks source link

Correct importer treatment of extvars and TLAs loaded from files #266

Closed seh closed 4 years ago

seh commented 4 years ago

Here we correct a few capabilities exposed by the recent #265:

NB: At the moment, the new test function TestShowUsingExtVarFiles fails due to some interference with the use of environment variables in the sibling TestShow function. I haven't figured out why that's so yet.

seh commented 4 years ago

Ah, the problem is that we're using the RootCmd value several times, but it mutates persistent values to store previously set flags. That's going to be ugly to fix. Perhaps creating a fresh command each time is feasible. I'll look into it more tomorrow morning.

seh commented 4 years ago

Perhaps we could change the arg file names to absolute form as part of building the import statements (and then only attempt to parse importedFrom if the file arg is relative)?

Do I understand you correctly that you'd prefer to resolve the file path against the current working directory when parsing the flags, so that we wind up with an absolute path from the start?

If so, I'll try that tomorrow and see how it goes.

Also, if I'm reading it right, the Go Jsonnet port bypasses explicit treatment of these strings by relying on path.Split returning an empty directory path for these special "importedFrom" values. Since path.Split returns an empty directory, path.Join winds up returning the imported path as is later in (*FileImporter).tryPath, which isn't absolute, despite the variable name "absPath." That's sort of lazy, and sort of clever, if that's really what's going on. @sbarzowski , is that a proper summary of how it works?

sbarzowski commented 4 years ago

@seh Oh, I see what is going on. Basically the fact that your importer has to care at all about dummy locations for extVars/TLAs is a bug in go-jsonnet. These "sentinel values" were just meant to be displayed in a stack trace if necessary. I need to look a bit more at it.

This uncovers two bugs actually: 1) Using import in extVars/TLAs uses some arbitrary "readable" dummy location as importedFrom. Do you have any thoughts on what the correct one should we determine the correct importedFrom? E.g. in the context of filesystem imports or your URL importing. 2) ExtVars/TLAs also happen to be processed using the same mechanism as an import, so these dummy locations are also used for caching in importCache. This is completely unnecessary and easy to fix.

seh commented 4 years ago

These "sentinel values" were just meant to be displayed in a stack trace if necessary. I need to look a bit more at it.

They do make their way down into calls on the Importer. If the Importer happens to ignore them successfully, there's no harm in doing so. However, in our case here, it has proved to be difficult to ignore them entirely.

Do you have any thoughts on what the correct one should we determine the correct importedFrom?

Angus's suggestion above is an adaptation of what I implemented yesterday—namely, for these imports, use the process's current working directory as the "importedFrom" directory, in order to turn a relative file path into an absolute path fit for recording (temporarily) in a URL.

It seems that with just these two importers as examples—the one in the Jsonnet Go port, and the universalImporter type here—we see that in both cases, they wind up relying on the "importedFrom" parameter to supply context missing from the "importPath" parameter. I haven't thought through this problem for long enough, only having bumped into it yesterday, but if we had the freedom to change the Importer.Import signature, I'd consider adding a third parameter providing this context. That is, the "importedFrom" parameter would stay, and the current argument values passed to it would remain the same (useful for diagnostic statements), but there would be a third "importContext" parameter that would supply the current working directory for extVars and TLAs, or the containing directory of a file nominated in "importedFrom."

I'm less sure how to accommodate URL-based importers, though; is it always the case that a "file" nominated on the command line for, say, --ext-code-file is a file path, or does it just need to be some opaque string that an Importer can consume? If it's the latter, we can't be sure that the current working directory is meaningful context for such a pseudo-file specifier.

  1. ExtVars/TLAs also happen to be processed using the same mechanism as an import, so these dummy locations are also used for caching in importCache. This is completely unnecessary and easy to fix.

I take it that's an optimization, but that the current caching behavior is still correct.

sbarzowski commented 4 years ago

I take it that's an optimization, but that the current caching behavior is still correct.

Well, these locations are used as caching keys, so if someone has a file at the absolute location "" and an extvar foo, trying to import that file will result in getting the extvar instead. Unlikely to happen, but still wrong. With FileImporter it's not possible at all, because the absolute paths cannot have this form, but some custom importer could allow that.

sbarzowski commented 4 years ago

is it always the case that a "file" nominated on the command line for, say, --ext-code-file is a file path, or does it just need to be some opaque string that an Importer can consume? If it's the latter, we can't be sure that the current working directory is meaningful context for such a pseudo-file specifier.

The API does not have any special notion of "file extVars". The option --ext-code-file var=file.jsonnet is just a shortcut for doing --ext-code var="import file.jsonnet". It is supposed to be an opaque string that an Importer can consume, but we don't need to worry about it, for two reasons. 1) It's basically an import like any other. 2) These options only concern the command and it's currently not possible to use a custom importer with the command anyway.

seh commented 4 years ago

These options only concern the command and it's currently not possible to use a custom importer with the command anyway.

Ah, but here in kubecfg, we did just recently add these same command-line flags, and we do have a custom Importer. Hence the struggle to implement them together correctly.

sbarzowski commented 4 years ago

Ah, but here in kubecfg, we did just recently add these same command-line flags, and we do have a custom Importer.

I see. It seems that it is implemented by putting the import there just like in the interpreter. So it will expect the opaque-string-for-importer, the way it is implemented. This is a good thing, because then if you do --ext-code-file var=file.jsonnet, and file.jsonnet contains imports, we know the importedFrom for them (since file.jsonnet is imported like anything else). It also means that the commandline has the full power of the importer, so if your importer accepts urls, you can use them in these options as well.

So it seems to me, that the main problem is handling of --ext-code="import something" (regular extvar, which imports something). Once we have that all should fall into place.

sbarzowski commented 4 years ago

What if we just passed "" (an empty string) as importedFrom in such situations (extVars, probably also code that is provided as commandline? Then it will be the responsibility of the importer to decide what starting point to use (or whether to do a relative lookup at all). It seems less opinionated than passing cwd explicitly (the importers may use something very different from regular paths).

What do you think?

seh commented 4 years ago

Well, getting the tests in file _showtest.go required hacking at the FlagSet in the "RootCmd" cobra.Command, which in turn required upgrading the pflag dependency. I hope that's an acceptable sacrifice, @mkmik and @anguslees.

seh commented 4 years ago

What if we just passed "" (an empty string) as importedFrom in such situations (extVars, probably also code that is provided as commandline? Then it will be the responsibility of the importer to decide what starting point to use (or whether to do a relative lookup at all). It seems less opinionated than passing cwd explicitly (the importers may use something very different from regular paths).

That would get us back to what I implemented in commit 2780af471eba78ee780b71e26258a6bdd16c334e, but instead of using that regular expression to match "importedFrom" for things like "\<top-level-arg:...>," we'd instead look for the sentinel empty string. If the "importedFrom" parameter value is empty like that, we'd know that we need to fall back to using our default base (the current working directory for files, at least).

Per @anguslees's suggestion, in commit 00054cce25f209ef879214a55e8537e947d7e26d I moved that treatment out of the Importer up to what's effectively the program's main function. It works, but I'd rather isolate this behavior in the Importer. I compromised in my earlier pass by having our custom Importer accept a constructor parameter to specify the base URL to use as a fallback, so that the decision to use the process's current working directory was still made outside of the Importer.

sbarzowski commented 4 years ago

instead of using that regular expression to match "importedFrom" for things like "," we'd instead look for the sentinel empty string

Yes. The important difference is that then it would be relying on a documented behavior. And the decision what "base" to use needs to be made somewhere. Importer seems like the right place to me.

Per @anguslees's suggestion, in commit 00054cc I moved that treatment out of the Importer up to what's effectively the program's main function.

This is quite an effective workaround, but there are two small problems: 1) If a user of kubecfg uses --ext-code=import sth.jsonnet, they will still be affected (unless they too are going to use only absolute paths). 2) I would like to actually fix the go-jsonnet API here :-)

seh commented 4 years ago
  1. If a user of kubecfg uses --ext-code=import sth.jsonnet, they will still be affected (unless they too are going to use only absolute paths).

There, we accept sth.jsonnet as a relative file path, but then use the current working directory to turn that into an absolute file path and frame it as a "file"-scheme URL for use by the Importer.

  1. I would like to actually fix the go-jsonnet API here :-)

Oh, that would be wonderful too. I'm not arguing against that. I just don't expect that I can wait long enough for that change to be released, so I'm trying to get this change approved now, with what we understand to be true today.

If this project can later upgrade its dependency on a go-jsonnet version that passes those empty strings through, I'd propose going back to an Importer implementation more like what I did in commit 2780af471eba78ee780b71e26258a6bdd16c334e.

sbarzowski commented 4 years ago

There, we accept sth.jsonnet as a relative file path, but then use the current working directory to turn that into an absolute file path and frame it as a "file"-scheme URL for use by the Importer.

I see that you do that if the user passes --ext-code-file var=..., but the user can also pass --ext-code var=.... Then you just get some code which may contain imports. To make the paths absolute there, you would need to traverse the AST (or handle it on the side of importer). I don't think it's a big problem in practice (who passes a complex piece of Jsonnet with imports as an extvar directly as a commandline argument?).

seh commented 4 years ago

the user can also pass --ext-code var=.... Then you just get some code which may contain imports. To make the paths absolute there, you would need to traverse the AST (or handle it on the side of importer).

Oh, good point! In that case, the "importedFrom" value would once again be one of these "\<extvar:...>" placeholders, and our Importer would reject a non-URL "importedPath" argument.

seh commented 4 years ago

I would like to actually fix the go-jsonnet API here :-)

@sbarzowski, do you intend to pursue this soon? Assuming that we won't see an implementation or release within a few days, @anguslees, are you willing to proceed with an intermediate solution here, with the intention to come back through and update it to play along with a change in the Jsonnet Importer interface contract?

mkmik commented 4 years ago

FWIW I'm fine with an intermediate solution

sbarzowski commented 4 years ago

@seh I want to fix the API soon, in a few days. I don't have any specific timeline for the release, but we had one quite recently on the 16th, so I would rather avoid rushing another one.

seh commented 4 years ago

Thank you for the update. Given that, are there any further suggestions from reviewers for what I should change here? I expect the most controversial part is resetting the flags on cmd.RootCmd in the unit test.

Once this one merges, I intend to submit an anticipatory patch for consideration that expects Importer.Import to receive an empty "importedFrom" argument. We'd have to coordinate that change with @sbarzowski's proposed change to Jsonnet.

seh commented 4 years ago

@mkmik, I've been testing a locally-built kubecfg with these features, and it's working well. Could we push a fresh release that includes this patch? If you'd like me to get more involved to be able to issue such releases for the project, let me know.

seh commented 4 years ago

Once this one merges, I intend to submit an anticipatory patch for consideration that expects Importer.Import to receive an empty "importedFrom" argument.

See #271.

seh commented 4 years ago

What if we just passed "" (an empty string) as importedFrom in such situations (extVars, probably also code that is provided as commandline?

See google/go-jsonnet#329 for a request for establishing this contract.

seh commented 4 years ago

See google/go-jsonnet#447 for further movement on this front.