Switch to PEX for lockfile generation

Eric-Arellano commented 2 years ago

See https://github.com/pantsbuild/pants/pull/13962 for an example and https://github.com/pantsbuild/pants/pull/13963 for prework. Blocked on https://github.com/pantsbuild/pex/issues/1402.

We have to figure out if we want to keep Poetry for now for generating tool lockfiles, but deprecated. Poetry is broken in that [python-repos] does not work, and we also leak the cache globally. So there's motivation to eagerly switch. But, the resolver will change which could break some users, and the lockfile contents have a different format (new lockfile is more useful).

User lockfiles should always use Pex because that's a new feature. It's only tool lockfiles where we need to worry about backwards compatibility.

Eric-Arellano commented 2 years ago

Another big decision is if we want to save on-disk the lockfiles in PEX's JSON format vs using pex3 lock export to use requirements.txt.

Imo, the main benefit of exporting is interoperability with the greater Python ecosystem—requirements.txt is the lingua franca until we have PEP 665. We value interoperability because it facilitates incremental adoption. However, we could get that via ./pants export or ./pants generate-lockfiles --export - that's one extra manual step anytime you change requirements, but probably would be acceptable.

The primary benefit of using PEX's format seems to be that it can store much richer metadata. This may allow us to work around problems like https://github.com/pantsbuild/pants/issues/13965 more easily.

cc @jsirois (this conversation is broken out of https://github.com/pantsbuild/pants/issues/13965#issuecomment-999936114)

jsirois commented 2 years ago

I'm continually flummoxed here. Saving the native format and then exporting from that format only when needed is a thing.

jsirois commented 2 years ago

Or doing both pre-emptively if you want to save a step!

jsirois commented 2 years ago

Imo, the main benefit of exporting is interoperability with the greater Python ecosystem—requirements.txt is the lingua franca until we have PEP 665.

Can you do a bit more work here? So, cases:

Team has poetry / pyproject.toml dependencies and no requirements.txt.
Team has poetry / pyproject.toml dependencies and they do manually export a requirements.txt already for interoperability.
Team does not use poetry but they do use requirements.txt already.

How is any of these 3 scenarios affected by Pants storing its lock file in format foo? Afaict none of them are. Maybe there are 4th and higher scenarios I'm missing?

Eric-Arellano commented 2 years ago

Team does not use poetry but they do use requirements.txt already.

We encourage users to only declare direct deps, not transitive deps. From that, Pants will generate a lockfile for you. If we generate the lockfile in requirements-txt format, then you can now do things like point your IDE to the lockfile rather than the less precise input requirements.txt. Whereas if we store the lockfile with Pex format, the IDE won't understand it.

That is, the greater ecosystem is consuming the generated lockfile, rather than whatever the original input was to create it.

jsirois commented 2 years ago

Sure. My point is this: are we adding extra steps in any scenarios? I think the answer is no. We can emit a lockfile of random format as well as a (potentially lossy) requirements.txt for external tooling use in 1 step. Do you agree this is true and do you see any downsides?

Eric-Arellano commented 2 years ago

are we adding extra steps in any scenarios? I think the answer is no. We can emit a lockfile of random format as well as a (potentially lossy) requirements.txt for external tooling use in 1 step. Do you agree this is true and do you see any downsides?

Depends on what running generate-lockfiles does by defualt.

only uses requirements.txt-format: no extra step
only uses PEX format: you'll have an extra step to run ./pants generate-lockfiles --export
uses both formats: no extra step, but now 2 files describing the same thing

I'm not saying we must use requirements.txt format by default - it depends how much we value saving a step vs the downsides. I'm only pointing out it's not obvious to me which format to use by default.

stuhood commented 2 years ago

I'm not saying we must use requirements.txt format by default - it depends how much we value saving a step vs the downsides. I'm only pointing out it's not obvious to me which format to use by default.

There are some benefits to formats that simplify installers: see the discussion starting here. If the lockfile format actually contains the dependency graph, then wheel selection becomes a lot cheaper (similar to PEX subsetting, afaict from @jsirois's explanations).

jsirois commented 2 years ago

There are some benefits to formats that simplify installers

This is true but irrelevant to my point.

uses both formats: no extra step, but now 2 files describing the same thing

This is a poor argument in my opinion. Are two files really such a problem? Are they really that hard to explain or grok? AKA: Are Poetry users that need compatibility with a poetry export really angry at this asking why did the Poetry authors make such a bad decision as to lock in not-requirement.txt format?

Bottom line, compatbility is important, but a non-lossy format is required. I think its easy to explain "Turn on pants-option-foo to get your lock mirrored in a format other tools understand. The other option is to embed the non-lossy data in a json comment as is currently done for lockfiles.

Eric-Arellano commented 2 years ago

To clarify, I'm talking about the default behavior for Pants. It would be noisy for us to output both Pex format + requirements.txt by default. Not at all annoying if you set an option to request that, which is why no one complains about Poetry.

but a non-lossy format is required.

This was the point I was missing before. And I see why I was not clear on this: currently, PEX consumes Poetry tool "lockfiles" via -r lock.txt --intransitive. It sounds like we are planning on Pex consuming native Pex lockfiles, rather than using -r lock.txt, correct? Which makes sense.

In summary: it makes sense to use PEX format by default. We can also choose to expose some export mechanism.

I'll think more about how to handle deprecating tool lockfiles using a new format - that's pretty solvable.

jsirois commented 2 years ago

This was the point I was missing before. And I see why I was not clear on this: currently, PEX consumes Poetry tool "lockfiles" via -r lock.txt --intransitive. It sounds like we are planning on Pex consuming native Pex lockfiles, rather than using -r lock.txt, correct? Which makes sense.

Correct.

I think your error here was "Pex consumes" - it's Pip that consumes. As a result we get the lock security and stability properties of pinned versions and checked hashes but we do not get the lock consume-side performance; ie: we do not get parallel downloads of dists with 0 dependency checking. To get that you need to write down all the URLs (for fetching) + wheel names (for tag matching). To further get subsetting perf - i.e. non-full-downloads, you need to write down the Requires-Dist of each locked distribution, etc so this can all be evaluated to form a subset of urls to then download in parallel. This is not to mention the whole problem of VCS and local project requirements. Those cannot be represented in requirements.txt format today with Pip as the consumer.

Eric-Arellano commented 2 years ago

Pex now supports consuming Pex JSON lockfiles https://github.com/pantsbuild/pex/pull/1654, so we are unblocked from switching from Poetry to Pex for lockfile generation & consumption.

This will be a soft launch in Pants 2.11, with post-2.11 work to support locking of VCS requirements, along with making locks more robust overall given how complex Python packaging is.

Given the soft launch, three safety valves:

Poetry lockfile generation stays as an option.
An option I added last week [python].resolve_generate_lockfiles to allow you to manually manage your user lockfiles
We continue to not require using [python].resolves or [tool].lockfile.
- I'm not sure though if we should deprecate [python].requirement_constraints

Implementation

Generation

We already support Pex for lockfile generation.

https://github.com/pantsbuild/pants/blob/6ba05e3ceb585b0ff52575924fb423ad0e1a2738/src/python/pants/backend/python/goals/lockfile.py#L142-L146

All we need to do is figure out where to put the option toggling between Pex and Poetry, and then wire it up in the rest of lockfile.py. Probably [python].lockfile_generator = {'pex', 'poetry'}? In 2.12, we can hopefully deprecate to always use Pex.

We do need to decide if we default to Pex or Poetry in 2.11.

Consumption

I think that we need to support mixed mode, where some lockfiles can be Pex JSON and some requirements.txt style a la Poetry. Why? We want Pants's default tool lockfiles to be Pex based, but let the user use what they prefer.

That suggests to me that we should decide the lockfile consumption strategy via inspection of the lock. Is it JSON? Use Pex. Otherwise, use normal requirements.txt. We have not yet figured out the recommend naming scheme for Pex lockfiles, e.g. isort.json.lock: one naive way of implementing this detection is to look for json in the string. But it is possible for users to rename the file to whatever they want, so this is likely not reliable.

I believe that we only then need to update this code block to detect the lockfile format and decide whether to use --lock or --requirement

https://github.com/pantsbuild/pants/blob/6ba05e3ceb585b0ff52575924fb423ad0e1a2738/src/python/pants/backend/python/util_rules/pex.py#L410-L442

Note that per https://github.com/pantsbuild/pants/issues/14281, we will need to strip Pants's lockfile header from the JSON lock because the comments cause the JSON to be invalid.

thejcannon commented 2 years ago

We do need to decide if we default to Pex or Poetry in 2.11.

If 2.11 is a soft-launch, sounds like poetry is the answer :wink: Better UX, IMO.

That suggests to me that we should decide the lockfile consumption strategy via inspection of the lock

That sounds kinda brittle. Would it make more sense for this to come from the config and 1) have a sane default (likely the lockfile generator) + 2) Allow users to override. All in all I think the addition of one more config value is worth the simplicity and absolute correctness.

Note that per https://github.com/pantsbuild/pants/issues/14281, we will need to strip Pants's lockfile header from the JSON lock because the comments cause the JSON to be invalid.

Does this mean that Pants won't have the ability to know if the lockfile is "out of date"? If so, can we insert metadata or use a permissive format (perhaps jsonc https://code.visualstudio.com/docs/languages/json#_json-with-comments?) and then convert to raw JSON for PEX's consumption

Eric-Arellano commented 2 years ago

Better UX, IMO.

I'm not convinced on that. Poetry does not support [python-repos], has the issue with transitive deps & env markers that is really confusing to workaround, and it will have worse performance when installing the lock.

The only reason I'm not certain we should do Pex by default is it's a new technology in a very complex ecosystem: it's likely there will be issues.

I'm leaning much more towards Pex being the default, and then we have a safety valve of falling back to Poetry. Maybe we wait for a week's worth of dogfooding Poetry before changing.

That sounds kinda brittle. Would it make more sense for this to come from the config

I don't love it either. I don't think config is super viable though because it presumes that users are regenerating every single lock after they change an option like [python].lockfile_generator. There is no guarantee of that.

We can't rely on lockfile header telling us what to do because the header might not be there. Unless we do some complex scheme like assuming if the header is missing, it must not be PEX JSON, else look up metadata..But that breaks the spirit of [python].invalid_lockfile_behavior = ignore. Unclear if that option should exist if we decide to get rid of manually managed lockfiles, but in the meantime I think we want to be consistent with it.

Does this mean that Pants won't have the ability to know if the lockfile is "out of date"?

No because Pants will first parse and validate the header, and only then strip it right before passing to Pex.

jsirois commented 2 years ago

UX is important, but UX aside, Poetry is broken for users of Pants generally. It does work for cases that don't need certain aspects of [python-repos] as Eric Points out, or any of the other myriad corners Pip (and thus Pex) support. Apparently psycopg, for example can't be used properly unless your lock will never contain a wheel for it. You can do that with Pip / Pex by passing --no-binary psycopg but you can't do that with Poetry. So, hopefully we can all agree having something that actually works trumps UX. Apparent good UX built on top of a broken foundation just means you'll have a pleasant walk down the lane until you run into the Ogre, at which point you slam into reality and are fully without options. I think the only complicating factor here is that we released with Poetry and have tangles to deal with as a result - but only due to that, not due to Poetry actually being a viable solution here at all.

thejcannon commented 2 years ago

I should've been explicit. By better UX I meant a better experience for the users in not changing the default generator to one that hasn't been battle tested. Agreed PEX's lockfile experience is better overall.

thejcannon commented 2 years ago

I don't love it either. I don't think config is super viable though because it presumes that users are regenerating every single lock after they change an option like [python].lockfile_generator. There is no guarantee of that.

What's the utility of changing the value if it isn't being used? You could maybe argue that future lockfiles would use the new format, but maintaining both kinda seems niche.

You might be able to solve this problem (and I'd argue might be good UX anyways) to expose an easy way to convert my current txt file into PEX JSON while strictly maintaining pinned versions. If that existed the barrier for entry would be substantially lower, and wanting to maintain both Pip txt and PEX JSON can be a non-starter.

Eric-Arellano commented 2 years ago

What's the utility of changing the value if it isn't being used?

I can think of three instances:

Using Pantsbuild's default tool lockfiles that use Pex format, but using Poetry for yourself. (Although we could hack around this)
You want to roll out the upgrade incrementally. For example, only switch tool lockfiles for now because lower risk. Note that users can dynamically change the option, e.g. a generate_lockfiles script that sets the CLI flag
You update but simply don't realize you needed to regenerate.

We could eagerly error with something like scenario 3, but note we have the same problem! How do we know in pex.py if it's a Pex lockfile or requirements.txt style lockfile?

thejcannon commented 2 years ago

Scenario 1. isn't quite the same because it's not a resolve, right? It's not hard to define in each tool which generator is being used for the lockfile (if any). The hard part is resolves are treated by Pants as being homogeneous (as evidenced by resolves_generate_lockfiles being a scalar).

Scenario 2 also involves only tool lockfiles. But we can pretend your example was incremental in the repo's resolves. In that case there should be no danger in switching file formats if the resulting venv has the exact same versions, right? And that can be accomplished with a migration script.

Scenario 3 is easily caught by CI either erroring/warning (depending on invalid_lockfile_behavior or crashing-and-burning-but-with-a-helpful-message if you set it to ignore).

thejcannon commented 2 years ago

From "offline" discussion the strategy for each lockfile will be:

See about using jsonc for the lockfile. (@jsirois thoughts on having PEX allow jsonc format? You might not need a vendored dep to support this, as I would imagine stripping C-style comments is reasonably straightforward.)
(assuming PEX doesn't support jsonc), try and strip C-style header
Regardless of pass-or-fail for step 1. try and parse as JSON
If that succeeds, must be PEX json, otherwise must be reqs.txt :tada:

jsirois commented 2 years ago

I'm a bit dumbfounded on the whole lock format business still being a thing. What about package.json when we support js? Pants should generally not care or know the lock file format regardless of language.

thejcannon commented 2 years ago

You have a very fair point. I believe the necessity for the "header" comes from wanting to ensure the lock is "up to date". Otherwise the only need is for Pants to know which implementation to use.

I see that the PEX lockfile has contraints and requirements. in it. @jsirois is there a way for Pants to ask PEX (because we shouldn't be parsing the JSON ourselves):

"Is this a valid PEX lockfile?"
"Is this lockfile valid for these constraints?"

if PEX can answer those questions, all assumptions are then gone.

jsirois commented 2 years ago

Pex could definitely answer those questions. That said, Pex, Python, JVM, Coursier - everything aside!, Pants can generally not rely on this nor should it. Pants can really only insert its header with state and unconditionally strip it before calling the underlying tool. Is that not true and sane?

jsirois commented 2 years ago

Stepping back, Pants is really punting on the fact it has no permanent shared storage and its not being bold enough to commit to one. Instead, its hitching a ride on git and tacking its data on a file already globally stored via git. Another option would be to commit to its own storage - aka, seperate files. Eric already throught-explored this terrain and rejected IIRC. So if the remaining option is piggy-backing, it truly should be piggy-backing and the body content should be treated as binary and opaque.

thejcannon commented 2 years ago

I think there's two discussions here:

Knowledge of format (PEX vs not PEX). This only matters because Pants supports multiple tools in a single ecosystem (not unique to Python. See npm and yarn). We need a way to delineate. So either PEX allows us to ask if a lockfile belongs to PEX or we have to decide some other way (in this case the easy route is assuming JSON == PEX).
How to keep a lockfile healthy. This is more vague. I think if a tool (PEX) allows us to query if a lockfile is valid given input XYZ that is likely more powerful than pants just comparing a "header". Otherwise Pants is forced to default to a "header". If PEX won't/can't provide this facility, then we have to insert a header. Then the only thing to consider is "can we insert a header that doesn't break the existing file-format". It's an important question as it's 100% valid to assume tools other than PEX might be parsing the lockfile. (Look at all the tools that understand pip-style textfiles)

Point 1 is irrespective of the header. For point 2, Pants certainly could "insert its header with state and unconditionally strip it" but that is the poorest solution that IMO should only be explored when other (better) solutions don't exist.

jsirois commented 2 years ago

The downside of the above is the header make make the file invalid for the underlying tool; so the file can no longer be used directly with underlying tools if there is a split workflow or you want to experiment, etc. That points to separate storage or committing to strong-arming any such file / ecosystem we run across. You can of course strong arm Pex, its a friend of Pants. But do you want to commit to that philosophy generally?

jsirois commented 2 years ago

In point 1 @thejcannon have we all been missing the fact that Pex doesn't impose a file name convention and so Pants could and thus know that way?

Eric-Arellano commented 2 years ago

In point 1 @thejcannon have we all been missing the fact that Pex doesn't impose a file name convention and so Pants could and thus know that way?

We're all on the same page here. Looking at file path is a bad heuristic. We plan to try parsing as JSON instead to determine it's PEX.

Another option would be to commit to its own storage - aka, seperate files. Eric already throught-explored this terrain and rejected IIRC.

Indeed. See https://github.com/pantsbuild/pants/issues/14281

For point 2, Pants certainly could "insert its header with state and unconditionally strip it" but that is the poorest solution that IMO should only be explored when other (better) solutions don't exist.

Why is it the poorest? What John is saying does make sense to me that the underyling lock body should be a black box we don't touch. In fact, we actually write our own Pants-lock format for Coursier, but we still implemented lockfile header validation via TOML comments for this reason.

Joshua, you alleviated my concern about IDEs complaining about comments in the Pex JSON lockfile by pointing out that you can call the file .jsonc to make PyCharm + VSCode happy.

The one compelling concern I have is you can't directly pipe the lockfile to stand-alone Pex. That's a bummer, but seems acceptable to me.

jsirois commented 2 years ago

If everyone wants to continue punting the general problem here, another option is to just add fields to the Pex json. Pex purposefully only cares about the fields it cares about in all its json. You can;t add comments but you could add a top-level object with all the structure you care about.

jsirois commented 2 years ago

I'll summarize from the Pex side:

If Pants deems it must, it can always strong arm Pex - I'll fight you hard but eventually give up.
Pex json admits fields it doesn't care about.
Pex lock json can come in a file of any name.
Pex could sprout more commands to allow for introspection and I'd be fine with this; i.e. checking lock staleness, etc.

I always prefer not punting and I view not solving the Pants storage needs problem as punting. I do understand though that timelines are a thing and so maybe punt is wisest. I think adding features to Pex to make this particular Pants feature work is valid, but also tech-debt since it again punts on the real problems and the un-friends-of-Pants tools / ecosystems that may not be so malleable down the road.

jsirois commented 2 years ago

FWIW, Pants 1 did not punt. It did not support lockfiles but it did support publishing jvm jars to maven central. It comitted to an on-disk database (tree of files) that were required to be checked in to track publish state. This worked well. We dogfooded it for Pants own JVM tool jars: https://github.com/pantsbuild/pants/tree/1.30.x/build-support/ivy/pushdb

Eric-Arellano commented 2 years ago

another option is to just add fields to the Pex json. Pex purposefully only cares about the fields it cares about in all its json. You can;t add comments but you could add a top-level object with all the structure you care about.

At that point, I don't see much benefit over directly inspecting the JSON for requirements and constraints? We're still leaking into implementation details of Pex.

I always prefer not punting

You do raise an excellent point about JavaScript ecosystem.

A quick audit of languages. Who does support comments:

Go, go.sum
JS, yarn.lock
Rust, cargo.lock
Java/Scala, custom Pants lockfile format

Who does not:

Python, Pex
JS, package-lock.json
- can use a gross hack to add // as a requirement
- the non-hacky way to do this is add comments to package.json itself via dependenciesComments field etc

While I don't know what Pants's JS support will look like, there's a non-trivial chance we want to support package-lock.json. Having the user call it package-lock.jsonc isn't viable - we break interop with the rest of the Npm ecosystem.

jsirois commented 2 years ago

At that point, I don't see much benefit over directly inspecting the JSON

We definitely see this differently! If you want to maintain mutiple logic forks to deal with this as the Pex format evolves, you're of course welcome to. If you just stick in a top-level "pants...." key with the Pants configuration object you could divorce yourself from all that.

thejcannon commented 2 years ago

Joshua, you alleviated my concern about IDEs complaining about comments in the Pex JSON lockfile by pointing out that you can call the file .jsonc to make PyCharm + VSCode happy.

Well to @jsirois point, using jsonc implies that PEX is using JSON. Which isn't ideal.

I think what I'm trying to achieve here is the most harmonious solution with a tool which is friendly-to-Pants so to speak, with the least assumptions. In order to achieve no assumptions the only alternative is to rely on the underlying tool to answer the questions we must ourselves answer. In that case, for fewest assumptions we need PEX to answer those two questions above. It sounds like you're OK with that as well (and perhaps it's relatively trivial given the nature of the questions and the metadata that already exists in the lockfile)

EDIT: Stripped last paragraph, bad assumption

jsirois commented 2 years ago

I think the minimum of assumptions + work is stuff a field in the JSON, but if you want to add the tooling to Pex, I'm on board.

thejcannon commented 2 years ago

I think the minimum of assumptions + work is stuff a field in the JSON

I thought you were against us assuming JSON :grimacing:

But, and correct me if I'm wrong, I think it's valuable for a non-pants user to ask PEX if a lockfile is valid given a set of inputs (especially the set of inputs which is responsible for the generation of the file), right?

thejcannon commented 2 years ago

I also assume every actual lockfile-generating -tool (Cargo, yarn, etc... not pip) has similar functionality to check if the lockfile is out of date.

Eric-Arellano commented 2 years ago

Those approaches of either a) new Pex field for Pants, or b) asking Pex directly for the metadata are both strong-arming Pex. We can't do that with package.json, John has an important point there. And using that gross hack of adding // as a fake requirement smells wrong to me.

I reopened https://github.com/pantsbuild/pants/issues/14281. I closed it prematurely, it should be considered as a holistic solution to this, as John suggests with how v1 handled publishing metadata. @thejcannon can you please take a look?

thejcannon commented 2 years ago

b) asking Pex directly for the metadata

You misunderstand. I don't want to ask for the metadata and compare. I want to hand PEX the metadata and ask "is this still legit?". I imagine npm/yarn likely has some kind of similar functionality (possibly built-in, but hopefully exposed via an API)

jsirois commented 2 years ago

I thought you were against us assuming JSON

I am! I'm just saying, Pants already commits the sin of knowing its OK to slap comments on JVM locks because they are toml (Pants happens to write these locks so it of course can know). You all don't want to do this the totally generic solid way - seperate storage. As such, Pants is currently OK with knowing file formats. I'm not for that. But given that, the stuff a field is probably the thing that minimizes effort and coupling.

I'm OK with Pex tools too, but that is completely not generalizable for Pants and I think Pants is just punting the real problem here doing that.

Hopefully that's a bit more clear.

3 options:

Keep knowing about lock file formats generally - stuff a field in JSON for the Pex lock format.
Keep knowing about formats for the JVM side, but for PEX don't do this. Instead of a general approach to decopuling (seperate storage), use a bespoke relationship with Pex to make it sprout tools to support Pants ad-hoc ignorance.
Be generally ignorant and implement seperate storage.

jsirois commented 2 years ago

We can't do that with package.json

That may be true. I have no clue how validation works. They too, may ignore unkown fields permitting the stuff a field approach. That's besides the general point of what if you hit an actual adversarial format. It may be true though that worrying about the general problem may be worrying about a problem that doesn't exist in practice. I'm just too ignorant of the js ecosystem to say anything definitive about package.json wrt extra fields. I feel safe in saying no # comments though!

Eric-Arellano commented 2 years ago

want to hand PEX the metadata and ask "is this still legit?".

I don't think this will be very viable. Pants semantics are non-trivial, like that user requirements only need to be a subset but tool requirements must be an exact match.

https://github.com/pantsbuild/pants/blob/cd44dd1ffc5671a4e7fb1d488f5fa9b0612ca262/src/python/pants/backend/python/util_rules/lockfile_metadata.py#L182-L206

We also want tight control of the error message. Read this line and below

https://github.com/pantsbuild/pants/blob/cd44dd1ffc5671a4e7fb1d488f5fa9b0612ca262/src/python/pants/backend/python/util_rules/pex_requirements.py#L109

We could coerce Pex's tool to work with Pants—like giving us back the exact info we need for those fancy error messages—but that's strong-arming it and is brittle if we need to start validating on future metadata.

thejcannon commented 2 years ago

I'm sold on https://github.com/pantsbuild/pants/issues/14281 if the hash is included in the metdata.

One file or simplicity
A hash of the lockfile for complete safety
Separate storage for full ignorance of various tools

jsirois commented 2 years ago

I need to get back to work, but one more thing to think about: what to do about storing the separate storage. The Pants v1 publish task committed those files automatically. Pants v2 may or may not want to do that. It may just want to document, warn and wag fingers about what happens when you don't.

jsirois commented 2 years ago

Last thought - of the 3 options I listed, 1 is clearly the most pragmatic. 3 here will certainly be a detour. I do think we'll need to take that detour at some point however.

stuhood commented 2 years ago

I think pragmatism is in order here, since there are enough other things to do to get locks working end to end. So I'd vote for 1. If we need to change strategies to 3 at some point in the future, we won't necessarily need to change for all tools: only the tool which doesn't support stashing a field/comment. And that reduces the consequences of this decision quite a bit.

Eric-Arellano commented 2 years ago

There's an even easier option 0: the original plan of stuffing the comment into the JSON file then stripping before we pass to Pex.

I'm not that worried about the Pex lockfile not being directly consumable by Pex - unlike package-lock.json, we don't have a bigger ecosystem we're integrating with that requires this lockfile format.
For IDE integration, we can have users call the file .jsonc or associate their IDEs with that file type. Pex doesn't care if the file is called that, only requires that we strip the comments before.

stuhood commented 2 years ago

I don't feel too strongly about it, but I'm not sure that stripping a comment is easier. A stashed field won't need to be stripped before feeding the serialization format to most tools (assuming that their serialization doesn't explode for unknown fields, which is common).

thejcannon commented 2 years ago

So:

Short term: stuff a header in the PEX JSON and call it a day
Medium (to long) term: https://github.com/pantsbuild/pants/issues/14281

Bueno?

jsirois commented 2 years ago

That sounds like the consensus, yeah. You've left the "stuff" ambiguous though. There are 2 proposals there, original # comment style header or new idea of a top-level item in the JSON dict. This one boiled down to caring about being able to use the lock file directly (Pants not getting in the way) or not. I think hashing the importance of that out is down to you and Eric wrestling.

pantsbuild / pants