composer / packagist

Package Repository Website - try https://packagist.com if you need your own -
https://packagist.org/
MIT License

Add API+button to archive a tag #938

Open nicolas-grekas opened 6 years ago

nicolas-grekas commented 6 years ago

It was recently identified that Composer consumes high CPU + memory on packages that have a lot of historical tags. See e.g. https://github.com/composer/composer/issues/7577 for some numbers + pointers.

This means the composer+packagist infrastructure has a scalability issue: as time passes, the list of tags per package grows, and the "Composer experience" degrades. This is significant for symfony/* today, and will also become a pain for any other package over time.

It would be great to just remove old tags from the provider jsons sent to composer (e.g. https://repo.packagist.org/p/symfony/security-http.json should not list older tags.)

I think the most flexible way to achieve this would be to allow package authors to mark tags as "archived". If such a feature existed, we would use it immediately for all older Symfony tags (e.g. <2.7) and everyone would benefit from it.

The outcome would be that composer could not resolve these old tags when solving dependency graphs. But composer.lock would still work so old projects could still be installed with no issues.
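As a rough illustration of the proposal above, here is a minimal Python sketch of dropping archived tags from provider metadata before it is served. The `{"packages": {name: {version: metadata}}}` layout, the `strip_archived` function, and the `archived` mapping are all assumptions made for the example; this is not Packagist's actual code or API.

```python
def strip_archived(provider, archived):
    """Return a copy of the provider metadata without archived versions.

    provider: assumed shape {"packages": {name: {version: metadata}}}
    archived: hypothetical mapping {name: set of archived version strings}
    """
    return {
        "packages": {
            name: {
                version: meta
                for version, meta in versions.items()
                if version not in archived.get(name, set())
            }
            for name, versions in provider["packages"].items()
        }
    }


provider = {"packages": {"symfony/security-http": {"v2.3.0": {}, "v4.1.0": {}}}}
archived = {"symfony/security-http": {"v2.3.0"}}
served = strip_archived(provider, archived)
```

In this sketch the solver never sees `v2.3.0`, while an install from an existing composer.lock, which pins exact versions, would not depend on this list.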

rdohms commented 6 years ago

Would it be an idea to make it more "lazy loading"? Composer can try to identify the most used tags (this can be done in either a simple or a complex way :D) and only return those by default. If the runtime has missing tags, it requests an "extended" info json that then returns everything?

Or even couple this with the archive feature you described.

nicolas-grekas commented 6 years ago

This proposal needs only one "simple" change on one single side of the Composer+packagist duo: once implemented on packagist, no need to upgrade to next version of Composer to benefit from it. That's an important aspect of it. Of course we can also wonder about a new protocol between both components, but that's another story...

rdohms commented 6 years ago

@nicolas-grekas yes, but it penalizes anyone using old versions, which can be avoided by a composer patch that does the extra fetching for archived options. I feel DX is something important as well.

nicolas-grekas commented 6 years ago

Should we assume old projects will use new versions of Composer? I doubt it :) But what you're proposing could be implemented as a second step actually: first update packagist with this proposal, then make its API expose archived tags somewhere, then update Composer to fetch archived tags with some logic you're describing. Makes sense?

rdohms commented 6 years ago

I'm actually trying to make no assumptions.

It does make sense, but it still leaves that window of breakage. I'm not sure why we would embrace that risk if we can do both solutions in parallel and avoid causing further breakage.

I approve of the archive solution, but would love to see it coupled with a fallback strategy, from day 0. But that's @naderman's and @Seldaek's call.

benja-M-1 commented 6 years ago

> Should we assume old projects will use new versions of Composer? I doubt it :)

Why not? As composer can be easily installed as a phar executable, developers working on old projects may have installed newer versions of composer to benefit from performance improvements, for example.

rubenrua commented 6 years ago

-1

Really don't like breaking old projects

See: https://twitter.com/nicolasgrekas/status/1032551128121200641

nicolas-grekas commented 6 years ago

Nothing is broken when composer.lock is committed. And these projects are already broken in another way, by having e.g. documented security issues. It's like WordPress: they don't like breaking old projects that work on PHP 5.2, so they still keep compatibility with it. At some point, the cost of such decisions is too high to be ok. For Symfony that's solved with flex, but for the rest...

naderman commented 6 years ago

It's a reality that projects using very old libraries exist and run today. These do not necessarily contain insecure code. Making it impossible for people to migrate and upgrade these applications one dependency at a time is a disservice to the PHP community. Many PHP developers work on older legacy applications and not in new greenfield projects using the latest shiniest framework. Adding this option may result in library authors, especially inexperienced ones, archiving all but the newest tags. We have no control over what gets archived, or whether these are responsible decisions.

For these reasons I would avoid merging any functionality to this effect.

curry684 commented 6 years ago

> Really don't like breaking old projects

You wouldn't have to. Packagist knows both about release dates and install counts. If it were to separate metadata into "all tags" and "intersection of most popular 5 tags and 3 most recent releases" or something like that, we could split composer update into two steps: first feed only the "active list" to the SAT solver, and if that fails, retry with the full monty. It would possibly speed up 99% of active projects a lot and only impact old projects a tiny bit, since a dependency on symfony/symfony:2.7.* would fail within a second if there are no active tags matching that. They'd effectively be just as fast.

It'd be a variant of hill climbing. I've long suspected that Composer could be an awful lot faster if it were to switch to true hill climbing instead of using unweighted SAT, as methinks it is much more likely to find an acceptable solution in most Composer scenarios, unlike OS package management.
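The two-step idea above (try the short "active list" first, fall back to every tag only on failure) can be sketched in Python as follows. The `pick` function is a toy stand-in for the SAT solver, and `resolve_two_pass`, the tuple version encoding, and the sample tag lists are all hypothetical; none of this reflects Composer's real solver.

```python
from typing import Callable, Iterable, Optional, Tuple

Version = Tuple[int, ...]


def pick(candidates: Iterable[Version],
         matches: Callable[[Version], bool]) -> Optional[Version]:
    """Toy stand-in for the solver: highest candidate satisfying the constraint."""
    ok = [v for v in candidates if matches(v)]
    return max(ok) if ok else None


def resolve_two_pass(active, all_tags, matches):
    """Try the short active list first; retry with the full list on failure."""
    found = pick(active, matches)
    if found is not None:
        return found, "active"
    return pick(all_tags, matches), "full"


all_tags = [(2, 7, 0), (2, 7, 5), (4, 0, 0), (4, 1, 0)]
active = [(4, 0, 0), (4, 1, 0)]

# A current project resolves from the active list alone.
recent = resolve_two_pass(active, all_tags, lambda v: v[0] == 4)
# A 2.7.* constraint misses the active list and falls back to all tags.
legacy = resolve_two_pass(active, all_tags, lambda v: v[:2] == (2, 7))
```

Under these assumptions, old projects pay for one quick failed pass before the full resolution, while active projects never touch the long tag list.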

ryanotella commented 5 years ago

How about some sort of tag depth parameter? Perhaps a bit like git shallow clones. It might default to something reasonable and quick, but be easily bumped up to something equivalent to the current behaviour. Managing composer upgrades with complex package requirements can already be quite challenging. I think that this potentially adds additional failure possibilities for those developers who need the most predictability.

I don't think it is the business of composer to compel adoption of newer versions. I think "archiving" tags breaks the contract that makes composer work so well. What does it even mean to archive a tag? If it had real meaning or a benefit beyond performance, then breaking the contract might be justified, but I don't see what else it achieves.
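For what it's worth, the tag depth parameter suggested at the start of this comment might look roughly like the Python sketch below. The `shallow_tags` name, the `depth` parameter, and the tuple version encoding are invented for illustration; no such option exists in Composer as far as this thread shows.

```python
def shallow_tags(tags, depth=None):
    """Return the `depth` newest tags, newest first.

    depth=None keeps every tag, i.e. the equivalent of the current behaviour.
    """
    newest_first = sorted(tags, reverse=True)
    return newest_first if depth is None else newest_first[:depth]


tags = [(2, 7, 0), (4, 1, 0), (3, 4, 0)]
quick = shallow_tags(tags, depth=2)   # only the two newest tags
full = shallow_tags(tags)             # everything, as today
```

Here the choice stays with the person running the update rather than with the package author, which is the difference from archiving that this comment is pointing at.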

leofeyer commented 5 years ago

@Toflar has a good idea of how to implement this in a completely optional way and without having to mark several hundred tags as archived. Check out https://github.com/composer/composer/issues/8272 for details.