[WIP] Shrinkwrap feature

sheerun commented 9 years ago

This is WIP PR, do not merge. ~~You can send any work against this branch (even not fully working). It will be squashed at the end.~~ Please start work from master branch and send PR to master branch.

As mentioned in #505:

Lock file is named bower.lock
Lock file contains concatenated .bower.json files generated by bower (some subset of their keys)
On production bower installs non-development packages using endpoints from bower.lock
After installation it compares every .bower.json with entry in bower.lock. If something doesn't match, then source tampering is assumed, and the installation is immediately aborted.
For now the _resolution entry in .bower.json should be enough to prevent source tampering (as @schmurfy pointed out it's impossible to change source without changing sha of commit).
If in future we decide to implement full hashing, we'll just add it to .bower.json (there is no chicken-egg problem because .bower.json is generated by bower, not stored in source control).

For anyone who doesn't know how .bower.json looks like, here is a sample. The keys that begin with underscore are added by bower. All other keys come from bower.json of component.

rstacruz commented 9 years ago

What should bower.lock look like? is it anything like bower ls --offline --json --relative?

sheerun commented 9 years ago

Some subset of it. For example it cannot contain absolute paths, because they'd change between workstations. I think we don't really need nested structure too (or do we?).

We only need pkgMeta field from bower ls --offline --json --relative output.

I think we can skip those with "extraneous": true. They are not present in bower.json.

So:

Flatten output of bower ls --offline --json --relative
Leave only packages without "extraneous": true
Leave only pkgMeta field (map to it)
If some bower-generated field is an absolute path (like _source for local packages), convert it to relative path.
Wrap it in dependencies field. Maybe add version of lockfile and pkgMeta of original package. Sort dependencies by name.

{
  "version": 1,
  "pkgMeta": { ... },
  "dependencies": [
     ...
  ]
}

Please also try to test some bower features and judge if bower.lock looks OK.

Does it look OK for you?

sheerun commented 9 years ago

Equally good option is to fetch some subset of pkgMeta so we have more flexibility changing it later, so bower.lock stays compatible between bower versions. For POC pkgMeta is enough, tough. We'll change exact format before release.

sheerun commented 9 years ago

As for behavior:

bower.lock doesn't exist:
- bower install creates bower.lock file only after all-successful non-production installation
- bower install --production complains there's no bower.lock and exits
- bower update complains there's no bower.lock and exits
bower.lock exists dependenies in bower.json match those on bower.lock
- bower install installs exact packages from bower.lock
- bower install --production installs exact packages from bower.lock ignoring devDependencies
- bower update is no-op
- if installation would change bower.lock, it fails with error message
bower.lock exists and there are only extra dependencies in bower.json relative to bower.lock
- bower install tries to install bower.lock with only new packages from bower.json added. Installation fails if old bower.lock is not a subset of new bower.lock.
- bower install --production fails
- bower update --all re-creates bower.lock
bower.lock` exists and dependencies in bower.json changed relative to bower.lock
- bower install fails, and says you need to bower update <package> first
- bower install --production fails
- bower update complains there's no package specified to update
- bower update --all re-creates bower.lock
- bower update <package> tries to install bower.lock with <package> version replaced with version from bower.json. Installation fails if old bower.lock is not a subset of new bower.lock (except updated package).

I hope it's more or less clear, feel free to discuss this. If implementing, please test it :)

Please speak-up if someone wants to help except @rstacruz I can assign some tasks.

sheerun commented 9 years ago

@rstacruz Please use output of project.getTree as an input and test the implementation somewhat.

You can create like lib/util/lockfile.js file for now + tests for it (test/util/lockfile.js).

sheerun commented 9 years ago

Side note: it would be good if we changed cache storage format from plain files to .tar.gz. We could embed an sha of it in bower.lock and get tampering-resistance for free. But for now, just leave it.

mathroc commented 9 years ago

@sheerun what do you mean by "if installation would change bower.lock, it fails with error message" on 2nd point ?

anyway, I have suggestions about for the behavior of bower install/update

I think bower.lock should include dev packages definitions (but separate from other dependancies) and totally ignore bower.json otherwise. also, in my opinion, bower install should never modify bower.lock, only create it if does not already exists

that gives, when no package is specified:

bower.lock doesn't exist:
- bower install behave as bower update
- bower install --production complains there's no bower.lock and exits
- bower update installs all packages and create bower.lock after successful installation of all packages
bower.lock exists
- bower install installs all exact packages from bower.lock
- bower install --production installs exact packages from bower.lock except dev dependencies
- bower update update all packages and update bower.lock after successful update

and if packages are specified:

bower.lock doesn't exist or <packages> are not in bower.lock :
- bower install <packages> behave as bower update <packages>
- bower install --production <packages> complains that all packages should be installed at the same time with --production and exits
- bower update <packages> installs packages and create(or update) bower.lock after successful installation of all packages
bower.lock exists and <packages> are in bower.lock :
- bower install <packages> installs exact packages from bower.lock
- bower install --production <packages> complains that all packages should be installed at the same time with --production and exits
- bower update <packages> update package and update bower.lock after successful update

I might be wrong, but it seems that we are trying to make bower install do two different things: install (require) and install (install exact). maybe another command is needed ?

it could be:

composer install: add one or more packages to the dependencies
composer update: update one or more packages
composer deploy: install packages exact version

that way, older commands behavior don't need to change (except for creating/updating bower.lock) and composer deploy behavior is much more easier (install exact version of packages from bower.lock or fail)

sheerun commented 9 years ago

@mathroc It means that if someone changes e.g. tag on GitHub, installation fails.

After each bower install, bower checks if bower.lock would change. If so, something's wrong.

As for modyfying bower.lock, that's how bundler works. It's important to notice bower install can only add entries to bower.lock, while bower update can also change them.

I think bower update should error if bower.lock doesn't exist. Update only makes sense with bower.lock.

Making bundler update was bundler's mistake, and they try to amend it: https://github.com/bundler/bundler-features/issues/18 I want to use --all from the very beginning to avoid backward incompatibility issues bundler experiences.

I think bower install <packages> should behave differently from bower update <packages>. bower install <packages> is only shortcut for adding <packages> to bower.json and running bower install. As I mentioned bower install avoids editing bower.lock, except adding new entries. On the other hand bower update <package> doesn't modify bower.json in any way, and explicitly allows to changes to <package> in bower.lock.

I might be wrong, but it seems that we are trying to make bower install do two different things: install (require) and install (install exact). maybe another command is needed ?

I'm not sure what you mean. bower install is bower update that:

Can work without bower.lock
If bower.lock exists cannot modify bower.lock except adding new entries (in non-production)

When using bower install you'll be sure that already installed packages will be exactly the same.

I don't think we need to add new command.

@mathroc Does it make sense to you?

schmurfy commented 9 years ago

glad to see works on this, I think this is really needed !

As a ruby developer (backend) I really like bundler and I think using it as reference is a good choice except for "bundle update" as you mentioned which is error prone but when you use git/svn/whatever properly this should never be an issue, you can just revert the lock file.

If possible having an equivalent of bundle update <package1>, <package2>, ... would be nice too when you migrate to a new version for one particular library (or more) but don't want to change any of the others.

mathroc commented 9 years ago

@sheerun ok i think i got you point

I'm not used to bundler so I'm mostly comparing with composer. and with composer you have 3 commands:

require which add a package (currently install for bower)
update, same as in bower
and install that only works when .lock is present

the advantage to have the exact-install in composer is that it's very fast, it only download the .lock version of the package (with the commit id not the tag) so if the tag changed on github he does not care and still install the same version as before

checking (and failing) if the tag changed on github might break a deploy that was previously working. it also means that integration test might start failing just because a tag have been changed. is it what is wanted?

of course it should fail if the commit id does not exists at all and bower cannot download the package

sheerun commented 9 years ago

the advantage to have the exact-install in composer is that it's very fast, it only download the .lock version of the package (with the commit id not the tag) so if the tag changed on github he does not care and still install the same version as before

Two points:

In case of bower downloading tags is faster than commits because we can download .tar.gz
bower install will use bower.lock as much as possible. No time wasted on resolving.

checking (and failing) if the tag changed on github might break a deploy that was previously working. it also means that integration test might start failing just because a tag have been changed. is it what is wanted?

That's the point. We want to prevent deploying different code. It's better to fail for deploy than application.

sheerun commented 9 years ago

Of course targets like #master will be locked to commit rather than branch. Like in bundler.

mathroc commented 9 years ago

In case of bower downloading tags is faster than commits because we can download .tar.gz

well, you can also download the tar.gz for a commit id. why have different behaviors for tags & branches ? Is Bundler locking on tag rather than commit id? maybe that's just because I'm used to composer but I find that odd.

I don't get in which use case it makes more sense to fail the install --production because the tag has changed on the remote rather than keep installing the package that was installed when the bower.lock was generated. after all, that was the bower.lock is for, I don't care if the remote changed, I want the same code I had when developing. Or at least that's what makes sense to me :)

sheerun commented 9 years ago

I think I get what you mean.

well, you can also download the tar.gz for a commit id

Wow. I didn't even know it's possible :)

why have different behaviors for tags & branches

Branches are moving targets while tags are supposed to not be changing. Bower already treats tags and branches differently by caching tags and never caching branches. Tags are equivalent of gem versions in bundler. Also, there are plans to export tags to npm, where commit SHAs won't be available.

I think the ultimate reason is: If you use semver versions, we don't want to assume the underlying platform is VCS. Plus vendors other than GitHub not necessarily provide per-commit tarbars.

As for bower install --production: it should use bower.lock instead bower.json, so indeed, you are guaranteed the code on development will be the same as on production. If someone changes the tag on production, deployment will fail.

One way to detect it is to read metadata bower adds to .bower.json:

"_resolution": {
  "type": "version",
  "tag": "1.7.0",
  "commit": "da996e665deb0b69b257e80e3e257c04fde4191c"
},

Every time tag changes on server, the commit field will be different, and at the same time bower.lock will be different. bower install would fail if bower.lock turns out different, even when installing from bower.lock.

Ultimately there's a better way to detect it, though. Earlier I mentioned:

Side note: it would be good if we changed cache storage format from plain files to .tar.gz. We could embed an sha of it in bower.lock and get tampering-resistance for free. But for now, just leave it.

So for semver resolution we'd get instead:

"_resolution": {
  "type": "version",
  "tag": "1.7.0",
  "sha": "e1e2c3a4f5e6b7a8b9e0"
},

Where the sha is hash of downloaded .tar.gz tag archive. There absolutely no assumption there's VCS underneath, only one link to .tar.gz is needed. It can be mirrored, exported to npm, whatever.

It's just my idea though. If you think there's better way to handle it (introduce new type?) let me know.

schmurfy commented 9 years ago

For bundler I am pretty sure whether you use a branch or tag does not matter it will ultimately resolve to a commit sha1, be careful with how you handle the tags, while a commit id cannot be changed without altering the content the same can't be said with tags, I am sure tags be moved to another commit.

I cannot test that now but consider this:

push a remote tag
remove/ move it locally to another commit
force a server push with tags

Always relying on the resolved commit sha1 in lockfile seems safer to me, since the main requirement for me is to ensure nothing will be changed between development and production.

robwierzbowski commented 9 years ago

You can delete and recreate tags as much as you want. All endpoints should definitely resolve to shas.

imjacobclark commented 9 years ago

Getting this started by pushing boilerplate code to register the shrinkwrap command and tests.

See #1619

hannahhoward commented 9 years ago

omg... I have no comment on the content but thank you, thank you, thank you for moving forward with this! you guys rock and hope it will land soon.

schmurfy commented 9 years ago

I am also glad someones started on this, thanks ! :)

tmikoss commented 9 years ago

Eagerly awaiting this feature - lack of it is proving to be a roadblock of adoption in my team.

atipugin commented 9 years ago

+1 Version locking would be very useful

Wirone commented 9 years ago

+1

This is crucial, from my perspective it does not make sense without version locking. When more than one person works on project and each of team members has own development environment, there are test and production instances, it's not acceptable to install packages without version consistency. It's best way to keep programmer's stereotype alive: "Works for me!" ;-)

Thanks in advance for implementing this feature, I'm looking forward for it!

micahlmartin commented 9 years ago

Have any pull requests with actual implementation been submitted against this yet?

sheerun commented 9 years ago

@micahlmartin Not yet

karlingen commented 9 years ago

+1

hannahhoward commented 9 years ago

if someone wants to implement this, is there pre-existing work that should be used, or is it enough to read the specs in the comments of this doc and start from bower#master?

sheerun commented 9 years ago

@hannahhoward Unfortunately there is no pre-existing work. bower#master is place to start.

I can support anyone who decides to implement it. Even give write access to bower.

kodypeterson commented 9 years ago

@sheerun I think I am going to take a stab at this over the weekend and see what I can come up with!

sheerun commented 9 years ago

Uh. Rebasing of this branch automatically closed this PR, and I can't reopen it.. Seems like a Github bug.

I'm going to create another issue with detailed documentation.

mruoss commented 9 years ago

Or just write to support an ask to reopen it? Am 21.03.2015 05:09 schrieb "Adam Stankiewicz" notifications@github.com:

Uh. Rebasing of this branch automatically closed this PR, and I can't reopen it.. Seems like a Github bug.

I'm going to create another issue with detailed documentation.

— Reply to this email directly or view it on GitHub https://github.com/bower/bower/pull/1592#issuecomment-84247567.

andrewboni commented 9 years ago

+1 for this - definitely needed for my team

jfmercer commented 8 years ago

:+1:

karlingen commented 7 years ago

@sheerun 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉

sheerun commented 7 years ago

I've reopened it because github fixed the issue, but mind real PR is here: #1748

sheerun commented 6 years ago

Bower is now deprecated and we recommend to use Yarn or Npm that support file locking.

Here's the guide how to migrate: https://bower.io/blog/2017/how-to-migrate-away-from-bower/

bower / bower

[WIP] Shrinkwrap feature #1592