then / promise

Bare bones Promises/A+ implementation
https://www.promisejs.org
MIT License
2.58k stars 312 forks source link

Make the build reproducible #146

Closed lamby closed 6 years ago

lamby commented 6 years ago

Whilst working on the Reproducible Builds effort [0], we noticed that node-promise could not be built reproducibly as it uses random numbers as throwaway identifiers.

This patch uses determinstic numbers instead.

This was filed in @Debian as https://bugs.debian.org/886277

[0] https://reproducible-builds.org/

Signed-off-by: Chris Lamb chris@chris-lamb.co.uk

edef1c commented 6 years ago

The identifiers are purposely not predictable from release to release, they should at least incorporate some version-dependent seed.

lamby commented 6 years ago

The identifiers are purposely not predictable from release to release

How come?

-- Chris Lamb chris-lamb.co.uk / @lolamby

ForbesLindesay commented 6 years ago

This is to guard against people using these internal properties. Alternative approaches like Symbols still have too much of a performance hit, so we just randomise it each time we do a new release, that way anyone who attempts to directly access internal state will find their module broken on each new release of promise.

I would be happy to accept something that seeded the random number generator using, say, the contents of all the files in src - that way it would change in an unpredictable way between releases, but the build would be reproducible providing the exact source was used.

lamby commented 6 years ago

@ForbesLindesay Cool, that kinda makes sense. I've updated the PR to accomodate this.

vovkasm commented 6 years ago

@ForbesLindesay

This is to guard against people using these internal properties. Alternative approaches like Symbols still have too much of a performance hit, so we just randomise it each time we do a new release, that way anyone who attempts to directly access internal state will find their module broken on each new release of promise.

Sorry for disturb, but why? Why this library needs to guard against people using these internal properties? Why is documentation warning not enough? In my opinion if one uses something that not part of the public interface, then... hm... well, that's his problem... Why the library should decide for users? Instead the library should clearly document it's public interface.

ForbesLindesay commented 6 years ago

Why this library needs to guard against people using these internal properties?

They're too tempting/convenient. If they weren't well hidden, people would use them. Most people using this would probably (incorrectly) decide it's worth the risk of breakages in the future to have a little convenience now. By rotating these properties at random, we make sure that you feel that pain sooner rather than later. Lots of libraries on npm depend on internal features of other libraries, sometimes that's not too big a deal. In this case, that would be disastrous, so we prevent it.

ForbesLindesay commented 6 years ago

I'm going to close this until a reasonable justification for the feature request is provided. Making it possible to verify our build, I would support, but this doesn't do that. Helping people who just want to waste electricity reproducing work we've already done for them, I do not understand.

lamby commented 6 years ago

Making it possible to verify our build, I would support, but this doesn't do that.

Hm, I think one of us has some confusion about the goal of reproducible builds. Indeed, this change would surely make it possible to verify your build, ie. from the same sources we get the exact same outputs. :)

ForbesLindesay commented 6 years ago

No. This change would make it possible to verify a build, but not our build. We wouldn't be setting the SOURCE_DATE_EPOCH environment variable, so you wouldn't be able to reproduce our build. What you're asking for is the ability to create a build that can be verified, which is pointless unless we are going to create a verifiable build. There's never any point being able to verify your own build in this way.

lamby commented 6 years ago

This change would make it possible to verify a build, but not our build

True, and apologies as I now see you already wrote that. However I would disagree with

There's never any point being able to verify your own build in this way.

.. in that downstreams (eg. Debian) are building this library and distributing binaries/builds of it. There is great value in being able to reproduce this result.

ForbesLindesay commented 6 years ago

Debian should not be building this library from source. It is a waste of electricity and time. I am happy to make a firm commitment to never helping them do something that wasteful.

lamby commented 6 years ago

Debian should not be building this library from source

Please could you elaborate why in more detail? Debian is a free software distribution - simply using upstream's provided artefacts as-is does not align with our goals of letting our users (or downstream distributions) make modifications to the code which — naturally — requires that binaries are built from their sources.

P-EB commented 6 years ago

@ForbesLindesay how can one be sure that the build he has is from you and not altered by a third party if it's not reproducible from sources?

ForbesLindesay commented 6 years ago

@P-EB Exactly, if @lamby was asking us to make our build reproducible, that would be fine. That would allow him to verify our published build against our sources. What @lamby is asking for does not let him reproduce our build. It lets him create a separate, reproducible build. This has no value.

Please could you elaborate why in more detail

This is already open source software. If people want to modify our promise implementation, they can fork it here, publish a new package, and build a fork of debian that points at that forked promise implementation. This is already how the entire node ecosystem works.

Modifications require that it is possible to go and build something from source, doing so if you're not making modifications is pointless.

lamby commented 6 years ago

I grant the distinction between your builds and the builds of others, but putting aside the issue of validating that your published build matches its sources, I'm afraid I simply don't follow why you think reproducing a separate build is of no value. :)

P-EB commented 6 years ago

It lets him create a separate, reproducible build. This has no value.

Should your build be reproducible then this question wouldn't exist at all. The fact that it isn't also means that someone taking your blobs and your codebase can't verify that everything went well.

So, may we agree that there is an issue in the current situation of your codebase that requires to be fixed by putting deterministic identifiers (the fact that they change across releases is something simple to achieve)?

ForbesLindesay commented 6 years ago

@P-EB Yes, as I've always maintained, I'm happy to accept a pull request that has this outcome. I have rejected this one because it does not.

@lamby I'm tired of trying to explain this. The burden of proof is on you to provide an example of a benefit that results from being able to create a reproducible build of a library, but not being able to reproduce the official build. It is not an end unto itself; you must provide an example of a positive benefit to this. I cannot conclusively prove it is useless, because there could always be a use case that I have yet to think of, but I haven't been able to come up with one, and you've had plenty of time and opportunity to provide one. You are asking me to review, and then maintain, code that does something that I see no evidence needs doing.

lamby commented 6 years ago

I'm missing something then as, to me, the use case has been explicit and implicit throughout the PR - Debian is shipping this library, building it from source. (Not that it makes much difference but FYI you certainly hold the minority opinion in the free software community on this :p.)

ForbesLindesay commented 6 years ago

you certainly hold the minority opinion

I doubt it: npm had 753,071,751 downloads in the last day and almost no npm packages are set up to build from source on download.

This package alone had 586,003 downloads in the last day, none of those will be building from source, because if they were they would get it from GitHub.

As I've said:

  1. Debian should not be building this from source.
  2. The reason for wanting Debian builds to be reproducible is the same as the reason for wanting Promise builds to be reproducible - i.e. it lets people verify that you're not sneaking backdoors in by building from different source to the open source code. Despite this, you don't seem interested in solving that issue for the Promise library. You seem only to be interested in your very narrow use case, that could be better solved by simply using the pre-built code like 99.99% of users or this package.

I will accept a well written pull request to make our build reproducible, but I'm going to lock this issue as debating with you is not a productive use of anyone's time.

edef1c commented 6 years ago

I feel decently sure that we're not going to get Debian to change their policy (nor do I feel they should), and we could quite reasonably derive SOURCE_DATE_EPOCH ourselves if we wished to — I work on nixpkgs myself, and we set and rely on SOURCE_DATE_EPOCH since it is a de-facto standard for reproducible builds. On the nixpkgs end, we have no particular interest in reproducing anyone else's build either, just ensuring that running the same build twice always produces the same output. The argument that making our builds verifiable is somehow more helpful isn't clear to me: verifying our builds takes as much work/computation as doing one's own builds.

ForbesLindesay commented 6 years ago

Verifying our build has broad benefits, it lets you verify that the build everyone is using is correct. Building from source for yourself only verifies your own personal build.

Re changing Debian's policy - just because I can't fix what they're doing doesn't mean I have to support it. They're not paying me so I have no obligation to help them and I believe the approach they are taking is wasting people's time and computing power.

If this pull request included code to set SOURCE_DATE_EPOCH then I probably would have accepted it.