drupal-composer / preserve-paths

A composer plugin for keeping specified files and directories when installing/updating new composer packages. Can be used to supported nested packages.
28 stars 28 forks source link

Use uniqid() to avoid collisions #19

Closed colinodell closed 7 years ago

colinodell commented 7 years ago

We use this plugin in projects which are tested via Gitlab CI. Some of these projects test multiple versions of PHP simultaneously. To achieve better performance and keep bandwidth usage low, our runners use a shared Composer cache.

It's extremely likely we'll have two composer install commands start (and be running) at the same second with the same composer.json contents. As a result, the "unique" path used by the plugin ends up being the same in both runners. This leads to situations where one job is adding/deleting files and folders currently being used by the other one, causing it to fail.

I therefore propose using a different method to generate a unique path which is not based on the time in seconds. random_bytes() uniqid('', true) seems to fit this need nicely.

derhasi commented 7 years ago

Would it be sufficient to use time() . '_' . rand() instead? This way we would not need any new dependency. From my point we do not need cryptographically secure values, so rand() should be safe too.

colinodell commented 7 years ago

Based on some quick Google searches, it looks like rand() is seeded using the current time, meaning that two parallel executions could potentially generate the same numbers:

mt_rand() also uses the process id to seed itself. However, in my case, both process would be using the same exact pids, meaning they'd still generate the same random numbers.

My understanding is that random_bytes() will ask the kernel for random data; since both of my Docker containers share the same kernel (instead of running identical copies with identical states) they should in theory get different results, thus providing uniqueness.

The paragonie/random_compat dependency is really only needed for PHP 5.x which doesn't have random_bytes(). I agree that adding dependencies shouldn't be done lightly, but I don't see any better way to resolve this issue across both 5.x and 7.x. (If I'm mistaken please let me know! I'm definitely open to better solutions)

colinodell commented 7 years ago

I posed the question on Twitter and it sounds like uniqid('', true) might be viable if we can confirm that the $more_entropy option doesn't use a UNIX timestamp or pid. Trying to confirm this...

colinodell commented 7 years ago

Actually, it looks like uniqid('', true) works well enough! It uses the time (in microseconds) plus usleep(1) call which seems sufficient enough to create distinct identifiers. I've updated my PR accordingly.

derhasi commented 7 years ago

@colinodell thanks for the thorough research and the adjusted PR. 👍 Merged it :)

paragonie-scott commented 7 years ago

@derhasi

This way we would not need any new dependency.

Is there any particular reason why you're adverse to dependencies, especially when they solve the problem more effectively than you need?

derhasi commented 7 years ago

@paragonie-scott I am not adverse to adding a new dependency. I simply wanted to avoid adding one, when there is a simple solution with using the existing dependencies and without adding any additional complexity.

I think in this case there is no need for cryptographically secure values, so using random_bytes() seemed to me to be overkill, and therefore I did not see any added value in using a polyfill. In other use cases where cryptographically secure values might matter, I would not hesitate to add the dependency, as this would be a crucial part.

colinodell commented 7 years ago

My understanding is that uniqid() greatly reduces the chance of a collision (so it's definitely a step in the right direction) but random_bytes() would practically eliminate it.

I'll let you know if I encounter any future issues - if so, random_bytes() might be the only viable option.