jolicode / castor

🦫 DX oriented task runner and command launcher built with PHP.
https://castor.jolicode.com
MIT License
414 stars 22 forks source link

Cache Key Collision with Identical fingerprint IDs in Different Applications #511

Closed TheoD02 closed 1 month ago

TheoD02 commented 2 months ago

Hey 😄

I've noticed a potential issue with the cache key generation for fingerprints after the release of v0.18.0 and the changes introduced in PR #494

Description

When using the fingerprint function in different applications with the same ID, the cache items are being overridden. This occurs because the ID is suffixed with .fingerprint, causing a collision in cache keys if the same ID is used across different applications.

Example Scenario

Consider the following scenario:

App A:

fingerprint(fn () => ..., 'id-docker-builder');

App B:

fingerprint(fn () => ..., 'id-docker-builder');

In this case, both applications generate a fingerprint with the same cache key: id-docker-builder.fingerprint. As a result, the cache entry created by App A could be overridden by App B, and vice versa.

Expected Behavior

Ideally, the fingerprint cache should be scoped or namespaced in such a way that identical IDs used in different applications do not collide and override each other. Each application should maintain its own separate cache entries to avoid conflicts.

Proposed Solutions

  1. Namespace the Cache Key by Parent Folder Name: Use the parent folder name as a prefix to the cache key to differentiate between applications. This would ensure that cache keys do not collide if different applications use the same ID.

  2. Generate a Unique Hash on Application Start: Create a unique hash when the application starts for the first time (similar to .castor.stub.php). Store this hash indefinitely in the cache with a random UUID. This unique identifier would then be used to differentiate cache keys between applications.

joelwurtz commented 2 months ago

I'm not sure if we want to do this, there may be cases when having a shared cache key between different castor project could be nice (like caching some global values for a tool that is used in various projects)

For me it's more a documentation problem where we should specify that cache key are not scope to a project and so the user should be aware of that

TheoD02 commented 2 months ago

I'm not sure if we want to do this, there may be cases when having a shared cache key between different castor project could be nice (like caching some global values for a tool that is used in various projects)

For me it's more a documentation problem where we should specify that cache key are not scope to a project and so the user should be aware of that

I understand your point, but wouldn't it be clearer to add an explicit parameter indicating that this specific fingerprint/ID will be shared across the global cache, rather than just at the current project level? I think that in most cases, the fingerprint will be at the application level, not the global level

e.g.:

/**
 * @param bool $useGlobal pass true to have a variable that will be available across castor projects
 */
function fingerprint(callable $callback, string $id, ?string $fingerprint = null, bool $force = false, bool $useGlobal = false): bool
joelwurtz commented 2 months ago

Why not, what do you think about this poke @lyrixx @pyrech ?

lyrixx commented 2 months ago

When I added the cache feature to castor I made the following choise:

https://github.com/jolicode/castor/blob/8dc9ad131e14f256db868e6c49e8791551ab49bc/src/functions.php#L407-L421


So here, I would vote to have a fingerprint function that is automatically scoped by default

lyrixx commented 1 month ago

@TheoD02 Would you mind to submit a PR according to my last comment? thanks

TheoD02 commented 1 month ago

Yep sure, go with this signature ?

function fingerprint(callable $callback, string $id, ?string $fingerprint = null, bool $force = false, bool $global = false): bool