Open ericelliott opened 10 years ago
Reason for storing count Storing count information is necessary. If no count were stored, the count will always be 1 and that doesn't make it a counter.
Reason for storing process id
Storing process id in a state file is purely an assumption on my part which based on the name fingerprint
. Just like fingerprint that can be used to identify a person, shouldn't it be able to identify (sort of) a cuid generating client. Thus, shouldn't it be consistent for a period of time?
If it is just another random string which is generated by the client, I can remove them.
Reason for locking file Locking the file help prevents concurrent access to the same file hence messed up the data that the file retain.
Storing count information is necessary. If no count were stored, the count will always be 1 and that doesn't make it a counter.
That doesn't make sense. The count is incremented every time you call cuid(). The count will only always be 1 if you only call cuid() once per process -- and that should be fine even if you have many connections, because each process should generate its own fingerprint. If the process fingerprint is not unique enough to prevent collisions, work on improving the fingerprint implementation for this port.
Failing to do so risks slowing down every connection do deal with the shared counter state. This was never the intended implementation of CUID. They should be fast, synchronous, and non-blocking.
Storing process id in a state file is purely an assumption on my part which based on the name fingerprint. Just like fingerprint that can be used to identify a person, shouldn't it be able to identify (sort of) a cuid generating client. Thus, shouldn't it be consistent for a period of time?
The OS manages pids for you. There is never more than one process sharing the same PID at the same time. Even using the least significant bits of the PID (NOT the MSB's which change slower), Between PID and random number, you should have enough entropy to avoid collisions.
You should write uniqueness tests that ensure that you can generate millions of sequential CUIDs without collisions.
I notice in your test directory, you're testing that they are generating strings made up of the appropriate characters, but you're not creating uniqueness tests. You should fix that, and make sure that you are forking lots of processes and testing that the generated IDs don't collide.
The CUID instances (even across different PIDs) should not need any shared state, so no locking should be necessary. This is really the whole point of CUIDs, because you should be able to safely generate them not just on one machine, but on many different machines at the same time without worrying about collision.
You can't do that if you're relying on shared state.
Ok, that's make sense. I'll work on it and reply this when I'm done.
Thanks for the clarification.
:thumbsup:
Ok, I have removed the need to persist state in text file. It should now work the same as the original counterpart of Cuid. I have also added a uniqueness test for both normal cuid and slug generation. Currently I only test 50k succession generation because of the timeout limit at travis.
I did try to test it with 2 million successive generation with no collision detected.
*Update: Slug fails the uniqueness test In php 5.4, it fails in 47746/50000 iteration In php 5.5, if fails in 29167/50000 iteration
@endyjasmi So what was the end result. I'm assuming it is still persisting to disk?
@endyjasmi
If PHP really creates a new process, does that new process have a different PID? If that's the case, you don't need to store the count state, because the PID will differentiate between the processes, and IDs should not collide...
Having new .pids for each process should be fine, as long as those .pids don't collide with each other. Do you have tests that indicate that it's really a problem?