Closed redsolar closed 14 years ago
A uuid wrapper sounds like a good option: that way either the pecl or the ossp (http://pwet.fr/man/linux/fonctions_bibliotheques/ossp/uuid) module can be used, or custom uuid generators, properly interfaced.
Have you used the shapeshifter library successfully? It was fine for me until I tried to convert binary uuid loads back to their string equivalents.
OSSP seems like it requires calling a CLI or using a C API? I'd rather not go there.
Shapeshifter worked well for me (I tested it against PECL UUID). Its main drawback, besides speed, is that it is not strictly RFC compliant. I solved that by adding a MAC address option to the generator method: it packs a MAC address (known beforehand) into a 6-byte string and passes it to Shapeshifter's UUID::generate as the 3rd parameter. It's a little crude (alas, I don't know a good way to find the MAC from PHP without calling exec, which I avoid), but we use PECL UUID on all our clients, so it's really a fallback for us rather than a main generator.
Regarding the to/from binary conversion, I'm not sure what you mean. In our environment, I wrote a custom packer/unpacker that converts between the cassandra and human-readable forms, along the lines of this:
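To make the MAC packing step concrete, here is a minimal sketch of turning a known MAC address into the 6-byte string described above. The helper name packMacAddress is hypothetical (it is not part of Shapeshifter or PECL UUID), and the call into Shapeshifter's UUID::generate is left out because its exact signature is not shown in this thread.

```php
<?php
// Hypothetical helper; the name is illustrative only.
function packMacAddress(string $mac): string
{
    // Strip the usual separators, leaving 12 hex digits.
    $hex = str_replace([':', '-'], '', $mac);
    if (!preg_match('/^[0-9a-fA-F]{12}$/D', $hex)) {
        throw new InvalidArgumentException("'$mac' is not a valid MAC address");
    }
    // "H*" packs the hex digits, high nibble first, into raw bytes.
    return pack('H*', $hex);
}

$node = packMacAddress('00:11:22:33:44:55');
echo strlen($node), "\n";   // 6
echo bin2hex($node), "\n";  // 001122334455
```

The resulting 6-byte string is what would be handed to the generator as the node value.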
/**
* Losslessly packs a standard UUID into a 16-byte string representation, safe for inserting into cassandra via thrift
*
* @param string $uuid UUID, of any type
*
* @return string
*/
private function __uuidToString($uuid)
{
//perform only basic validation of UUID style
if(!$this->__validateUUID($uuid, false))
{
throw new Exception("UUID '$uuid' does not follow a standard UUID format, aborting...");
}
//remove all the dashes
$reduced_uuid = str_replace("-", "", $uuid);
return (pack("H*", $reduced_uuid));
}
The result is safe for cassandra insertion (and preserves the ordering)
And the inverse method:
/**
* Converts a 16-byte string back into a standard UUID
*
* @param string $string 16-byte string representation of a UUID
*
* @return string
*/
private function __stringToUUID($string)
{
if(strlen($string) !== 16)
{
throw new Exception("Binary string that will be converted into a UUID must be exactly 16 characters (bytes) long");
}
$unpacked_string = unpack("H8a/H4b/H4c/H4d/H12e", $string); //array indexed by a/b/c/d/e of uuid segments
return(implode("-", $unpacked_string));
}
Using those 2 on the fly to convert column names for inserting/reading to/from cassandra removes any need for a third party binary packer dependence.
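For anyone who wants to try the round trip outside the class, here is a small standalone sketch of the same pack/unpack idea. The function names uuidToBinary and binaryToUuid are made up for this example, and UUID validation is omitted for brevity. It also checks the ordering claim on one pair of lowercase UUIDs.

```php
<?php
// Standalone equivalents of the two private methods (names are illustrative).
function uuidToBinary(string $uuid): string
{
    // Drop the dashes and pack the 32 hex digits into 16 raw bytes.
    return pack('H*', str_replace('-', '', $uuid));
}

function binaryToUuid(string $binary): string
{
    if (strlen($binary) !== 16) {
        throw new InvalidArgumentException('Expected exactly 16 bytes');
    }
    // Unpack into the five UUID segments and rejoin them with dashes.
    return implode('-', unpack('H8a/H4b/H4c/H4d/H12e', $binary));
}

$uuid = '12345678-90ab-cdef-1234-567890abcdef';
$binary = uuidToBinary($uuid);
var_dump(strlen($binary) === 16);           // bool(true)
var_dump(binaryToUuid($binary) === $uuid);  // bool(true)

// Byte-wise order of the packed form follows the order of the hex strings.
$low  = uuidToBinary('0fffffff-0000-0000-0000-000000000000');
$high = uuidToBinary('a0000000-0000-0000-0000-000000000000');
var_dump(strcmp($low, $high) < 0);          // bool(true)
```

Note that unpack always yields lowercase hex, so an uppercase input UUID comes back lowercased.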
And the validation method too
/**
* Checks if UUID is valid, returns true if so
*
* @param string $uuid UUID to check
* @param bool $strictuuid If true, perform a strict validation according to RFC (requires uuid_is_valid function from PECL UUID), otherwise do a simple preg_match
*
* @return bool
*/
private function __validateUUID($uuid, $strictuuid = false)
{
if($strictuuid AND !function_exists("uuid_is_valid"))
{
throw new Exception("Strict UUID checking is specified, however, no known method to check exists. Make sure 'uuid_is_valid' function is present, and compile PECL's UUID extension if not");
}
else
{
if(function_exists("uuid_is_valid"))
{
return(uuid_is_valid($uuid));
}
else
{
//do a simple case-sensitive preg match on the sequence following UUID structure
//all UUIDs regardless of type, follow '12345678-90ab-cdef-1234-567890abcdef' template
return (preg_match("/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/D", $uuid) === 1);
}
}
}
neat, thanks :)
the ossp uuid library has a php module wrapper, basically 1:1 vs the C API including its pass by reference import/export. Very fast, but minus the pecl uuid helper functions (like uuid_is_valid, parse/unparse for example). I'll figure out a happy medium between all these options.
Ah, great, I checked it out. It looks very similar to PECL, and better documented: I had to read PECL UUID's xml files to get the function info :).
It should work just as well, if not better then.
On the note of high performance, there is a very nice PECL UUID generator, considerably better featured than PHP UUID class.
While PHP UUID is very useful for those who don't have/can't use PECL (and for pure class reasons), performance dictates otherwise, especially if we need to generate many UUIDs quickly and with very low system overhead.
Installation: get the latest release from http://pecl.php.net/package/uuid and untar it, then:
cd uuid-1.0.2
phpize
./configure
make
make install
cp /usr/lib64/extensions/no-debug-non-zts-20090626/uuid.so /usr/lib64/php/modules
(the cp paths are for Centos x64 and will differ by distro)
Then add extension=uuid.so to /etc/php.ini or /etc/php.d/uuid.ini
After that, fork out UUID generation to:
if(function_exists('uuid_create'))
{
//requires PECL's uuid compiled and added to php.ini
//http://pecl.php.net/package/uuid
//fully compliant with RFC by using mac address
$uuid = uuid_create(1);
}
else
{
//uses a PHP UUID class from
//http://www.shapeshifter.se/2008/09/29/uuid-generator-for-php/
}
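As a runnable illustration of that fork, here is a sketch wrapped in a single function. The name generateUuid is hypothetical. Because the Shapeshifter class's exact API is not shown here, the fallback branch swaps in a plain random (type 4) UUID built with random_bytes (PHP 7+) rather than the Shapeshifter generator, so treat the else branch as a stand-in.

```php
<?php
// Hypothetical wrapper: prefer PECL uuid, otherwise fall back to pure PHP.
function generateUuid(): string
{
    if (function_exists('uuid_create')) {
        // PECL uuid is loaded: 1 requests a time-based (type 1) UUID.
        return uuid_create(1);
    }

    // Stand-in fallback (NOT the Shapeshifter class): random type 4 UUID.
    $bytes = random_bytes(16);
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40); // set version to 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80); // set RFC 4122 variant

    // 32 hex digits split into 8 groups of 4, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

echo generateUuid(), "\n";
```

Callers only see generateUuid(), so the fallback can later be replaced by the PHP UUID class without touching the rest of the code.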
Besides pure performance differences, the PECL extension also generates strictly compliant Type 1 UUIDs, which use a timestamp and a MAC address in generation (the latter can be quite cumbersome to get in PHP directly without system calls).
In addition, the PECL extension provides a few very nifty functions, such as uuid_time and uuid_mac, which respectively retrieve the timestamp and the generator's mac address from a type 1 uuid, as well as uuid_is_valid and uuid_is_null, which allow one to validate uuids without using preg_match (again, for performance reasons).
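To show what the timestamp and MAC helpers mentioned above are reading, here is a pure-PHP sketch that pulls the same two fields out of a type 1 UUID. The function name uuidV1Fields is hypothetical and this is not how the PECL extension is implemented. It assumes 64-bit PHP 7.1+ and uses a hand-built example UUID whose timestamp corresponds to the Unix epoch.

```php
<?php
// Hypothetical helper: decode timestamp and node from a type 1 UUID string.
function uuidV1Fields(string $uuid): array
{
    $parts = explode('-', strtolower($uuid));
    if (count($parts) !== 5 || strlen($parts[2]) !== 4 || $parts[2][0] !== '1') {
        throw new InvalidArgumentException("'$uuid' is not a type 1 UUID");
    }
    [$timeLow, $timeMid, $timeHi, , $node] = $parts;

    // 60-bit count of 100 ns ticks since 1582-10-15 (needs 64-bit integers).
    $ticks = ((hexdec($timeHi) & 0x0fff) << 48)
           | (hexdec($timeMid) << 32)
           | hexdec($timeLow);

    // 0x01B21DD213814000 ticks separate 1582-10-15 from the Unix epoch.
    $unixSeconds = intdiv($ticks - 0x01B21DD213814000, 10000000);

    return [
        'time' => $unixSeconds,
        'mac'  => implode(':', str_split($node, 2)),
    ];
}

$fields = uuidV1Fields('13814000-1dd2-11b2-8000-001122334455');
echo $fields['time'], "\n"; // 0
echo $fields['mac'], "\n";  // 00:11:22:33:44:55
```

With the extension installed, the native functions are the better choice. The sketch is only meant to show where the two values live inside the UUID.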