facebook / hhvm

A virtual machine for executing programs written in Hack.
https://hhvm.com
Other
18.17k stars 2.99k forks source link

HHVM incorrectly returns a "valid" (but not initialized) object from a broken serialization string #6032

Open derickr opened 9 years ago

derickr commented 9 years ago

I am working on a driver for MongoDB, mimicking the same API as one for PHP. In our extension, we have a couple of built-in classes, that we don't want to serialize (or inherit from). We can signal that by setting class handlers for serialize and unserialize:

    ce->ce_flags    |= ZEND_ACC_FINAL_CLASS;
    ce->serialize    = zend_class_serialize_deny;
    ce->unserialize  = zend_class_unserialize_deny;

zend_class_serialize_deny and zend_class_unserialize_deny both throw an exception in the case such an object are attempted to be serialized or deserialized:

ZEND_API int zend_class_serialize_deny(zval *object, unsigned char **buffer, zend_uint *buf_len, zend_serialize_data *data TSRMLS_DC) /* {{{ */
{
    zend_class_entry *ce = Z_OBJCE_P(object);
    zend_throw_exception_ex(NULL, 0 TSRMLS_CC, "Serialization of '%s' is not allowed", ce->name);
    return FAILURE;
}

I was attempting to do imitate the same behaviour with HHVM, by creating a trait:

trait NoSerialize {
    public function __sleep()
    {
        throw Exception("Serialization of '" . get_class($this) . "' is not allowed");
    }

    public function __wakeUp()
    {
        throw Exception("Unserialization of '" . get_class($this) . "' is not allowed");
    }
}

But the methods are never called, and HHVM insists on just throwing a warning: Warning: Attempted to serialize unserializable builtin class MongoDB\Driver\Manager in

This happens because of https://github.com/facebook/hhvm/blob/HHVM-3.9/hphp/runtime/base/object-data.cpp#L855

(In master, it is now at https://github.com/facebook/hhvm/blob/master/hphp/runtime/base/variable-serializer.cpp#L1517)

It will always do this, unless isCppSerializable is part of the class info flags, which I don't seem to be able to set myself.

So the question is really:

paulbiss commented 9 years ago

The way this API currently works requires that the native classes declare C++ sleep and wakeup functions. To do this declare functions named sleep and wakeup on the native-data struct for your class.

The sleep function takes no arguments and returns the Variant to be serialized, the wakeup function takes a context argument (Variant) and an ObjectData argument (the object being unserialized), and returns void.

You should be able to throw the exceptions directly fromt these functions.

derickr commented 9 years ago

I've tested this now, and implemented it as follows:

        void wakeup (const Variant& context, ObjectData* obj) {                                                                                                                      
            throw Object(SystemLib::AllocExceptionObject("Unserialization of MongoDB\\Driver\\Manager is not allowed"));                                                             
        }                                                                                                                                                                            

        Variant sleep() const {                                                                                                                                                      
            throw Object(SystemLib::AllocExceptionObject("Serialization of MongoDB\\Driver\\Manager is not allowed"));                                                               
        } 

This works fine, I get the exception, however, I also get an "improper" object.

Code:

<?php
$str = 'O:22:"MongoDB\Driver\Manager":0:{}';
var_dump(unserialize($str));

PHP does:

[PHP: 5.6.12-dev  ]
[HHVM: 3.9.0-dev]
derick@whisky:~/dev/php/10gen-labs-mongo-hhvm-driver-prototype $ php -n -dextension=mongodb.so /tmp/test.php

Warning: Erroneous data format for unserializing 'MongoDB\Driver\Manager' in /tmp/test.php on line 3
bool(false)

HHVM does:

derick@whisky:~/dev/php/10gen-labs-mongo-hhvm-driver-prototype $ hhvm -vDynamicExtensions.0=mongodb.so /tmp/test.php
object(MongoDB\Driver\Manager)#1 (0) {
}

An exception thrown in a constructor (or wakeup), should not result in a "valid" object, which subsequently crashes when using it, as it has not been setup correctly (hence the throwing of an exception):

Code:

<?php
$str = 'O:22:"MongoDB\Driver\Manager":0:{}';
$m = unserialize($str);

$rp = new MongoDB\Driver\ReadPreference(MongoDB\Driver\ReadPreference::RP_PRIMARY);
$m->selectServer( $rp );

PHP:

derick@whisky:~/dev/php/10gen-labs-mongo-hhvm-driver-prototype $ php -n -dextension=mongodb.so /tmp/test.php

Warning: Erroneous data format for unserializing 'MongoDB\Driver\Manager' in /tmp/test.php on line 3

Fatal error: Call to a member function selectServer() on boolean in /tmp/test.php on line 6

HHVM:

derick@whisky:~/dev/php/10gen-labs-mongo-hhvm-driver-prototype $ hhvm -vDynamicExtensions.0=mongodb.so /tmp/test.php
Segmentation fault

So how do I make it not return a "valid" object, but instead return "false", like PHP does?

paulbiss commented 9 years ago

Looking at the serialize/unserialize code I think what you actually want to do is throw an Exception object, as this is what we catch in order to raise a notice and return false in unserialize_ex.

throw Exception("Unserialization of MongoDB\\Driver\\Manager is not allowed");                                                             
derickr commented 9 years ago

Okay - there are actually two things at play here:

  1. The serialization and deserialization handlers. Your first hint, and my implementation: throw Object(SystemLib::AllocExceptionObject("Serialization of MongoDB\\Driver\\Manager is not allowed")); actually works just fine. I'll be adding that to the cookbook. Your second suggestion (throw Exception("Unserialization of MongoDB\\Driver\\Manager is not allowed");) makes HHVM throw a fatal error instead of a real exception, which I can't catch. I don't think this very useful.
  2. The deserialization of "broken" strings is where the real problem is, so let's focus on that. I have (I will) amend the title of this issue with this.

Code:

<?php
$str = 'O:22:"MongoDB\Driver\Manager":0:{}';
$m = unserialize($str);
var_dump($m);

PHP throws a warning, and returns "NULL" while deserialization of a broken string:

derick@whisky:~/dev/php/10gen-labs-mongo-hhvm-driver-prototype $ php -n -dextension=mongodb.so /tmp/test.php 

Warning: Erroneous data format for unserializing 'MongoDB\Driver\Manager' in /tmp/test.php on line 3
bool(false)

HHVM doesn't throw a warning, and returns an improperly initialised object:

object(MongoDB\Driver\Manager)#1 (0) {
}

Making use of this improperly initialized object makes things crash - because the constructor would have filled in some internal structures. This is what needs fixing in HHVM. It should ideally throw a warning, but certainly not return a pseudo-valid object.

paulbiss commented 9 years ago

So, the way you're doing things for serialize sounds right. What I don't follow is how throwing an Exception in unserialize is resulting in a fatal. We catch the two separately and an Exception should trigger a notice and return false when unserializing:

https://github.com/facebook/hhvm/blob/6cff345e2a713deb133509e4f189af3030c9eb8b/hphp/runtime/base/builtin-functions.cpp#L745-L753

derickr commented 9 years ago

I used GDB in anger this morning, and I found the following while going through https://github.com/facebook/hhvm/blob/6cff345e2a713deb133509e4f189af3030c9eb8b/hphp/runtime/base/type-variant.cpp#L865-L976

The code never bothers calling the wakeup() that I define in my NativeData struct. I do need to define it though, because serialization does call sleep and without wakeup defined, I get the following assertion:

hhvm: /home/derick/dev/facebook-hhvm/hphp/runtime/vm/native-data.cpp:53: void HPHP::Native::registerNativeDataInfo(const HPHP::StringData*, size_t, HPHP::Native::NativeDataInfo::InitFunc, HPHP::Native::NativeDataInfo::CopyFunc, HPHP::Native::NativeDataInfo::DestroyFunc, HPHP::Native::NativeDataInfo::SweepFunc, HPHP::Native::NativeDataInfo::SleepFunc, HPHP::Native::NativeDataInfo::WakeupFunc): Assertion `(sleep == nullptr && wakeup == nullptr) || (sleep != nullptr && wakeup != nullptr)' failed.

However, it does call the HHVM_METHOD function __wakeup (https://github.com/facebook/hhvm/blob/6cff345e2a713deb133509e4f189af3030c9eb8b/hphp/runtime/base/type-variant.cpp#L969-L973).

In the end, I have made it work with this:

class MongoDBDriverManagerData
{
    public:
…
        void wakeup (const Variant& context, ObjectData* obj) {
            std::cout << "NEVER GETS RUN\n";
            throw Object(SystemLib::AllocExceptionObject("Unserialization of MongoDB\\Driver\\Manager is not allowed"));
        }

        Variant sleep() const {
            throw Object(SystemLib::AllocExceptionObject("Serialization of MongoDB\\Driver\\Manager is not allowed"));
        }
<<__NativeData("MongoDBDriverManager")>>
class Manager {
    <<__Native>>
    function __wakeUp() : void;
}
void HHVM_METHOD(MongoDBDriverManager, __wakeup)
{       
    throw Object(SystemLib::AllocExceptionObject("Unserialization of MongoDB\\Driver\\Manager is not allowed"));
}

So there seem the be the following issues:

derickr commented 9 years ago

Can I please have a comment on both points?

paulbiss commented 9 years ago

The native wakeup function is only called if the serialized string contains a native-data key ("\0native"). On the second point, HHVM raises an notice and returns false when it catches an exception in unserialize, see the link below.

https://github.com/facebook/hhvm/blob/bbff70c6ccbff3ff19233c6ff1e4cedcf12b13f3/hphp/runtime/base/builtin-functions.cpp#L755-L759

derickr commented 9 years ago

| The native wakeup function is only called if the serialized string contains a native-data key ("\0native").

What's the point with that? How do I define a wake-up handler in my class then? Do I really have to use

void HHVM_METHOD(MongoDBDriverManager, __wakeup)

Even though there is a native method, which just doesn't happen to be called? I can't just add a "\0native" into my unserialized data to trigger the wakeup function on the ObjectData...

| On the second point, HHVM raises an notice and returns false when it catches an exception in unserialize, see the link below.

Yes, it does.

However, what it does not do is detect that the string: 'O:22:"MongoDB\Driver\Manager":0:{}' is not a valid serializable representation. In PHP, it triggers the code: https://github.com/php/php-src/blob/master/ext/standard/var_unserializer.re#L450-L468