doctrine / couchdb-odm

A Document Mapper based on CouchDB
http://www.doctrine-project.org
MIT License

Inconsistent serialization of embed-many #100

Open SteveTalbot opened 10 years ago

SteveTalbot commented 10 years ago

Sometimes an embed-many mapping is persisted to CouchDB as a JSON array; sometimes it is persisted as a JSON object.

When you first add an array of objects to a document, the document appears in CouchDB as the Doctrine documentation suggests it should. However, if you later remove one of these objects and update the document, the relationship is persisted differently.

Using the example at http://docs.doctrine-project.org/projects/doctrine-couchdb/en/latest/reference/association-mapping.html#embedmany, this is how the document appears in the database when the objects are first added:

{
    "_id": "1234",
    "phonenumbers":
    [
        {"number": "+1234567890"},
        {"number": "+1234567891"},
        {"number": "+1234567892"}
    ]
}

And this is how it looks after you remove one:

{
    "_id": "1234",
    "phonenumbers":
    {
        "0": {"number": "+1234567890"},
        "2": {"number": "+1234567892"}
    }
}

This happens because the embedded document serializer maintains the key-value mapping, and the behaviour of PHP's json_encode is inconsistent for arrays with integer keys. If the array has consecutive integer keys starting at zero, json_encode produces a JSON array. So for example:

$a = array("x", "y", "z");
var_dump(json_encode($a));

produces:

string(13) "["x","y","z"]"

If the keys do not start at zero, or are not consecutive, json_encode produces a JSON object. So for example:

$a = array("x", "y", "z");
unset($a[1]);
var_dump(json_encode($a));

produces:

string(17) "{"0":"x","2":"z"}"

The inconsistency can cause problems with map-reduce functions that need to traverse the array.

To fix this inconsistency, I believe the embedded document serializer should only maintain the key-value mappings when the array has non-integer (string) keys. If the array only contains integer keys, the keys should be renumbered to ensure json_encode always produces a JSON array.
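The renumbering proposed above could look roughly like the following. This is only a minimal sketch: `normalizeEmbeddedKeys()` is a hypothetical helper, not part of couchdb-odm, and it assumes the embedded data is available as a plain PHP array before it reaches json_encode.

```php
<?php

/**
 * Hypothetical helper (not part of couchdb-odm): renumber arrays that
 * have only integer keys so json_encode() emits a JSON array, and leave
 * arrays containing any string key untouched so they stay JSON objects.
 */
function normalizeEmbeddedKeys(array $data): array
{
    foreach (array_keys($data) as $key) {
        if (!is_int($key)) {
            // At least one string key: keep the key-value mapping.
            return $data;
        }
    }

    // Only integer keys: renumber them to 0..n-1.
    return array_values($data);
}

$phonenumbers = array(
    array("number" => "+1234567890"),
    array("number" => "+1234567891"),
    array("number" => "+1234567892"),
);
unset($phonenumbers[1]);

// Prints a JSON array even though key 1 was removed.
echo json_encode(normalizeEmbeddedKeys($phonenumbers)), "\n";
```

With this in place, the document from the example above would keep "phonenumbers" as a JSON array after one entry is removed, while an embed-many keyed by strings would still be stored as a JSON object.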

Edit: The same problem occurs when using a "mixed" mapping for an array of strings, but we're already working around this by defining two custom types. Our "array" mapping forces a PHP array to be represented in CouchDB as a JSON array, and our "hash" mapping forces a PHP array to be represented as a JSON object.
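The custom types themselves are not shown in this thread, so the following is only a guess at the kind of conversion each one might perform. `toJsonArray()` and `toJsonObject()` are hypothetical names, and how such functions would be wired into couchdb-odm's type system is left out.

```php
<?php

/**
 * Hypothetical "array" conversion: discard the keys so json_encode()
 * always produces a JSON array.
 */
function toJsonArray(array $value): array
{
    return array_values($value);
}

/**
 * Hypothetical "hash" conversion: cast to an object so json_encode()
 * always produces a JSON object, even for sequential or empty arrays.
 */
function toJsonObject(array $value): \stdClass
{
    return (object) $value;
}

$a = array("x", "y", "z");
unset($a[1]);

echo json_encode(toJsonArray($a)), "\n";        // JSON array
echo json_encode(toJsonObject($a)), "\n";       // JSON object
echo json_encode(toJsonObject(array())), "\n";  // empty JSON object, not []
```

The empty case is the one most easily missed: without the cast, an empty PHP array always serializes as `[]`, so a "hash" field would change shape as soon as it had no entries.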

guillaumek commented 9 years ago

Yep, just ran into that problem; really annoying. A custom type sounds like a nice workaround. Cheers