google / openrtb

OpenRTB model for Java and other languages via protobuf; Helper OpenRTB libraries for Java including JSON serialization
Apache License 2.0
399 stars 190 forks source link

JSON parser enters infinite loop #10

Closed vkorenev closed 9 years ago

vkorenev commented 9 years ago

The following JSON causes the bid request parser to enter infinite loop.

{
    "id": "8652a8680db33faabbf3fa76150f35df50a67060",
    "imp": [
        {
            "id": "121-dt1",
            "banner": {
                "h": 250,
                "w": 300,
                "pos": 1
            },
            "bidfloor": 0.05
        },
        {
            "id": "121-dt2",
            "banner": {
                "h": 728,
                "w": 90,
                "pos": 0
            },
            "bidfloor": 0.12
        }
    ],
    "site": {
        "id": "15047",
        "domain": "dailymotion.com",
        "cat": "IAB1",
        "page": "http://www.dailymotion.com/video/xxeauj_www-dramacafe-tv-hd-yyyy-yy-yyyyyyy-2012-yyyy_shortfilms",
        "publisher": {
            "id": "8796",
            "name": "dailymotion",
            "cat": "IAB3-1",
            "domain": "dailymotion.com"
        }
    },
    "user": {
        "id": "518c3da3717203f34019b038"
    },
    "device": {
        "ua": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.4; (R1 1.6); SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)",
        "ip": "123.145.167.189"
    },
    "at": 1,
    "cur": [
        "USD"
    ]
}

It does not fully conform to OpenRTB 2.2/2.3 specification: site/cat and site/publisher/cat attributes must be arrays of strings, not strings. It was taken from here: https://github.com/openrtb/examples/blob/master/brandscreen/example-request-pc-multi.json

opinali commented 9 years ago

I guess you're using this non-conforming JSON because some exchange uses that? There are some OpenRTB fields that were poorly specified and later changed/clarified, but I don't remember that Site.cat or Publisher.cat were among those.

Anyway, I never worried to support uncompliant JSON and I'd expect a case like this to result in some error; but an infinite loop is not an acceptable failure mode. Let me try improve this.

opinali commented 9 years ago

I see you've got this sample JSON from a SpotXchange sample, I wonder if it's an obsolete or incorrect sample? Even if not, worst-case it's easy to modify the JSON serializer to handle the cat fields as scalar (I guess with CSV internally to allow multiple elements).

Doing that without a fork is not very clean, e.g. you can extend OpenRtbJsonReader and override readSite(), but then you have to copy that entire method to only change the parsing code for one field. Same for the writer. I could refactor these methods in two pieces, allowing to override only a readSiteFields() method where you'd only handle one field and super-call for others. Will consider this as a future improvement.

Alternatively, I could make the serializer "smarter", allowing a single value without [] in lieu of an array, but this is only trivial to implement for arrays of scalar values. If the array may contain objects, it's more complicated. On top of that I don't think it's a good idea to support what is effectively an uncompliant extension of OpenRTB... unless this kind of extension turns out to be common so people need to customize the serializer for multiple exchanges.

opinali commented 9 years ago

Fixed in commit 380f29b7fec31391d725fa2aaa463372e62e5eb8