adriank / ObjectPath

The agile query language for semi-structured data
http://objectpath.org
MIT License

Problem with unicode string in comparison #30

Closed ddidier closed 9 years ago

ddidier commented 9 years ago

I have a use case where I must filter data on values that contain a leading slash (e.g. "/tmp"). The data is loaded from a JSON file, so the strings end up as unicode, and I cannot find a way to filter on them. Here is an example:

    from objectpath import Tree

    data = {
        "store": {
            "book": [
                {
                    "category": "/fiction",
                    "title": "str /fiction"
                },
                {
                    "category": u"/fiction",
                    "title": "unicode /fiction"
                },
                {
                    "category": "fiction/",
                    "title": "str fiction/"
                },
                {
                    "category": u"fiction/",
                    "title": "unicode fiction/"
                },
            ]
        }
    }
    tree = Tree(data)
    tree.execute("$.store.book[@.category is '/fiction']")    # ==> 1 match ("str /fiction")
    tree.execute("$.store.book[@.category is 'fiction/']")    # ==> 2 matches
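
For comparison, plain Python equality treats the `str` and `unicode` values as equal, so both "/fiction" records would be expected to match. A minimal sketch that does not use ObjectPath at all; the helper name `titles_with_category` is made up for illustration:

```python
# Same four records as in the example above, flattened into a list.
books = [
    {"category": "/fiction", "title": "str /fiction"},
    {"category": u"/fiction", "title": "unicode /fiction"},
    {"category": "fiction/", "title": "str fiction/"},
    {"category": u"fiction/", "title": "unicode fiction/"},
]


def titles_with_category(records, category):
    """Return the titles of all records whose category equals `category`."""
    return [record["title"] for record in records if record["category"] == category]


# Both variables end up holding two titles each, for the leading and the trailing slash.
leading = titles_with_category(books, "/fiction")
trailing = titles_with_category(books, "fiction/")
```

With ordinary `==`, the position of the slash makes no difference, which suggests the mismatch comes from how the query is evaluated rather than from the data.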

Thanks

adriank commented 9 years ago

I can't check it now, but try escaping it: '//fiction'.

ddidier commented 9 years ago

Thanks, but

    tree.execute("$.store.book[@.category is '//fiction']")

returns nothing.

ddidier commented 9 years ago

As a workaround, I convert the JSON using the following snippet of code found on the Internet, but that's not very satisfactory...

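    def byteify(input):
        # Python 2 only: recursively encode unicode keys and values as UTF-8 byte strings.
        if isinstance(input, dict):
            return {byteify(key): byteify(value) for key, value in input.iteritems()}
        elif isinstance(input, list):
            return [byteify(element) for element in input]
        elif isinstance(input, unicode):
            return input.encode('utf-8')
        else:
            # Numbers, booleans, None and byte strings pass through unchanged.
            return input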
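
A conversion like this could be applied right after parsing the JSON. The sketch below is a variant that also runs on Python 3, where JSON strings are already `str` and are left untouched. The names `normalize_strings` and `raw` are illustrative and do not come from ObjectPath:

```python
import json
import sys

PY2 = sys.version_info[0] == 2


def normalize_strings(value):
    """Recursively turn Python 2 unicode objects into UTF-8 encoded str.

    On Python 3 every value is returned unchanged apart from the
    containers being rebuilt.
    """
    if isinstance(value, dict):
        return {normalize_strings(k): normalize_strings(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize_strings(item) for item in value]
    if PY2 and isinstance(value, unicode):  # noqa: F821, name exists on Python 2 only
        return value.encode("utf-8")
    return value


# Hypothetical input standing in for the contents of the JSON file.
raw = '{"store": {"book": [{"category": "/fiction", "title": "from JSON"}]}}'
data = normalize_strings(json.loads(raw))
categories = [book["category"] for book in data["store"]["book"]]
```

The resulting `data` could then be passed to `Tree(data)` as in the original example.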

adriank commented 9 years ago

This is a very strange issue. I couldn't fix it in a reasonable time. I'm focusing on the JavaScript version right now. Python has its issues and ObjectPath has many workarounds (way too many!) to make it work. If your solution works, it may be the best way to go, and I would probably end up with similar code.