adriank / ObjectPath

The agile query language for semi-structured data
http://objectpath.org
MIT License

query on temperature data failed #45

Closed titaniumrain closed 7 years ago

titaniumrain commented 8 years ago

Hi,

I have the sample JSON file shown below:

{"coord":{"lon":139,"lat":35}, "sys":{"country":"JP","sunrise":1369769524,"sunset":1369821049}, "weather":[{"id":804,"main":"clouds","description":"overcast clouds","icon":"04n"}], "main":{"temp":289.5,"humidity":89,"pressure":1013,"temp_min":287.04,"temp_max":292.04}, "wind":{"speed":7.31,"deg":187.002}, "rain":{"3h":0}, "clouds":{"all":92}, "dt":1369824698, "id":1851632, "name":"Shuzenji", "cod":200}

The query $..[@..temp > 10].name works,

but the query $..[@.icon is "04n"].name doesn't.

Any insights?

adriank commented 8 years ago

Does $..*[@..icon is "04n"].name work for you?

titaniumrain commented 8 years ago

Nope it does not :/

titaniumrain commented 8 years ago

Hi Adrian,

I dug into the problem a bit further and kinda concluded (could be totally wrong though) that the root cause of the failed query is that '..' (combined with @) does not handle list types properly. An example is shown below.

This query succeeded

>>> $..[@.icon is '04n']
[{ "main": "clouds", "id": 804, "icon": "04n", "description": "overcast clouds" }]

This query failed

>>> $..[@..icon is '04n']
[]

Does that make sense?

adriank commented 8 years ago

Ok, I've run your original query and it seems to work on my machine:

$..[@..temp > 10].name
["Shuzenji"]

$..[@.icon is "04n"]
[{ "main": "clouds", "id": 804, "icon": "04n", "description": "overcast clouds" }]

The problem is that there is no name here: the matched weather object has no name field. :)
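To make the "no name here" point concrete, here is a plain-Python analogy of the two filters. It is only a sketch and not ObjectPath's implementation: the helpers `descendants` and `has_icon_below` are made up for illustration, and the sketch assumes that `@..icon` also considers the current node itself.

```python
import json

# The sample document from the first post.
DOC = json.loads("""
{"coord":{"lon":139,"lat":35}, "sys":{"country":"JP","sunrise":1369769524,"sunset":1369821049}, "weather":[{"id":804,"main":"clouds","description":"overcast clouds","icon":"04n"}], "main":{"temp":289.5,"humidity":89,"pressure":1013,"temp_min":287.04,"temp_max":292.04}, "wind":{"speed":7.31,"deg":187.002}, "rain":{"3h":0}, "clouds":{"all":92}, "dt":1369824698, "id":1851632, "name":"Shuzenji", "cod":200}
""")


def descendants(node):
    """Yield the node itself and everything nested below it, depth first."""
    yield node
    if isinstance(node, dict):
        for value in node.values():
            yield from descendants(value)
    elif isinstance(node, list):
        for value in node:
            yield from descendants(value)


def has_icon_below(node, icon):
    """True if the node or anything nested below it is a dict with that icon."""
    return any(isinstance(d, dict) and d.get("icon") == icon for d in descendants(node))


objects = [n for n in descendants(DOC) if isinstance(n, dict)]

# Analogy of [@.icon is "04n"]: the icon must be a direct field of the object.
direct = [n for n in objects if n.get("icon") == "04n"]
# Analogy of [@..icon is "04n"]: the icon may sit anywhere below the object.
nested = [n for n in objects if has_icon_below(n, "04n")]

names_direct = [n["name"] for n in direct if "name" in n]
names_nested = [n["name"] for n in nested if "name" in n]

print(names_direct)  # [] because the weather object has no "name" field
print(names_nested)  # ['Shuzenji'] because the root object matches and has one
```

In this analogy only the root object both contains the icon somewhere below it and carries a name, which is why the recursive form is the one that can return "Shuzenji".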


adriank commented 8 years ago

This is probably a bug. Run objectpath -d file.json and see what happens when you query it.


titaniumrain commented 8 years ago

Hmmmmmm, I used the -d flag and the output seems to be fine? I've listed the failing query first, then the successful one below.

$..[@..icon is '04n']
START@43 Tree.execute
PARSE STAGE
('[', ('..', ('(root)', 'rs'), ('',)), ('is', ('..', ('(current)',), ('name', 'icon')), '04n'))
START@56 executing node '('[', ('..', ('(root)', 'rs'), ('',)), ('is', ('..', ('(current)',), ('name', 'icon')), '04n'))'
START@56 executing node '('..', ('(root)', 'rs'), ('',))'
START@56 executing node '('(root)', 'rs')'
DEBUG@290 .. returning '<generator object flatten at 0x103241050>'
DEBUG@319 found '('is', ('..', ('(current)',), ('name', 'icon')), '04n')' selector. executing on <generator object flatten at 0x103241050>
DEBUG@337 found is operator in selector
local variable 'nodeList' referenced before assignment

$..[@.icon is '04n']
START@43 Tree.execute
PARSE STAGE
('[', ('..', ('(root)', 'rs'), ('',)), ('is', ('.', ('(current)',), ('name', 'icon')), '04n'))
START@56 executing node '('[', ('..', ('(root)', 'rs'), ('',)), ('is', ('.', ('(current)',), ('name', 'icon')), '04n'))'
START@56 executing node '('..', ('(root)', 'rs'), ('',))'
START@56 executing node '('(root)', 'rs')'
DEBUG@290 .. returning '<generator object flatten at 0x1031ddfa0>'
DEBUG@319 found '('is', ('.', ('(current)',), ('name', 'icon')), '04n')' selector. executing on <generator object flatten at 0x1031ddfa0>
DEBUG@337 found is operator in selector
local variable 'nodeList' referenced before assignment
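Both traces end with "local variable 'nodeList' referenced before assignment", which is the message older Python versions (before 3.11) attach to an UnboundLocalError. The sketch below shows how that error typically arises: a local variable is assigned only on some code paths and then read on a path that skipped the assignment. The function `run_selector` is hypothetical and is not taken from ObjectPath's source.

```python
def run_selector(operator, nodes):
    """Hypothetical selector, NOT ObjectPath's code: only '>' binds node_list."""
    if operator == ">":
        node_list = [n for n in nodes if n > 10]
    # There is no branch for "is", so node_list is never bound on that path.
    return node_list


print(run_selector(">", [5, 289.5]))  # the '>' path works

try:
    run_selector("is", [5, 289.5])
except UnboundLocalError as exc:
    # The exact wording of the message depends on the Python version.
    print("UnboundLocalError:", exc)
```

If the real cause is similar, the fix would be to make sure every operator branch assigns the variable before it is used, but that is an assumption until the actual source lines named in the trace are inspected.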

adriank commented 8 years ago

You are using OP v0.5. Switch to the newest version from GitHub for your convenience.

I've checked your example with the GitHub version and there is indeed a bug. It is probably a quick fix, but I don't have spare time at the moment to sit down with it. If you wish to dig into the code and fix it, I'm keen to show you around.


titaniumrain commented 8 years ago

Sure. I am happy to fix the bug if you show me the way. :)

adriank commented 8 years ago

The best way to start testing is to add a test in https://github.com/adriank/ObjectPath/blob/master/tests/test_ObjectPath.py. You can add your JSON there.

Then save the JSON as a file on your HDD and run:

python shell.py -d -e "$.*[@..icon is '04n']" temp.json

(don't test with two .. in the same query because it would be harmful for your eyes :) )

You'll get detailed debug strings from the whole query execution, including line numbers. From a brief look at the debug output, I presume the problem is that itertools.chain is not properly checked for with the is operator, because this works:

python shell.py -d -e "$.*[@..speed > 7.30]" temp.json
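As background for the itertools.chain hypothesis above: a chain object is an iterator, but it is neither a generator nor a list, so a type check written with only those types in mind will not recognize it. The snippet below demonstrates that in plain Python. Whether ObjectPath performs such a check is an assumption here, and `is_lazy_sequence` is a made-up helper name.

```python
import itertools
import types
from collections.abc import Iterator

chained = itertools.chain([{"icon": "04n"}], [{"icon": "10d"}])

print(type(chained).__name__)                    # chain
print(isinstance(chained, types.GeneratorType))  # False
print(isinstance(chained, list))                 # False


def is_lazy_sequence(obj):
    """Hypothetical check that treats generators and chain objects alike."""
    return isinstance(obj, Iterator)


print(is_lazy_sequence(chained))             # True
print(is_lazy_sequence(n for n in [1, 2]))   # True
print(is_lazy_sequence([1, 2]))              # False
```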

I would compare the outputs of both queries and look for the differences.
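One possible way to do that comparison is with Python's difflib, after masking the memory addresses that change on every run. This is only a sketch: `normalize` and `diff_traces` are illustrative names, and the two sample strings are short excerpts from the traces posted earlier in this thread rather than complete debug output.

```python
import difflib
import re


def normalize(trace):
    """Split a trace into lines and mask addresses such as 0x103241050."""
    return [re.sub(r"0x[0-9a-fA-F]+", "0xADDR", line) for line in trace.splitlines()]


def diff_traces(first, second):
    """Return a unified diff of two debug traces, ignoring memory addresses."""
    return list(difflib.unified_diff(
        normalize(first), normalize(second),
        fromfile="first", tofile="second", lineterm=""))


# Excerpts from the two traces posted earlier in the thread.
trace_a = (
    "('is', ('..', ('(current)',), ('name', 'icon')), '04n')\n"
    "DEBUG@290 .. returning '<generator object flatten at 0x103241050>'"
)
trace_b = (
    "('is', ('.', ('(current)',), ('name', 'icon')), '04n')\n"
    "DEBUG@290 .. returning '<generator object flatten at 0x1031ddfa0>'"
)

for line in diff_traces(trace_a, trace_b):
    print(line)
```

With the addresses masked, only the lines that really differ between the two runs show up as changes, which narrows down where the two queries start to diverge.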